git-commit-vandalism

Author	SHA1	Message	Date
brian m. carlson	aab9583f7b	Convert find_unique_abbrev* to struct object_id Convert find_unique_abbrev and find_unique_abbrev_r to each take a pointer to struct object_id. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:48 -07:00
Junio C Hamano	169c9c0169	Merge branch 'bw/c-plus-plus' Avoid using identifiers that clash with C++ keywords. Even though it is not a goal to compile Git with C++ compilers, changes like this help use of code analysis tools that targets C++ on our codebase. * bw/c-plus-plus: (37 commits) replace: rename 'new' variables trailer: rename 'template' variables tempfile: rename 'template' variables wrapper: rename 'template' variables environment: rename 'namespace' variables diff: rename 'template' variables environment: rename 'template' variables init-db: rename 'template' variables unpack-trees: rename 'new' variables trailer: rename 'new' variables submodule: rename 'new' variables split-index: rename 'new' variables remote: rename 'new' variables ref-filter: rename 'new' variables read-cache: rename 'new' variables line-log: rename 'new' variables imap-send: rename 'new' variables http: rename 'new' variables entry: rename 'new' variables diffcore-delta: rename 'new' variables ...	2018-03-06 14:54:07 -08:00
Nguyễn Thái Ngọc Duy	ddf88fa616	diff: add --compact-summary Certain information is currently shown with --summary, but when used in combination with --stat it's a bit hard to read since info of the same file is in two places (--stat and --summary). On top of that, commits that add or remove files double the number of display lines, which could be a lot if you add or remove a lot of files. --compact-summary embeds most of --summary back in --stat in the little space between the file name part and the graph line, e.g. with commit `0433d533f1`: Documentation/merge-config.txt \| 4 + builtin/merge.c \| 2 + ...-pull-verify-signatures.sh (new +x) \| 81 ++++++++++++++ t/t7612-merge-verify-signatures.sh \| 45 ++++++++ 4 files changed, 132 insertions(+) It helps both condensing information and saving some text space. What's new in diffstat is: - A new 0644 file is shown as (new) - A new 0755 file is shown as (new +x) - A new symlink is shown as (new +l) - A deleted file is shown as (gone) - A mode change adding executable bit is shown as (mode +x) - A mode change removing it is shown as (mode -x) Note that --compact-summary does not contain all the information --summary provides. Rewrite percentage is not shown but it could be added later, like R50% or C20%. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-27 15:22:47 -08:00
Junio C Hamano	12accdc023	Merge branch 'nd/ita-wt-renames-in-status' into maint "git status" after moving a path in the working tree (hence making it appear "removed") and then adding with the -N option (hence making that appear "added") detected it as a rename, but did not report the old and new pathnames correctly. * nd/ita-wt-renames-in-status: wt-status.c: handle worktree renames wt-status.c: rename rename-related fields in wt_status_change_data wt-status.c: catch unhandled diff status codes wt-status.c: coding style fix Use DIFF_DETECT_RENAME for detect_rename assignments t2203: test status output with porcelain v2 format	2018-02-27 10:39:35 -08:00
Brandon Williams	c2a46a7c1f	diff: rename 'template' variables Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-22 10:08:05 -08:00
Brandon Williams	63a01c3f79	diff: rename 'new' variables Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-22 10:08:05 -08:00
Brandon Williams	585c0e2efa	diff: rename 'this' variables Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-22 10:08:05 -08:00
Junio C Hamano	17c8e0b33d	Merge branch 'nd/diff-flush-before-warning' Avoid showing a warning message in the middle of a line of "git diff" output. * nd/diff-flush-before-warning: diff.c: flush stdout before printing rename warnings	2018-02-13 13:39:09 -08:00
Junio C Hamano	9bc89b17e3	Merge branch 'tb/crlf-conv-flags' Code clean-up. * tb/crlf-conv-flags: convert_to_git(): safe_crlf/checksafe becomes int conv_flags	2018-02-13 13:39:08 -08:00
Nguyễn Thái Ngọc Duy	c905cbc49c	diff.c: refactor pprint_rename() to use strbuf Instead of passing char* around, let function handle strbuf directly. All callers already use strbuf internally. This helps kill the "not free" exception in free_diffstat_info(). I don't think this code is so critical that we need to avoid some free() calls. The other benefit comes in the next patch, where we append something in pname before returning from fill_print_name(). With strbuf, it's very simple. With "char *" we may have to resort to explicit reallocation and stuff. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-02 12:05:27 -08:00
Junio C Hamano	c0d75f0e2e	Merge branch 'sb/diff-blobfind-pickaxe' "diff" family of commands learned "--find-object=<object-id>" option to limit the findings to changes that involve the named object. * sb/diff-blobfind-pickaxe: diff: use HAS_MULTI_BITS instead of counting bits manually diff: properly error out when combining multiple pickaxe options diffcore: add a pickaxe option to find a specific blob diff: introduce DIFF_PICKAXE_KINDS_MASK diff: migrate diff_flags.pickaxe_ignore_case to a pickaxe_opts bit diff.h: make pickaxe_opts an unsigned bit field	2018-01-23 13:16:37 -08:00
Junio C Hamano	bc3dca07f4	Merge branch 'nd/ita-wt-renames-in-status' "git status" after moving a path in the working tree (hence making it appear "removed") and then adding with the -N option (hence making that appear "added") detected it as a rename, but did not report the old and new pathnames correctly. * nd/ita-wt-renames-in-status: wt-status.c: handle worktree renames wt-status.c: rename rename-related fields in wt_status_change_data wt-status.c: catch unhandled diff status codes wt-status.c: coding style fix Use DIFF_DETECT_RENAME for detect_rename assignments t2203: test status output with porcelain v2 format	2018-01-23 13:16:28 -08:00
Nguyễn Thái Ngọc Duy	4e056c989f	diff.c: flush stdout before printing rename warnings The diff output is buffered in a FILE object and could still be partially buffered when we print these warnings (directly to fd 2). The output is messed up like this worktree.c \| 138 +- worktree.h warning: inexact rename detection was skipped due to too many files. \| 12 +- wrapper.c \| 83 +- It gets worse if the warning is printed after color codes for the graph part are already printed. You'll get a warning in green or red. Flush stdout first, so we can get something like this instead: xdiff/xutils.c \| 42 +- xdiff/xutils.h \| 4 +- 1033 files changed, 150824 insertions(+), 69395 deletions(-) warning: inexact rename detection was skipped due to too many files. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-16 14:34:20 -08:00
Torsten Bögershausen	8462ff43e4	convert_to_git(): safe_crlf/checksafe becomes int conv_flags When calling convert_to_git(), the checksafe parameter defined what should happen if the EOL conversion (CRLF --> LF --> CRLF) does not roundtrip cleanly. In addition, it also defined if line endings should be renormalized (CRLF --> LF) or kept as they are. checksafe was a safe_crlf enum with these values: SAFE_CRLF_FALSE: do nothing in case of EOL roundtrip errors SAFE_CRLF_FAIL: die in case of EOL roundtrip errors SAFE_CRLF_WARN: print a warning in case of EOL roundtrip errors SAFE_CRLF_RENORMALIZE: change CRLF to LF SAFE_CRLF_KEEP_CRLF: keep all line endings as they are In some cases the integer value 0 was passed as checksafe parameter instead of the correct enum value SAFE_CRLF_FALSE. That was no problem because SAFE_CRLF_FALSE is defined as 0. FALSE/FAIL/WARN are different from RENORMALIZE and KEEP_CRLF. Therefore, an enum is not ideal. Let's use an integer bit pattern instead and rename the parameter to conv_flags to make it more generically usable. This allows us to extend the bit pattern in a subsequent commit. Reported-By: Randall S. Becker <rsbecker@nexbridge.com> Helped-By: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-16 12:35:56 -08:00
Stefan Beller	4d8c51aa19	diff: use HAS_MULTI_BITS instead of counting bits manually This aligns the style to the previous patch. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-04 15:02:40 -08:00
Stefan Beller	5e505257f2	diff: properly error out when combining multiple pickaxe options In `f506b8e8b5` (git log/diff: add -G<regexp> that greps in the patch text, 2010-08-23) we were hesitant to check if the user requests both -S and -G at the same time. Now that the pickaxe family also offers --find-object, which looks slightly more different than the former two, let's add a check that those are not used at the same time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-04 15:02:40 -08:00
Stefan Beller	15af58c1ad	diffcore: add a pickaxe option to find a specific blob Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) One might be tempted to extend git-describe to also work with blobs, such that `git describe <blob-id>` gives a description as '<commit-ish>:<path>'. This was implemented at [2]; as seen by the sheer number of responses (>110), it turns out this is tricky to get right. The hard part to get right is picking the correct 'commit-ish' as that could be the commit that (re-)introduced the blob or the commit that removed the blob; the blob could exist in different branches. Junio hinted at a different approach of solving this problem, which this patch implements. Teach the diff machinery another flag for restricting the information to what is shown. For example: $ ./git log --oneline --find-object=v2.0.0:Makefile `b2feb64309` Revert the whole "ask curl-config" topic for now `47fbfded53` i18n: only extract comments marked with "TRANSLATORS:" we observe that the Makefile as shipped with 2.0 appeared in v1.9.2-471-g47fbfded53 and in v2.0.0-rc1-5-gb2feb6430b. The reason why these commits both occur prior to v2.0.0 are evil merges that are not found using this new mechanism. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob [2] https://public-inbox.org/git/20171028004419.10139-1-sbeller@google.com/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-04 15:02:40 -08:00
Stefan Beller	cf63051ada	diff: introduce DIFF_PICKAXE_KINDS_MASK Currently the check whether to perform pickaxing is done via checking `diffopt->pickaxe`, which contains the command line argument that we want to pickaxe for. Soon we'll introduce a new type of pickaxing, that will not store anything in the `.pickaxe` field, so let's migrate the check to be dependent on pickaxe_opts. It is not enough to just replace the check for pickaxe by pickaxe_opts, because flags might be set, but pickaxing was not requested ('-i'). To cope with that, introduce a mask to check only for the bits indicating the modes of operation. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-04 15:02:40 -08:00
Junio C Hamano	f427b94985	Merge branch 'cc/skip-to-optional-val' Introduce a helper to simplify code to parse a common pattern that expects either "--key" or "--key=<something>". * cc/skip-to-optional-val: t4045: reindent to make helpers readable diff: add tests for --relative without optional prefix value diff: use skip_to_optional_arg_default() in parsing --relative diff: use skip_to_optional_arg_default() diff: use skip_to_optional_arg() index-pack: use skip_to_optional_arg() git-compat-util: introduce skip_to_optional_arg()	2017-12-28 14:08:46 -08:00
Nguyễn Thái Ngọc Duy	06dba2b023	Use DIFF_DETECT_RENAME for detect_rename assignments This field can have two values (2 for copy). Use this name instead for clarity. Many places have already used this constant. Note, the detect_rename assignments in merge-recursive.c remain unchanged because it's actually a boolean there. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-27 12:38:35 -08:00
Junio C Hamano	8d7fefaac4	Merge branch 'ar/unconfuse-three-dots' Ancient part of codebase still shows dots after an abbreviated object name just to show that it is not a full object name, but these ellipses are confusing to people who newly discovered Git who are used to seeing abbreviated object names and find them confusing with the range syntax. * ar/unconfuse-three-dots: t2020: test variations that matter t4013: test new output from diff --abbrev --raw diff: diff_aligned_abbrev: remove ellipsis after abbreviated SHA-1 value t4013: prepare for upcoming "diff --raw --abbrev" output format change checkout: describe_detached_head: remove ellipsis after committish print_sha1_ellipsis: introduce helper Documentation: user-manual: limit usage of ellipsis Documentation: revisions: fix typo: "three dot" ---> "three-dot" (in line with "two-dot").	2017-12-19 11:33:58 -08:00
Junio C Hamano	d7c6c2369a	Merge branch 'jt/diff-anchored-patience' "git diff" learned a variant of the "--patience" algorithm, to which the user can specify which 'unique' line to be used as anchoring points. * jt/diff-anchored-patience: diff: support anchoring line(s)	2017-12-19 11:33:56 -08:00
Junio C Hamano	646685460c	Merge branch 'en/rename-progress' Historically, the diff machinery for rename detection had a hardcoded limit of 32k paths; this is being lifted to allow users to trade cycles for a (possibly) easier-to-read result. * en/rename-progress: diffcore-rename: make diff-tree -l0 mean -l<large> sequencer: show rename progress during cherry picks diff: remove silent clamp of renameLimit progress: fix progress meters when dealing with lots of work sequencer: warn when internal merge may be suboptimal due to renameLimit	2017-12-19 11:33:55 -08:00
Junio C Hamano	1efad51197	diff: use skip_to_optional_arg_default() in parsing --relative Helped-by: Jacob Keller <jacob.keller@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-11 16:10:12 -08:00
Christian Couder	cf81f94da4	diff: use skip_to_optional_arg_default() Let's simplify diff option parsing using skip_to_optional_arg_default(). Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-11 16:10:12 -08:00
Christian Couder	948cbe6703	diff: use skip_to_optional_arg() Let's simplify diff option parsing using skip_to_optional_arg(). Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-11 16:10:12 -08:00
Ann T Ropea	7cb6ac1e4b	diff: diff_aligned_abbrev: remove ellipsis after abbreviated SHA-1 value Neither Git nor the user are in need of this (visual) aid anymore, but we must offer a transition period. A follow-up patch (series) will rectify the situation by covering the new output format as well as the backward compatible one. Also, fix a typo: "abbbreviated" ---> "abbreviated". Signed-off-by: Ann T Ropea <bedhanger@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-06 07:32:59 -08:00
Jonathan Tan	2477ab2ea8	diff: support anchoring line(s) Teach diff a new algorithm, one that attempts to prevent user-specified lines from appearing as a deletion or addition in the end result. The end user can use this by specifying "--anchored=<text>" one or more times when using Git commands like "diff" and "show". Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-28 10:40:04 +09:00
Junio C Hamano	10f65c239a	Merge branch 'jc/ignore-cr-at-eol' The "diff" family of commands learned to ignore differences in carriage return at the end of line. * jc/ignore-cr-at-eol: diff: --ignore-cr-at-eol xdiff: reassign xpparm_t.flags bits	2017-11-27 11:06:31 +09:00
Elijah Newren	9f7e4bfa3b	diff: remove silent clamp of renameLimit In commit `0024a5492` (Fix the rename detection limit checking; 2007-09-14), the renameLimit was clamped to 32767. This appears to have been to simply avoid integer overflow in the following computation: num_create * num_src <= rename_limit * rename_limit although it also could be viewed as a hardcoded bound on the amount of CPU time we're willing to allow users to tell git to spend on handling renames. An upper bound may make sense, but unfortunately this upper bound was neither communicated to the users, nor documented anywhere. Although large limits can make things slow, we have users who would be ecstatic to have a small five file change be correctly cherry picked even if they have to manually specify a large limit and wait ten minutes for the renames to be detected. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-15 13:11:25 +09:00
Junio C Hamano	8cc633286a	Merge branch 'bw/diff-opt-impl-to-bitfields' A single-word "unsigned flags" in the diff options is being split into a structure with many bitfields. * bw/diff-opt-impl-to-bitfields: diff: make struct diff_flags members lowercase diff: remove DIFF_OPT_CLR macro diff: remove DIFF_OPT_SET macro diff: remove DIFF_OPT_TST macro diff: remove touched flags diff: add flag to indicate textconv was set via cmdline diff: convert flags to be stored in bitfields add, reset: use DIFF_OPT_SET macro to set a diff flag	2017-11-09 14:31:27 +09:00
Junio C Hamano	4e9762ed47	Merge branch 'ao/diff-populate-filespec-lstat-errorpath-fix' After an error from lstat(), diff_populate_filespec() function sometimes still went ahead and used invalid data in struct stat, which has been fixed. * ao/diff-populate-filespec-lstat-errorpath-fix: diff: fix lstat() error handling in diff_populate_filespec()	2017-11-09 14:31:26 +09:00
Junio C Hamano	e9282f02b2	diff: --ignore-cr-at-eol A new option --ignore-cr-at-eol tells the diff machinery to treat a carriage-return at the end of a (complete) line as if it does not exist. Just like other "--ignore-*" options to ignore various kinds of whitespace differences, this will help reviewing the real changes you made without getting distracted by spurious CRLF<->LF conversion made by your editor program. Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> [jch: squashed in command line completion by Dscho] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-08 10:05:27 +09:00
Brandon Williams	0d1e0e7801	diff: make struct diff_flags members lowercase Now that the flags stored in struct diff_flags are being accessed directly and not through macros, change all struct members from being uppercase to lowercase. This conversion is done using the following semantic patch: @@ expression E; @@ - E.RECURSIVE + E.recursive @@ expression E; @@ - E.TREE_IN_RECURSIVE + E.tree_in_recursive @@ expression E; @@ - E.BINARY + E.binary @@ expression E; @@ - E.TEXT + E.text @@ expression E; @@ - E.FULL_INDEX + E.full_index @@ expression E; @@ - E.SILENT_ON_REMOVE + E.silent_on_remove @@ expression E; @@ - E.FIND_COPIES_HARDER + E.find_copies_harder @@ expression E; @@ - E.FOLLOW_RENAMES + E.follow_renames @@ expression E; @@ - E.RENAME_EMPTY + E.rename_empty @@ expression E; @@ - E.HAS_CHANGES + E.has_changes @@ expression E; @@ - E.QUICK + E.quick @@ expression E; @@ - E.NO_INDEX + E.no_index @@ expression E; @@ - E.ALLOW_EXTERNAL + E.allow_external @@ expression E; @@ - E.EXIT_WITH_STATUS + E.exit_with_status @@ expression E; @@ - E.REVERSE_DIFF + E.reverse_diff @@ expression E; @@ - E.CHECK_FAILED + E.check_failed @@ expression E; @@ - E.RELATIVE_NAME + E.relative_name @@ expression E; @@ - E.IGNORE_SUBMODULES + E.ignore_submodules @@ expression E; @@ - E.DIRSTAT_CUMULATIVE + E.dirstat_cumulative @@ expression E; @@ - E.DIRSTAT_BY_FILE + E.dirstat_by_file @@ expression E; @@ - E.ALLOW_TEXTCONV + E.allow_textconv @@ expression E; @@ - E.TEXTCONV_SET_VIA_CMDLINE + E.textconv_set_via_cmdline @@ expression E; @@ - E.DIFF_FROM_CONTENTS + E.diff_from_contents @@ expression E; @@ - E.DIRTY_SUBMODULES + E.dirty_submodules @@ expression E; @@ - E.IGNORE_UNTRACKED_IN_SUBMODULES + E.ignore_untracked_in_submodules @@ expression E; @@ - E.IGNORE_DIRTY_SUBMODULES + E.ignore_dirty_submodules @@ expression E; @@ - E.OVERRIDE_SUBMODULE_CONFIG + E.override_submodule_config @@ expression E; @@ - E.DIRSTAT_BY_LINE + E.dirstat_by_line @@ expression E; @@ - E.FUNCCONTEXT + E.funccontext @@ expression E; @@ - E.PICKAXE_IGNORE_CASE + E.pickaxe_ignore_case @@ expression E; @@ - E.DEFAULT_FOLLOW_RENAMES + E.default_follow_renames Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:51:40 +09:00
Brandon Williams	b2100e5291	diff: remove DIFF_OPT_CLR macro Remove the `DIFF_OPT_CLR` macro and instead set the flags directly. This conversion is done using the following semantic patch: @@ expression E; identifier fld; @@ - DIFF_OPT_CLR(&E, fld) + E.flags.fld = 0 @@ type T; T *ptr; identifier fld; @@ - DIFF_OPT_CLR(ptr, fld) + ptr->flags.fld = 0 Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:51:30 +09:00
Brandon Williams	23dcf77f48	diff: remove DIFF_OPT_SET macro Remove the `DIFF_OPT_SET` macro and instead set the flags directly. This conversion is done using the following semantic patch: @@ expression E; identifier fld; @@ - DIFF_OPT_SET(&E, fld) + E.flags.fld = 1 @@ type T; T *ptr; identifier fld; @@ - DIFF_OPT_SET(ptr, fld) + ptr->flags.fld = 1 Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:50:03 +09:00
Brandon Williams	3b69daed86	diff: remove DIFF_OPT_TST macro Remove the `DIFF_OPT_TST` macro and instead access the flags directly. This conversion is done using the following semantic patch: @@ expression E; identifier fld; @@ - DIFF_OPT_TST(&E, fld) + E.flags.fld @@ type T; T *ptr; identifier fld; @@ - DIFF_OPT_TST(ptr, fld) + ptr->flags.fld Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:50:03 +09:00
Brandon Williams	afa73c5384	diff: add flag to indicate textconv was set via cmdline git-show is unique in that it wants to use textconv by default except for when it is showing blobs. When asked to show a blob, show doesn't want to use textconv unless the user explicitly requested that it be used by providing the command line flag '--textconv'. Currently this is done by using a parallel set of 'touched' flags which get set every time a particular flag is set or cleared. In a future patch we want to eliminate this parallel set of flags so instead of relying on if the textconv flag has been touched, add a new flag 'TEXTCONV_SET_VIA_CMDLINE' which is only set if textconv is set to true via the command line. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:50:02 +09:00
Brandon Williams	02f2f56bc3	diff: convert flags to be stored in bitfields We cannot add many more flags to the diff machinery due to the limitations of the number of flags that can be stored in a single unsigned int. In order to allow for more flags to be added to the diff machinery in the future this patch converts the flags to be stored in bitfields in 'struct diff_flags'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:50:02 +09:00
Andrey Okoshkin	10e0ca843d	diff: fix lstat() error handling in diff_populate_filespec() Add lstat() error handling not only for ENOENT case. Otherwise uninitialised 'struct stat st' variable is used later in case of lstat() non-ENOENT failure which leads to processing of rubbish values of file mode ('S_ISLNK(st.st_mode)' check) or size ('xsize_t(st.st_size)'). Signed-off-by: Andrey Okoshkin <a.okoshkin@samsung.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-29 10:16:36 +09:00
Junio C Hamano	446d12cb3f	xdiff: reassign xpparm_t.flags bits We have packed the bits too tightly in such a way that it is not easy to add a new type of whitespace ignoring option, a new type of LCS algorithm, or a new type of post-cleanup heuristics. Reorder bits a bit to give room for these three classes of options to grow. Also make use of XDF_WHITESPACE_FLAGS macro where we check any of these bits are on, instead of using DIFF_XDL_TST() macro on individual possibilities. That way, the "is any of the bits on?" code does not have to change when we add more ways to ignore whitespaces. While at it, add a comment in front of the bit definitions to clarify in which structure these defined bits may appear. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-27 15:57:30 +09:00
Stefan Beller	01be97c2b2	diff.c: get rid of duplicate implementation The implementations in diff.c to detect moved lines needs to compare strings and hash strings, which is implemented in that file, as well as in the xdiff library. Remove the rather recent implementation in diff.c and rely on the well exercised code in the xdiff lib. With this change the hash used for bucketing the strings for the moved line detection changes from FNV32 (that is provided via the hashmaps memhash) to DJB2 (which is used internally in xdiff). Benchmarks found on the web[1] do not indicate that these hashes are different in performance for readable strings. [1] https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-26 11:23:32 +09:00
Jeff King	b66b507292	diff: handle NULs in get_string_hash() For computing moved lines, we feed the characters of each line into a hash. When we've been asked to ignore whitespace, then we pick each character using next_byte(), which returns -1 on end-of-string, which it determines using the start/end pointers we feed it. However our check of its return value treats "0" the same as "-1", meaning we'd quit if the string has an embedded NUL. This is unlikely to ever come up in practice since our line boundaries generally come from calling strlen() in the first place. But it was a bit surprising to me as a reader of the next_byte() code. And it's possible that we may one day feed this function with more exotic input, which otherwise works with arbitrary ptr/len pairs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-21 21:12:53 +09:00
Jeff King	da58318e76	diff: fix whitespace-skipping with --color-moved The code for handling whitespace with --color-moved represents partial strings as a pair of pointers. There are two possible conventions for the end pointer: 1. It points to the byte right after the end of the string. 2. It points to the final byte of the string. But we seem to use both conventions in the code: a. we assign the initial pointers from the NUL-terminated string using (1) b. we eat trailing whitespace by checking the second pointer for isspace(), which needs (2) c. the next_byte() function checks for end-of-string with "if (cp > endp)", which is (2) d. in next_byte() we skip past internal whitespace with "while (cp < end)", which is (1) This creates fewer bugs than you might think, because there are some subtle interactions. Because of (a) and (c), we always return the NUL-terminator from next_byte(). But all of the callers of next_byte() happen to handle that gracefully. Because of the mismatch between (d) and (c), next_byte() could accidentally return a whitespace character right at endp. But because of the interaction of (a) and (b), we fail to actually chomp trailing whitespace, meaning our endp _always_ points to a NUL, canceling out the problem. But that does leave (b) as a real bug: when ignoring whitespace only at the end-of-line, we don't correctly trim it, and fail to match up lines. We can fix the whole thing by moving consistently to one convention. Since convention (1) is idiomatic in our code base, we'll pick that one. The existing "-w" and "-b" tests continue to pass, and a new "--ignore-space-at-eol" shows off the breakage we're fixing. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-21 21:12:35 +09:00
Junio C Hamano	1c0b983a77	Merge branch 'jk/ref-filter-colors-fix' This is the "theoretically more correct" approach of simply stepping back to the state before plumbing commands started paying attention to "color.ui" configuration variable. Let's run with this one. * jk/ref-filter-colors-fix: tag: respect color.ui config Revert "color: check color.ui in git_default_config()" Revert "t6006: drop "always" color config tests" Revert "color: make "always" the same as "auto" in config"	2017-10-18 10:19:08 +09:00
Jeff King	33c643bb08	Revert "color: check color.ui in git_default_config()" This reverts commit `136c8c8b8f`. That commit was trying to address a bug caused by `4c7f1819b3` (make color.ui default to 'auto', 2013-06-10), in which plumbing like diff-tree defaulted to "auto" color, but did not respect a "color.ui" directive to disable it. But it also meant that we started respecting "color.ui" set to "always". This was a known problem, but `4c7f1819b3` argued that nobody ought to be doing that. However, that turned out to be wrong, and we got a number of bug reports related to "add -p" regressing in v2.14.2. Let's revert `136c8c8b8`, fixing the regression to "add -p". This leaves the problem from `4c7f1819b3` unfixed, but: 1. It's a pretty obscure problem in the first place. I only noticed it while working on the color code, and we haven't got a single bug report or complaint about it. 2. We can make a more moderate fix on top by respecting "never" but not "always" for plumbing commands. This is just the minimal fix to go back to the working state we had before v2.14.2. Note that this isn't a pure revert. We now have a test in t3701 which shows off the "add -p" regression. This can be flipped to success. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-17 15:09:52 +09:00
Junio C Hamano	91ccfb8517	Merge branch 'sb/diff-color-move' A recently added "--color-moved" feature of "diff" fell into infinite loop when ignoring whitespace changes, which has been fixed. * sb/diff-color-move: diff: fix infinite loop with --color-moved --ignore-space-change	2017-10-17 13:29:19 +09:00
Jeff King	fa5ba2c1dd	diff: fix infinite loop with --color-moved --ignore-space-change The --color-moved code uses next_byte() to advance through the blob contents. When the user has asked to ignore whitespace changes, we try to collapse any whitespace change down to a single space. However, we enter the conditional block whenever we see the IGNORE_WHITESPACE_CHANGE flag, even if the next byte isn't whitespace. This means that the combination of "--color-moved and --ignore-space-change" was completely broken. Worse, because we return from next_byte() without having advanced our pointer, the function makes no forward progress in the buffer and loops infinitely. Fix this by entering the conditional only when we actually see whitespace. We can apply this also to the IGNORE_WHITESPACE change. That code path isn't buggy (because it falls through to returning the next non-whitespace byte), but it makes the logic more clear if we only bother to look at whitespace flags after seeing that the next byte is whitespace. Reported-by: Orgad Shaneh <orgads@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-16 11:57:45 +09:00
Junio C Hamano	98c57ea6f0	Merge branch 'sb/diff-color-move' The output from "git diff --summary" was broken in a recent topic that has been merged to 'master' and lost a LF after reporting of mode change. This has been fixed. * sb/diff-color-move: diff: correct newline in summary for renamed files	2017-10-03 15:42:49 +09:00
Junio C Hamano	14a8168e2f	Merge branch 'rj/no-sign-compare' Many codepaths have been updated to squelch -Wsign-compare warnings. * rj/no-sign-compare: ALLOC_GROW: avoid -Wsign-compare warnings cache.h: hex2chr() - avoid -Wsign-compare warnings commit-slab.h: avoid -Wsign-compare warnings git-compat-util.h: xsize_t() - avoid -Wsign-compare warnings	2017-09-29 11:23:42 +09:00
Stefan Beller	58aaced444	diff: correct newline in summary for renamed files In `146fdb0dfe` (diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY, 2017-06-29), the conversion from direct printing to the symbol emission dropped the new line character for renamed, copied and rewritten files. Add the emission of a newline, add a test for this case. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-28 13:15:59 +09:00
Junio C Hamano	c50424a6f0	Merge branch 'jk/write-in-full-fix' Many codepaths did not diagnose write failures correctly when disks go full, due to their misuse of write_in_full() helper function, which have been corrected. * jk/write-in-full-fix: read_pack_header: handle signed/unsigned comparison in read result config: flip return value of store_write_*() notes-merge: use ssize_t for write_in_full() return value pkt-line: check write_in_full() errors against "< 0" convert less-trivial versions of "write_in_full() != len" avoid "write_in_full(fd, buf, len) != len" pattern get-tar-commit-id: check write_in_full() return against 0 config: avoid "write_in_full(fd, buf, len) < len" pattern	2017-09-25 15:24:06 +09:00
Ramsay Jones	071bcaab64	ALLOC_GROW: avoid -Wsign-compare warnings Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-22 13:21:11 +09:00
Junio C Hamano	d811ba1897	Merge branch 'rs/strbuf-leakfix' Many leaks of strbuf have been fixed. * rs/strbuf-leakfix: (34 commits) wt-status: release strbuf after use in wt_longstatus_print_tracking() wt-status: release strbuf after use in read_rebase_todolist() vcs-svn: release strbuf after use in end_revision() utf8: release strbuf on error return in strbuf_utf8_replace() userdiff: release strbuf after use in userdiff_get_textconv() transport-helper: release strbuf after use in process_connect_service() sequencer: release strbuf after use in save_head() shortlog: release strbuf after use in insert_one_record() sha1_file: release strbuf on error return in index_path() send-pack: release strbuf on error return in send_pack() remote: release strbuf after use in set_url() remote: release strbuf after use in migrate_file() remote: release strbuf after use in read_remote_branches() refs: release strbuf on error return in write_pseudoref() notes: release strbuf after use in notes_copy_from_stdin() merge: release strbuf after use in write_merge_heads() merge: release strbuf after use in save_state() mailinfo: release strbuf on error return in handle_boundary() mailinfo: release strbuf after use in handle_from() help: release strbuf on error return in exec_woman_emacs() ...	2017-09-19 10:47:57 +09:00
Jeff King	06f46f237a	avoid "write_in_full(fd, buf, len) != len" pattern The return value of write_in_full() is either "-1", or the requested number of bytes[1]. If we make a partial write before seeing an error, we still return -1, not a partial value. This goes back to `f6aa66cb95` (write_in_full: really write in full or return error on disk full., 2007-01-11). So checking anything except "was the return value negative" is pointless. And there are a couple of reasons not to do so: 1. It can do a funny signed/unsigned comparison. If your "len" is unsigned (e.g., a size_t) then the compiler will promote the "-1" to its unsigned variant. This works out for "!= len" (unless you really were trying to write the maximum size_t bytes), but is a bug if you check "< len" (an example of which was fixed recently in config.c). We should avoid promoting the mental model that you need to check the length at all, so that new sites are not tempted to copy us. 2. Checking for a negative value is shorter to type, especially when the length is an expression. 3. Linus says so. In `d34cf19b89` (Clean up write_in_full() users, 2007-01-11), right after the write_in_full() semantics were changed, he wrote: I really wish every "write_in_full()" user would just check against "<0" now, but this fixes the nasty and stupid ones. Appeals to authority aside, this makes it clear that writing it this way does not have an intentional benefit. It's a historical curiosity that we never bothered to clean up (and which was undoubtedly cargo-culted into new sites). So let's convert these obviously-correct cases (this includes write_str_in_full(), which is just a wrapper for write_in_full()). [1] A careful reader may notice there is one way that write_in_full() can return a different value. If we ask write() to write N bytes and get a return value that is _larger_ than N, we could return a larger total. But besides the fact that this would imply a totally broken version of write(), it would already invoke undefined behavior. Our internal remaining counter is an unsigned size_t, which means that subtracting too many bytes will wrap it around to a very large number. So we'll instantly begin reading off the end of the buffer, trying to write gigabytes (or petabytes) of data. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-14 15:17:59 +09:00
Rene Scharfe	5a612017eb	diff: release strbuf after use in show_stats() Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-07 08:49:27 +09:00
Rene Scharfe	348eda249e	diff: release strbuf after use in show_rename_copy() Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-07 08:49:27 +09:00
Rene Scharfe	fa842d843d	diff: release strbuf after use in diff_summary() Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-07 08:49:27 +09:00
Jeff King	076aa2cbda	tempfile: auto-allocate tempfiles on heap The previous commit taught the tempfile code to give up ownership over tempfiles that have been renamed or deleted. That makes it possible to use a stack variable like this: struct tempfile t; create_tempfile(&t, ...); ... if (!err) rename_tempfile(&t, ...); else delete_tempfile(&t); But doing it this way has a high potential for creating memory errors. The tempfile we pass to create_tempfile() ends up on a global linked list, and it's not safe for it to go out of scope until we've called one of those two deactivation functions. Imagine that we add an early return from the function that forgets to call delete_tempfile(). With a static or heap tempfile variable, the worst case is that the tempfile hangs around until the program exits (and some functions like setup_shallow_temporary rely on this intentionally, creating a tempfile and then leaving it for later cleanup). But with a stack variable as above, this is a serious memory error: the variable goes out of scope and may be filled with garbage by the time the tempfile code looks at it. Let's see if we can make it harder to get this wrong. Since many callers need to allocate arbitrary numbers of tempfiles, we can't rely on static storage as a general solution. So we need to turn to the heap. We could just ask all callers to pass us a heap variable, but that puts the burden on them to call free() at the right time. Instead, let's have the tempfile code handle the heap allocation _and_ the deallocation (when the tempfile is deactivated and removed from the list). This changes the return value of all of the creation functions. For the cleanup functions (delete and rename), we'll add one extra bit of safety: instead of taking a tempfile pointer, we'll take a pointer-to-pointer and set it to NULL after freeing the object. This makes it safe to double-call functions like delete_tempfile(), as the second call treats the NULL input as a noop. Several callsites follow this pattern. The resulting patch does have a fair bit of noise, as each caller needs to be converted to handle: 1. Storing a pointer instead of the struct itself. 2. Passing the pointer instead of taking the struct address. 3. Handling a "struct tempfile *" return instead of a file descriptor. We could play games to make this less noisy. For example, by defining the tempfile like this: struct tempfile { struct heap_allocated_part_of_tempfile { int fd; ...etc } *actual_data; } Callers would continue to have a "struct tempfile", and it would be "active" only when the inner pointer was non-NULL. But that just makes things more awkward in the long run. There aren't that many callers, so we can simply bite the bullet and adjust all of them. And the compiler makes it easy for us to find them all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-06 17:19:54 +09:00
Jeff King	49bd0fc222	tempfile: do not delete tempfile on failed close When close_tempfile() fails, we delete the tempfile and reset the fields of the tempfile struct. This makes it easier for callers to return without cleaning up, but it also makes this common pattern: if (close_tempfile(tempfile)) return error_errno("error closing %s", tempfile->filename.buf); wrong, because the "filename" field has been reset after the failed close. And it's not easy to fix, as in many cases we don't have another copy of the filename (e.g., if it was created via one of the mks_tempfile functions, and we just have the original template string). Let's drop the feature that a failed close automatically deletes the file. This puts the burden on the caller to do the deletion themselves, but this isn't that big a deal. Callers which do: if (write(...) \|\| close_tempfile(...)) { delete_tempfile(...); return -1; } already had to call delete when the write() failed, and so aren't affected. Likewise, any caller which just calls die() in the error path is OK; we'll delete the tempfile during the atexit handler. Because this patch changes the semantics of close_tempfile() without changing its signature, all callers need to be manually checked and converted to the new scheme. This patch covers all in-tree callers, but there may be others for not-yet-merged topics. To catch these, we rename the function to close_tempfile_gently(), which will attract compile-time attention to new callers. (Technically the original could be considered "gentle" already in that it didn't die() on errors, but this one is even more so). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-06 17:19:53 +09:00
Jeff King	45c6b1ed24	always check return value of close_tempfile If close_tempfile() encounters an error, then it deletes the tempfile and resets the "struct tempfile". But many code paths ignore the return value and continue to use the tempfile. Instead, we should generally treat this the same as a write() error. Note that in the postimage of some of these cases our error message will be bogus after a failed close because we look at tempfile->filename (either directly or via get_tempfile_path). But after the failed close resets the tempfile object, this is guaranteed to be the empty string. That will be addressed in a future patch (because there are many more cases of the same problem than just these instances). Note also in the hunk in gpg-interface.c that it's fine to call delete_tempfile() in the error path, even if close_tempfile() failed and already deleted the file. The tempfile code is smart enough to know the second deletion is a noop. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-06 17:19:53 +09:00
Junio C Hamano	eabdcd4ab4	Merge branch 'jt/packmigrate' Code movement to make it easier to hack later. * jt/packmigrate: (23 commits) pack: move for_each_packed_object() pack: move has_pack_index() pack: move has_sha1_pack() pack: move find_pack_entry() and make it global pack: move find_sha1_pack() pack: move find_pack_entry_one(), is_pack_valid() pack: move check_pack_index_ptr(), nth_packed_object_offset() pack: move nth_packed_object_{sha1,oid} pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry() pack: move unpack_object_header() pack: move get_size_from_delta() pack: move unpack_object_header_buffer() pack: move {,re}prepare_packed_git and approximate_object_count pack: move install_packed_git() pack: move add_packed_git() pack: move unuse_pack() pack: move use_pack() pack: move pack-closing functions pack: move release_pack_memory() pack: move open_pack_index(), parse_pack_index() ...	2017-08-26 22:55:09 -07:00
Junio C Hamano	614ea03a71	Merge branch 'bw/submodule-config-cleanup' Code clean-up to avoid mixing values read from the .gitmodules file and values read from the .git/config file. * bw/submodule-config-cleanup: submodule: remove gitmodules_config unpack-trees: improve loading of .gitmodules submodule-config: lazy-load a repository's .gitmodules file submodule-config: move submodule-config functions to submodule-config.c submodule-config: remove support for overlaying repository config diff: stop allowing diff to have submodules configured in .git/config submodule: remove submodule_config callback routine unpack-trees: don't respect submodule.update submodule: don't rely on overlayed config when setting diffopts fetch: don't overlay config with submodule-config submodule--helper: don't overlay config in update-clone submodule--helper: don't overlay config in remote_submodule_branch add, reset: ensure submodules can be added or reset submodule: don't use submodule_from_name t7411: check configuration parsing errors	2017-08-26 22:55:08 -07:00
Junio C Hamano	6b8aa3294e	Merge branch 'po/object-id' * po/object-id: sha1_file: convert index_stream to struct object_id sha1_file: convert hash_sha1_file_literally to struct object_id sha1_file: convert index_fd to struct object_id sha1_file: convert index_path to struct object_id read-cache: convert to struct object_id builtin/hash-object: convert to struct object_id	2017-08-26 22:55:07 -07:00
Junio C Hamano	0b96358479	Merge branch 'jt/diff-color-move-fix' A handful of bugfixes and an improvement to "diff --color-moved". * jt/diff-color-move-fix: diff: define block by number of alphanumeric chars diff: respect MIN_BLOCK_LENGTH for last block diff: avoid redundantly clearing a flag	2017-08-26 22:55:04 -07:00
Junio C Hamano	b6c4058f97	Merge branch 'sb/diff-color-move' "git diff" has been taught to optionally paint new lines that are the same as deleted lines elsewhere differently from genuinely new lines. * sb/diff-color-move: (25 commits) diff: document the new --color-moved setting diff.c: add dimming to moved line detection diff.c: color moved lines differently, plain mode diff.c: color moved lines differently diff.c: buffer all output if asked to diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY diff.c: emit_diff_symbol learns about DIFF_SYMBOL_STAT_SEP diff.c: convert word diffing to use emit_diff_symbol diff.c: convert show_stats to use emit_diff_symbol diff.c: convert emit_binary_diff_body to use emit_diff_symbol submodule.c: migrate diff output to use emit_diff_symbol diff.c: emit_diff_symbol learns DIFF_SYMBOL_REWRITE_DIFF diff.c: emit_diff_symbol learns about DIFF_SYMBOL_BINARY_FILES diff.c: emit_diff_symbol learns DIFF_SYMBOL_HEADER diff.c: emit_diff_symbol learns DIFF_SYMBOL_FILEPAIR_{PLUS, MINUS} diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_INCOMPLETE diff.c: emit_diff_symbol learns DIFF_SYMBOL_WORDS[_PORCELAIN] diff.c: migrate emit_line_checked to use emit_diff_symbol diff.c: emit_diff_symbol learns DIFF_SYMBOL_NO_LF_EOF diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_FRAGINFO ...	2017-08-26 22:55:03 -07:00
Jonathan Tan	150e3001d0	pack: move has_sha1_pack() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Patryk Obara	98e019b067	sha1_file: convert index_path to struct object_id Convert all remaining callers as well. Signed-off-by: Patryk Obara <patryk.obara@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-20 21:51:38 -07:00
Junio C Hamano	08a8509e50	diff: retire sane_truncate_fn Long time ago, `23707811` ("diff: do not chomp hunk-header in the middle of a character", 2008-01-02) introduced sane_truncate_line() helper function to trim the "function header" line that is shown at the end of the hunk header line, in order to avoid chomping it in the middle of a single UTF-8 character. It also added a facility to define a custom callback function to make it possible to extend it to non UTF-8 encodings. During the following 8 1/2 years, nobody found need for this custom callback facility. A custom callback function is a wrong design to use here anyway---if your contents need support for non UTF-8 encoding, you shouldn't have to write a custom function and recompile Git to plumb it in. A better approach would be to extend sane_truncate_line() function and have a new member in emit_callback to conditionally trigger it. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-17 14:38:22 -07:00
Jonathan Tan	f0b8fb6e59	diff: define block by number of alphanumeric chars The existing behavior of diff --color-moved=zebra does not define the minimum size of a block at all, instead relying on a heuristic applied later to filter out sets of adjacent moved lines that are shorter than 3 lines long. This can be confusing, because a block could thus be colored as moved at the source but not at the destination (or vice versa), depending on its neighbors. Instead, teach diff that the minimum size of a block is 20 alphanumeric characters, the same heuristic used by "git blame". This allows diff to still exclude uninteresting lines appearing on their own (such as those solely consisting of one or a few closing braces), as was the intention of the adjacent-moved-line heuristic. This requires a change in some tests in that some of their lines are no longer considered to be part of a block, because they are too short. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-16 11:44:00 -07:00
Jonathan Tan	09153277f8	diff: respect MIN_BLOCK_LENGTH for last block Currently, MIN_BLOCK_LENGTH is only checked when diff encounters a line that does not belong to the current block. In particular, this means that MIN_BLOCK_LENGTH is not checked after all lines are encountered. Perform that check. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-16 11:44:00 -07:00
Jonathan Tan	23b65f9528	diff: avoid redundantly clearing a flag No code in diff.c sets DIFF_SYMBOL_MOVED_LINE except in mark_color_as_moved(), so it is redundant to clear it for the current line. Therefore, clear it only for previous lines. This makes a refactoring in a subsequent patch easier. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-14 12:28:36 -07:00
Brandon Williams	078b75e99b	diff: stop allowing diff to have submodules configured in .git/config Traditionally a submodule is comprised of a gitlink as well as a corresponding entry in the .gitmodules file. Diff doesn't follow this paradigm as its config callback routine falls back to populating the submodule-config if a config entry starts with 'submodule.'. Remove this behavior in order to be consistent with how the submodule-config is populated, via calling 'gitmodules_config()' or 'repo_read_gitmodules()'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-03 13:11:01 -07:00
Jeff King	136c8c8b8f	color: check color.ui in git_default_config() Back in prehistoric times, our decision on whether or not to show color by default relied on using a config callback that either did or didn't load color config like color.diff. When we introduced color.ui, we put it in the same boat: commands had to manually respect it by using git_color_config() or its git_color_default_config() convenience wrapper. But in `4c7f1819b` (make color.ui default to 'auto', 2013-06-10), that changed. Since then, we default color.ui to auto in all programs, meaning that even plumbing commands like "git diff-tree --pretty" might colorize the output. Nobody seems to have complained in the intervening years, presumably because the "is stdout a tty" check does a good job of catching the right cases. But that leaves an interesting curiosity: color.ui defaults to auto even in plumbing, but you can't actually _disable_ the color via config. So if you really hate color and set "color.ui" to false, diff-tree will still show color (but porcelain like git-diff won't). Nobody noticed that either, probably because very few people disable color. One could argue that the plumbing should _always_ disable color unless an explicit --color option is given on the command line. But in practice, this creates a lot of complications for scripts which do want plumbing to show user-visible output. They can't just pass "--color" blindly; they need to check the user's config and decide what to send. Given that nobody has complained about the current behavior, let's assume it's a good path, and follow it to its conclusion: supporting color.ui everywhere. Note that you can create havoc by setting color.ui=always in your config, but that's more or less already the case. We could disallow it entirely, but it is handy for one-offs like: git -c color.ui=always foo >not-a-tty when "foo" does not take a --color option itself. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-07-13 12:42:51 -07:00
Junio C Hamano	f056cde60e	Merge branch 'rs/use-div-round-up' Code cleanup. * rs/use-div-round-up: use DIV_ROUND_UP	2017-07-12 15:18:23 -07:00
René Scharfe	42c78a216e	use DIV_ROUND_UP Convert code that divides and rounds up to use DIV_ROUND_UP to make the intent clearer and reduce the number of magic constants. Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-07-10 14:24:36 -07:00
Stefan Beller	86b452e276	diff.c: add dimming to moved line detection Any lines inside a moved block of code are not interesting. Boundaries of blocks are only interesting if they are next to another block of moved code. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:59:42 -07:00
Stefan Beller	176841f0c9	diff.c: color moved lines differently, plain mode Add the 'plain' mode for move detection of code. This omits the checking for adjacent blocks, so it is not as useful. If you have a lot of the same blocks moved in the same patch, the 'Zebra' would end up slow as it is O(n^2) (n is number of same blocks). So this may be useful there and is generally easy to add. Instead be very literal at the move detection, do not skip over short blocks here. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:59:42 -07:00
Stefan Beller	2e2d5ac184	diff.c: color moved lines differently When a patch consists mostly of moving blocks of code around, it can be quite tedious to ensure that the blocks are moved verbatim, and not undesirably modified in the move. To that end, color blocks that are moved within the same patch differently. For example (OM, del, add, and NM are different colors): [OM] -void sensitive_stuff(void) [OM] -{ [OM] - if (!is_authorized_user()) [OM] - die("unauthorized"); [OM] - sensitive_stuff(spanning, [OM] - multiple, [OM] - lines); [OM] -} void another_function() { [del] - printf("foo"); [add] + printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NM] + if (!is_authorized_user()) [NM] + die("unauthorized"); [NM] + sensitive_stuff(spanning, [NM] + multiple, [NM] + lines); [NM] +} However adjacent blocks may be problematic. For example, in this potentially malicious patch, the swapping of blocks can be spotted: [OM] -void sensitive_stuff(void) [OM] -{ [OMA] - if (!is_authorized_user()) [OMA] - die("unauthorized"); [OM] - sensitive_stuff(spanning, [OM] - multiple, [OM] - lines); [OMA] -} void another_function() { [del] - printf("foo"); [add] + printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NMA] + sensitive_stuff(spanning, [NMA] + multiple, [NMA] + lines); [NM] + if (!is_authorized_user()) [NM] + die("unauthorized"); [NMA] +} If the moved code is larger, it is easier to hide some permutation in the code, which is why some alternative coloring is needed. This patch implements the first mode: * basic alternating 'Zebra' mode This conveys all information needed to the user. Defer customization to later patches. First I implemented an alternative design, which would try to fingerprint a line by its neighbors to detect if we are in a block or at the boundary. This idea is error prone as it inspected each line and its neighboring lines to determine if the line was (a) moved and (b) if it was deep inside a hunk by having matching neighboring lines. This is unreliable as we can construct hunks which have equal neighbors that just exceed the number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter as a line, that is permutated to 'AXYZCXYZBXYZD..'). Instead this provides a dynamic programming greedy algorithm that finds the largest moved hunk and then has several modes on highlighting bounds. A note on the options '--submodule=diff' and '--color-words/--word-diff': In the conversion to use emit_line in the prior patches both submodules as well as word diff output carefully chose to call emit_line with sign=0. All output with sign=0 is ignored for move detection purposes in this patch, such that no weird looking output will be generated for these cases. This leads to another thought: We could pass on '--color-moved' to submodules such that they color up moved lines for themselves. If we'd do so only line moves within a repository boundary are marked up. It is useful to have moved lines colored, but there are annoying corner cases, such as a single line moved, that is very common. For example in a typical patch of C code, we have closing braces that end statement blocks or functions. While it is technically true that these lines are moved as they show up elsewhere, it is harmful for the review as the reviewer's attention is drawn to such a minor side annoyance. For now let's have a simple solution of hardcoding the number of moved lines to be at least 3 before coloring them. Note, that the length is applied across all blocks to find the 'lonely' blocks that pollute new code, but do not interfere with a permutated block where each permutation has fewer than 3 lines. Helped-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:59:42 -07:00
Stefan Beller	e6e045f803	diff.c: buffer all output if asked to Introduce a new option 'emitted_symbols' in the struct diff_options which controls whether all output is buffered up until all output is available. It is set internally in diff.c when necessary. We'll have a new struct 'emitted_string' in diff.c which will be used to buffer each line. The emitted_string will duplicate the memory of the line to buffer as that is easiest to reason about for now. In a future patch we may want to decrease the memory usage by not duplicating all output for buffering but rather we may want to store offsets into the file or in case of hunk descriptions such as the similarity score, we could just store the relevant number and reproduce the text later on. This approach was chosen as a first step because it is quite simple compared to the alternative with less memory footprint. emit_diff_symbol factors out the emission part and depending on the diff_options->emitted_symbols the emission will be performed directly when calling emit_diff_symbol or after the whole process is done, i.e. by buffering we add the possibility for a second pass over the whole output before doing the actual output. In `6440d34` (2012-03-14, diff: tweak a _copy_ of diff_options with word-diff) we introduced a duplicate diff options struct for word emissions as we may have different regex settings in there. When buffering the output, we need to operate on just one buffer, so we have to copy back the emissions of the word buffer into the main buffer. Unconditionally enable output via buffer in this patch as it yields a great opportunity for testing, i.e. all the diff tests from the test suite pass without having reordering issues (i.e. only parts of the output got buffered, and we forgot to buffer other parts). The test suite passes, which gives confidence that we converted all functions to use emit_string for output. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	146fdb0dfe	diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	30b7e1e7ef	diff.c: emit_diff_symbol learns about DIFF_SYMBOL_STAT_SEP Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	bd033291d5	diff.c: convert word diffing to use emit_diff_symbol The word diffing is not line oriented and would need some serious effort to be transformed into a line oriented approach, so just go with a symbol DIFF_SYMBOL_WORD_DIFF that is a partial line. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	0911c475c8	diff.c: convert show_stats to use emit_diff_symbol We call print_stat_summary from builtin/apply, so we still need the version with a file pointer, so introduce print_stat_summary_0 that uses emit_string machinery and keep print_stat_summary with the same arguments around. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	4eed0ebd4d	diff.c: convert emit_binary_diff_body to use emit_diff_symbol Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	f3597138df	submodule.c: migrate diff output to use emit_diff_symbol As the submodule process is no longer attached to the same file pointer 'o->file' as the superproject's process, there is a different result in color.c::check_auto_color. That is why we need to pass coloring explicitly, such that the submodule coloring decision will be made by the child process processing the submodule. Only DIFF_SYMBOL_SUBMODULE_PIPETHROUGH contains color, the other symbols are for embedding the submodule output into the superproject's output. Remove the colors from the function signatures, as all the coloring decisions will be made either inside the child process or the final emit_diff_symbol, but not in the functions driving the submodule diff. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	5af6ea957c	diff.c: emit_diff_symbol learns DIFF_SYMBOL_REWRITE_DIFF Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	4acaaa7af6	diff.c: emit_diff_symbol learns about DIFF_SYMBOL_BINARY_FILES We could save a little bit of memory when buffering in a later mode by just passing the inner part ("%s and %s", file1, file2), but those are just a few bytes, so instead let's reuse the implementation from DIFF_SYMBOL_HEADER and keep the whole line around. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	a29b0a13bd	diff.c: emit_diff_symbol learns DIFF_SYMBOL_HEADER The header is constructed lazily including line breaks, so just emit the raw string as is. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	3ee8b7bfe4	diff.c: emit_diff_symbol learns DIFF_SYMBOL_FILEPAIR_{PLUS, MINUS} We have to use fprintf instead of emit_line, because we want to emit the tab after the color. This is important for ancient versions of gnu patch AFAICT, although we probably do not want to feed colored output to the patch utility, such that it would not matter if the trailing tab is colored. Keep the corner case as-is though. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	f2bb1218f1	diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_INCOMPLETE The context marker uses the exact same output pattern, so reuse it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	ff958679cd	diff.c: emit_diff_symbol learns DIFF_SYMBOL_WORDS[_PORCELAIN] Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	091f8e28b4	diff.c: migrate emit_line_checked to use emit_diff_symbol Add a new flags field to emit_diff_symbol, that will be used by context lines for: * white space rules that are applicable (The first 12 bits) Take a note in cache.h as well, when these ws rules are extended we have to fix the bits in the flags field. * how the rules are evaluated (actually this double encodes the sign of the line, but the code is easier to keep this way, bits 13,14,15) * if the line is a blank line at EOF (bit 16) The check if new lines need to be marked up as extra lines at the end of file, is now done unconditionally. That should be ok, as 'new_blank_line_at_eof' has a quick early return. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	b9cbfde6b1	diff.c: emit_diff_symbol learns DIFF_SYMBOL_NO_LF_EOF Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	68abc6f1c7	diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_FRAGINFO Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	c64b420b4c	diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_MARKER Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	36a4cefdf4	diff.c: introduce emit_diff_symbol In a later patch we want to buffer all output before emitting it as a new feature ("markup moved lines") conceptually cannot be implemented in a single pass over the output. There are different approaches to buffer all output such as: * Buffering on the char level, i.e. we'd have a char[] which would grow at approximately 80 characters a line. This would keep the output completely unstructured, but might be very easy to implement, such as redirecting all output to a temporary file and working off that. The later passes over the buffer are quite complicated though, because we have to parse back any output and then decide if it should be modified. * Buffer on a line level. As the output is mostly line oriented already, this would make sense, but it still is a bit awkward as we'd have to make sense of it again by looking at the first characters of a line to decide what part of a diff a line is. * Buffer semantically. Imagine there is a formal grammar for the diff output and we'd keep the symbols of this grammar around. This keeps the highest level of structure in the buffered data, such that the actual memory requirements are less than say the first option. Instead of buffering the characters of the line, we'll buffer what we intend to do plus additional information for the specifics. An output of diff --git a/new.txt b/new.txt index fa69b07..412428c 100644 Binary files a/new.txt and b/new.txt differ could be buffered as DIFF_SYMBOL_DIFF_START + new.txt DIFF_SYMBOL_INDEX_MODE + fa69b07 412428c "non-executable" flag DIFF_SYMBOL_BINARY_FILES + new.txt This and the following patches introduce the third option of buffering by first moving any output to emit_diff_symbol, and then introducing the buffering in this function. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	ec33150671	diff.c: factor out diff_flush_patch_all_file_pairs In a later patch we want to do more things before and after all filepairs are flushed. So factor flushing out all file pairs into its own function so that the new code can be plugged in easily. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	dfb7728f63	diff.c: move line ending check into emit_hunk_header The emit_hunk_header() function is responsible for assembling a hunk header and calling emit_line() to send the hunk header to the output file. Its only caller fn_out_consume() needs to prepare for a case where the function emits an incomplete line and add the terminating LF. Instead, make sure emit_hunk_header() always sends a completed line to emit_line(). Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	f2d2a5def0	diff.c: readability fix We already have dereferenced 'p->two' into a local variable 'two'. Use that. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Junio C Hamano	50f03c6676	Merge branch 'ab/free-and-null' A common pattern to free a piece of memory and assign NULL to the pointer that used to point at it has been replaced with a new FREE_AND_NULL() macro. * ab/free-and-null: *.[ch] refactoring: make use of the FREE_AND_NULL() macro coccinelle: make use of the "expression" FREE_AND_NULL() rule coccinelle: add a rule to make "expression" code use FREE_AND_NULL() coccinelle: make use of the "type" FREE_AND_NULL() rule coccinelle: add a rule to make "type" code use FREE_AND_NULL() git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL	2017-06-24 14:28:41 -07:00
Junio C Hamano	f31d23a399	Merge branch 'bw/config-h' Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file. * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h	2017-06-24 14:28:41 -07:00
Junio C Hamano	5812b3f73b	Merge branch 'bw/ls-files-sans-the-index' Code clean-up. * bw/ls-files-sans-the-index: ls-files: factor out tag calculation ls-files: factor out debug info into a function ls-files: convert show_files to take an index ls-files: convert show_ce_entry to take an index ls-files: convert prune_cache to take an index ls-files: convert ce_excluded to take an index ls-files: convert show_ru_info to take an index ls-files: convert show_other_files to take an index ls-files: convert show_killed_files to take an index ls-files: convert write_eolinfo to take an index ls-files: convert overlay_tree_on_cache to take an index tree: convert read_tree to take an index parameter convert: convert renormalize_buffer to take an index convert: convert convert_to_git to take an index convert: convert convert_to_git_filter_fd to take an index convert: convert crlf_to_git to take an index convert: convert get_cached_convert_stats_ascii to take an index	2017-06-24 14:28:40 -07:00
Junio C Hamano	a6f38c109b	Merge branch 'bw/object-id' Conversion from uchar[20] to struct object_id continues. * bw/object-id: (33 commits) diff: rename diff_fill_sha1_info to diff_fill_oid_info diffcore-rename: use is_empty_blob_oid tree-diff: convert path_appendnew to object_id tree-diff: convert diff_tree_paths to struct object_id tree-diff: convert try_to_follow_renames to struct object_id builtin/diff-tree: cleanup references to sha1 diff-tree: convert diff_tree_sha1 to struct object_id notes-merge: convert write_note_to_worktree to struct object_id notes-merge: convert verify_notes_filepair to struct object_id notes-merge: convert find_notes_merge_pair_ps to struct object_id notes-merge: convert merge_from_diffs to struct object_id notes-merge: convert notes_merge* to struct object_id tree-diff: convert diff_root_tree_sha1 to struct object_id combine-diff: convert find_paths_* to struct object_id combine-diff: convert diff_tree_combined to struct object_id diff: convert diff_flush_patch_id to struct object_id patch-ids: convert to struct object_id diff: finish conversion for prepare_temp_file to struct object_id diff: convert reuse_worktree_file to struct object_id diff: convert fill_filespec to struct object_id ...	2017-06-19 12:38:44 -07:00
Ævar Arnfjörð Bjarmason	6a83d90207	coccinelle: make use of the "type" FREE_AND_NULL() rule Apply the result of the just-added coccinelle rule. This manually excludes a few occurrences, mostly things that resulted in many FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent change. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-16 12:44:03 -07:00
Brandon Williams	b2141fc1d2	config: don't include config.h by default Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-15 12:56:22 -07:00
Junio C Hamano	b9a7d55d93	Merge branch 'nd/fopen-errors' We often try to open a file for reading whose existence is optional, and silently ignore errors from open/fopen; report such errors if they are not due to missing files. * nd/fopen-errors: mingw_fopen: report ENOENT for invalid file names mingw: verify that paths are not mistaken for remote nicknames log: fix memory leak in open_next_file() rerere.c: move error_errno() closer to the source system call print errno when reporting a system call error wrapper.c: make warn_on_inaccessible() static wrapper.c: add and use fopen_or_warn() wrapper.c: add and use warn_on_fopen_errors() config.mak.uname: set FREAD_READS_DIRECTORIES for Darwin, too config.mak.uname: set FREAD_READS_DIRECTORIES for Linux and FreeBSD clone: use xfopen() instead of fopen() use xfopen() in more places git_fopen: fix a sparse 'not declared' warning	2017-06-13 13:47:09 -07:00
Brandon Williams	82b474e025	convert: convert convert_to_git to take an index Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-13 11:40:51 -07:00
Brandon Williams	94e327e973	diff: rename diff_fill_sha1_info to diff_fill_oid_info Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-05 11:23:58 +09:00
Junio C Hamano	583c6a2295	Merge branch 'js/blame-lib' The internal logic used in "git blame" has been libified to make it easier to use by cgit. * js/blame-lib: (29 commits) blame: move entry prepend to libgit blame: move scoreboard setup to libgit blame: move scoreboard-related methods to libgit blame: move fake-commit-related methods to libgit blame: move origin-related methods to libgit blame: move core structures to header blame: create entry prepend function blame: create scoreboard setup function blame: create scoreboard init function blame: rework methods that determine 'final' commit blame: wrap blame_sort and compare_blame_final blame: move progress updates to a scoreboard callback blame: make sanity_check use a callback in scoreboard blame: move no_whole_file_rename flag to scoreboard blame: move xdl_opts flags to scoreboard blame: move show_root flag to scoreboard blame: move reverse flag to scoreboard blame: move contents_from to scoreboard blame: move copy/move thresholds to scoreboard blame: move stat counters to scoreboard ...	2017-06-05 09:18:12 +09:00
Junio C Hamano	53083f8547	Merge branch 'mb/diff-default-to-indent-heuristics' Make the "indent" heuristics the default in "diff" and diff.indentHeuristic configuration variable an escape hatch for those who do not want it. * mb/diff-default-to-indent-heuristics: add--interactive: drop diff.indentHeuristic handling diff: enable indent heuristic by default diff: have the diff-* builtins configure diff before initializing revisions diff: make the indent heuristic part of diff's basic configuration	2017-06-05 09:18:10 +09:00
Brandon Williams	bd25f28876	diff: convert diff_flush_patch_id to struct object_id Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Brandon Williams	74014152be	diff: finish conversion for prepare_temp_file to struct object_id Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Brandon Williams	fb4a1c0dc8	diff: convert reuse_worktree_file to struct object_id Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Brandon Williams	f9704c2d82	diff: convert fill_filespec to struct object_id Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Brandon Williams	94a0097a41	diff: convert diff_change to struct object_id Convert diff_change to take a struct object_id. In addition convert the function pointer type 'change_fn_t' to also take a struct object_id. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Brandon Williams	c26022ea8f	diff: convert diff_addremove to struct object_id Convert diff_addremove to take a struct object_id. In addition convert the function pointer type 'add_remove_fn_t' to also take a struct object_id. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Junio C Hamano	6b526ced6f	Merge branch 'bc/object-id' Conversion from uchar[20] to struct object_id continues. * bc/object-id: (53 commits) object: convert parse_object* to take struct object_id tree: convert parse_tree_indirect to struct object_id sequencer: convert do_recursive_merge to struct object_id diff-lib: convert do_diff_cache to struct object_id builtin/ls-tree: convert to struct object_id merge: convert checkout_fast_forward to struct object_id sequencer: convert fast_forward_to to struct object_id builtin/ls-files: convert overlay_tree_on_cache to object_id builtin/read-tree: convert to struct object_id sha1_name: convert internals of peel_onion to object_id upload-pack: convert remaining parse_object callers to object_id revision: convert remaining parse_object callers to object_id revision: rename add_pending_sha1 to add_pending_oid http-push: convert process_ls_object and descendants to object_id refs/files-backend: convert many internals to struct object_id refs: convert struct ref_update to use struct object_id ref-filter: convert some static functions to struct object_id Convert struct ref_array_item to struct object_id Convert the verify_pack callback to struct object_id Convert lookup_tag to struct object_id ...	2017-05-29 12:34:43 +09:00
Nguyễn Thái Ngọc Duy	23a9e0712d	use xfopen() in more places xfopen() - provides error details - explains error on reading, or writing, or whatever operation - has l10n support - prints file name in the error Some of these are missing in the places that are replaced with xfopen(), which is a clear win. In some other places, it's just less code (not as clearly a win as the previous case but still is). The only slight regression is in remote-testsvn, where we don't report the file class (marks files) in the error messages anymore. But since this is a _test_ svn remote transport, I'm not too concerned. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-26 12:33:55 +09:00
Jeff Smith	3a35cb2ea8	blame: move textconv_object with related functions textconv_object is used in places other than blame.c and should be moved to a more appropriate location. Other textconv related functions are located in diff.c so that seems as good a place as any. Signed-off-by: Jeff Smith <whydoubt@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-24 15:41:50 +09:00
Stefan Beller	33de716387	diff: enable indent heuristic by default The feature was included in v2.11 (released 2016-11-29) and we got no negative feedback. Quite the opposite, all feedback we got was positive. Turn it on by default. Users who dislike the feature can turn it off by setting diff.indentHeuristic (which also configures plumbing commands, see prior patches). The change to t/t4051-diff-function-context.sh is needed because the heuristic shifts the changed hunk in the patch. To get the same result regardless of the heuristic configuration, we modify the test file differently: We insert a completely new line after line 2, instead of simply duplicating it. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Marc Branchaud <marcnarc@xiplink.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-09 12:24:35 +09:00
Marc Branchaud	cf5e77223a	diff: make the indent heuristic part of diff's basic configuration This heuristic was originally introduced as an experimental feature, and therefore part of the UI configuration. But the user often sees diffs generated by plumbing commands like diff-tree. Moving the indent heuristic into diff's basic configuration prepares the way for diff plumbing commands to respect the setting. The heuristic itself merely makes the diffs more aesthetically pleasing, without changing their correctness. Scripts that rely on the diff plumbing commands should not care whether or not the heuristic is employed. Signed-off-by: Marc Branchaud <marcnarc@xiplink.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-09 12:24:34 +09:00
brian m. carlson	569aa376ea	notes-cache: convert to struct object_id Convert as many instances of unsigned char [20] as possible. Update the callers of notes_cache_get and notes_cache_put to use the new interface. Among the functions updated are callers of lookup_commit_reference_gently, which we will soon convert. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
René Genz	5621760f59	fix minor typos Helped-by: Stefan Beller <sbeller@google.com> Signed-off-by: René Genz <liebundartig@freenet.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-01 11:01:52 +09:00
Junio C Hamano	b1081e4004	Merge branch 'bc/object-id' Conversion from unsigned char [20] to struct object_id continues. * bc/object-id: Documentation: update and rename api-sha1-array.txt Rename sha1_array to oid_array Convert sha1_array_for_each_unique and for_each_abbrev to object_id Convert sha1_array_lookup to take struct object_id Convert remaining callers of sha1_array_lookup to object_id Make sha1_array_append take a struct object_id * sha1-array: convert internal storage for struct sha1_array to object_id builtin/pull: convert to struct object_id submodule: convert check_for_new_submodule_commits to object_id sha1_name: convert disambiguate_hint_fn to take object_id sha1_name: convert struct disambiguate_state to object_id test-sha1-array: convert most code to struct object_id parse-options-cb: convert sha1_array_append caller to struct object_id fsck: convert init_skiplist to struct object_id builtin/receive-pack: convert portions to struct object_id builtin/pull: convert portions to struct object_id builtin/diff: convert to struct object_id Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZ Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ Define new hash-size constants for allocating memory	2017-04-19 21:37:13 -07:00
Jeff King	977db6b4bf	diff: avoid fixed-size buffer for patch-ids To generate a patch id, we format the diff header into a fixed-size buffer, and then feed the result to our sha1 computation. The fixed buffer has size '4*PATH_MAX + 20', which in theory accommodates the four filenames plus some extra data. Except: 1. The filenames may not be constrained to PATH_MAX. The static value may not be a real limit on the current filesystem. Moreover, we may compute patch-ids for names stored only in git, without touching the current filesystem at all. 2. The 20 bytes is not nearly enough to cover the extra content we put in the buffer. As a result, the data we feed to the sha1 computation may be truncated, and it's possible that a commit with a very long filename could erroneously collide in the patch-id space with another commit. For instance, if one commit modified "really-long-filename/foo" and another modified "bar" in the same directory. In practice this is unlikely. Because the filenames are repeated, and because there's a single cutoff at the end of the buffer, the offending filename would have to be on the order of four times larger than PATH_MAX. We could fix this by moving to a strbuf. However, we can observe that the purpose of formatting this in the first place is to feed it to git_SHA1_Update(). So instead, let's just feed each part of the formatted string directly. This actually ends up more readable, and we can even factor out some duplicated bits from the various conditional branches. Technically this may change the output of patch-id for very long filenames, but it's not worth making an exception for this in the --stable output. It was a bug, and one that only affected an unlikely set of paths. And anyway, the exact value would have varied from platform to platform depending on the value of PATH_MAX, so there is no "stable" value. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-30 14:58:29 -07:00
brian m. carlson	dc01505f7f	Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ Since we will likely be introducing a new hash function at some point, and that hash function might be longer than 40 hex characters, use the constant GIT_MAX_HEXSZ, which is designed to be suitable for allocations, instead of GIT_SHA1_HEXSZ. This will ease the transition down the line by distinguishing between places where we need to allocate memory suitable for the largest hash from those where we need to handle the current hash. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-26 22:08:21 -07:00
Jeff King	e4da43b1f0	prefix_filename: return newly allocated string The prefix_filename() function returns a pointer to static storage, which makes it easy to use dangerously. We already fixed one buggy caller in hash-object recently, and the calls in apply.c are suspicious (I didn't dig in enough to confirm that there is a bug, but we call the function once in apply_all_patches() and then again indirectly from parse_chunk()). Let's make it harder to get wrong by allocating the return value. For simplicity, we'll do this even when the prefix is empty (and we could just return the original file pointer). That will cause us to allocate sometimes when we wouldn't otherwise need to, but this function isn't called in performance critical code-paths (and it already _might_ allocate on any given call, so a caller that cares about performance is questionable anyway). The downside is that the callers need to remember to free() the result to avoid leaking. Most of them already used xstrdup() on the result, so we know they are OK. The remainder have been converted to use free() as appropriate. I considered retaining a prefix_filename_unsafe() for cases where we know the static lifetime is OK (and handling the cleanup is awkward). This is only a handful of cases, though, and it's not worth the mental energy in worrying about whether the "unsafe" variant is OK to use in any situation. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-21 11:18:41 -07:00
Jeff King	116fb64e43	prefix_filename: drop length parameter This function takes the prefix as a ptr/len pair, but in every caller the length is exactly strlen(ptr). Let's simplify the interface and just take the string. This saves callers specifying it (and in some cases handling a NULL prefix). In a handful of cases we had the length already without calling strlen, so this is technically slower. But it's not likely to matter (after all, if the prefix is non-empty we'll allocate and copy it into a buffer anyway). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-21 11:12:53 -07:00
Junio C Hamano	60f335b87f	Merge branch 'jc/diff-populate-filespec-size-only-fix' "git diff --quiet" relies on the size field in diff_filespec to be correctly populated, but diff_populate_filespec() helper function made an incorrect short-cut when asked only to populate the size field for paths that need to go through convert_to_git() (e.g. CRLF conversion). * jc/diff-populate-filespec-size-only-fix: diff: do not short-cut CHECK_SIZE_ONLY check in diff_populate_filespec()	2017-03-12 23:21:36 -07:00
Junio C Hamano	12426e114b	diff: do not short-cut CHECK_SIZE_ONLY check in diff_populate_filespec() Callers of diff_populate_filespec() can choose to ask only for the size of the blob without grabbing the blob data, and the function, after running lstat() when the filespec points at a working tree file, returns by copying the value in size field of the stat structure into the size field of the filespec when this is the case. However, this short-cut cannot be taken if the contents from the path needs to go through convert_to_git(), whose resulting real blob data may be different from what is in the working tree file. As "git diff --quiet" compares the .size fields of filespec structures to skip content comparison, this bug manifests as a false "there are differences" for a file that needs eol conversion, for example. Reported-by: Mike Crowe <mac@mcrowe.com> Helped-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-02 10:48:06 -08:00
Junio C Hamano	cbf1860d73	Merge branch 'rs/swap' Code clean-up. * rs/swap: graph: use SWAP macro diff: use SWAP macro use SWAP macro apply: use SWAP macro add SWAP macro	2017-02-15 12:54:19 -08:00
Junio C Hamano	e53c7f8731	Merge branch 'jk/log-graph-name-only' "git log --graph" did not work well with "--name-only", even though other forms of "diff" output were handled correctly. * jk/log-graph-name-only: diff: print line prefix for --name-only output	2017-02-10 12:52:27 -08:00
Jeff King	f5022b5fed	diff: print line prefix for --name-only output If you run "git log --graph --name-only", the pathnames are not indented to go along with their matching commits (unlike all of the other diff formats). We need to output the line prefix for each item before writing it. The tests cover both --name-status and --name-only. The former actually gets this right already, because it builds on the --raw format functions. It's only --name-only which uses its own code (and this fix mirrors the code in diff_flush_raw()). Note that the tests don't follow our usual style of setting up the "expect" output inside the test block. This matches the surrounding style, but more importantly it is easier to read: we don't have to worry about embedded single-quotes, and the leading indentation is more obvious. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-02-08 13:39:57 -08:00
René Scharfe	2490574d15	use oid_to_hex_r() for converting struct object_id hashes to hex strings Patch generated by Coccinelle and contrib/coccinelle/object_id.cocci. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-01-30 14:23:40 -08:00
René Scharfe	402bf8e198	diff: use SWAP macro Use the macro SWAP to exchange the value of pairs of variables instead of swapping them manually with the help of a temporary variable. The resulting code is shorter and easier to read. The two cases were not transformed by the semantic patch swap.cocci because it's extra careful and handles only cases where the types of all variables are the same -- and here we swap two ints and use an unsigned temporary variable for that. Nevertheless the conversion is safe, as the value range is preserved with and without the patch. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-01-30 14:23:00 -08:00
René Scharfe	35d803bc9a	use SWAP macro Apply the semantic patch swap.cocci to convert hand-rolled swaps to use the macro SWAP. The resulting code is shorter and easier to read, the object code is effectively unchanged. The patch for object.c had to be hand-edited in order to preserve the comment before the change; Coccinelle tried to eat it for some reason. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-01-30 14:17:00 -08:00
Vegard Nossum	c488867793	diff: add interhunk context config option The --inter-hunk-context= option was added in commit `6d0e674a57` ("diff: add option to show context between close hunks"). This patch allows configuring a default for this option. Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-01-12 12:55:43 -08:00
Junio C Hamano	2ced5f2c2d	Merge branch 'jc/retire-compaction-heuristics' "git diff" and its family had two experimental heuristics to shift the contents of a hunk to make the patch easier to read. One of them turns out to be better than the other, so leave only the "--indent-heuristic" option and remove the other one. * jc/retire-compaction-heuristics: diff: retire "compaction" heuristics	2017-01-10 15:24:27 -08:00
Junio C Hamano	3cde4e02ee	diff: retire "compaction" heuristics When a patch inserts a block of lines, whose last lines are the same as the existing lines that appear before the inserted block, "git diff" can choose any place between these existing lines as the boundary between the pre-context and the added lines (adjusting the end of the inserted block as appropriate) to come up with variants of the same patch, and some variants are easier to read than others. We have been trying to improve the choice of this boundary, and Git 2.11 shipped with an experimental "compaction-heuristic". Since then another attempt to improve the logic further resulted in a new "indent-heuristic" logic. It is agreed that the latter gives better result overall, and the former outlived its usefulness. Retire "compaction", and keep "indent" as an experimental feature. The latter hopefully will be turned on by default in a future release, but that should be done as a separate step. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-12-23 12:32:22 -08:00
Jack Bates	43d1948b7b	diff: handle --no-abbrev in no-index case There are two different places where the --no-abbrev option is parsed, and two different places where SHA-1s are abbreviated. We normally parse --no-abbrev with setup_revisions(), but in the no-index case, "git diff" calls diff_opt_parse() directly, and diff_opt_parse() didn't handle --no-abbrev until now. (It did handle --abbrev, however.) We normally abbreviate SHA-1s with find_unique_abbrev(), but commit `4f03666` ("diff: handle sha1 abbreviations outside of repository", 2016-10-20) recently introduced a special case when you run "git diff" outside of a repository. setup_revisions() does also call diff_opt_parse(), but not for --abbrev or --no-abbrev, which it handles itself. setup_revisions() sets rev_info->abbrev, and later copies that to diff_options->abbrev. It handles --no-abbrev by setting abbrev to zero. (This change doesn't touch that.) Setting abbrev to zero was broken in the outside-of-a-repository special case, which until now resulted in a truly zero-length SHA-1, rather than taking zero to mean do not abbreviate. The only way to trigger this bug, however, was by running "git diff --raw" without either the --abbrev or --no-abbrev options, because 1) without --raw it doesn't respect abbrev (which is bizarre, but has been that way forever), 2) we silently clamp --abbrev=0 to MINIMUM_ABBREV, and 3) --no-abbrev wasn't handled until now. The outside-of-a-repository case is one of three no-index cases. The other two are when one of the files you're comparing is outside of the repository you're in, and the --no-index option. Signed-off-by: Jack Bates <jack@nottheoilrig.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-12-08 14:40:30 -08:00
Junio C Hamano	0a79ccaac7	Merge branch 'tk/diffcore-delta-remove-unused' into maint Code cleanup. * tk/diffcore-delta-remove-unused: diffcore-delta: remove unused parameter to diffcore_count_changes()	2016-11-29 13:28:03 -08:00
Junio C Hamano	6d40812e4b	Merge branch 'tk/diffcore-delta-remove-unused' Code cleanup. * tk/diffcore-delta-remove-unused: diffcore-delta: remove unused parameter to diffcore_count_changes()	2016-11-17 13:45:22 -08:00
Tobias Klauser	974e0044d6	diffcore-delta: remove unused parameter to diffcore_count_changes() The delta_limit parameter to diffcore_count_changes() has been unused since commit `ba23bbc8e` ("diffcore-delta: make change counter to byte oriented again.", 2006-03-04). Remove the parameter and adjust all callers. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-11-14 09:24:04 -08:00
Junio C Hamano	c8fd220175	Merge branch 'rs/cocci' into maint Code cleanup. * rs/cocci: use strbuf_add_unique_abbrev() for adding short hashes, part 3 remove unnecessary NULL check before free(3) coccicheck: make transformation for strbuf_addf(sb, "...") more precise use strbuf_add_unique_abbrev() for adding short hashes, part 2 use strbuf_addstr() instead of strbuf_addf() with "%s", part 2 gitignore: ignore output files of coccicheck make target use strbuf_addstr() for adding constant strings to a strbuf, part 2 add coccicheck make target contrib/coccinelle: fix semantic patch for oid_to_hex_r()	2016-10-28 09:01:23 -07:00
Junio C Hamano	650360210a	Merge branch 'nd/ita-empty-commit' When new paths were added by "git add -N" to the index, it was enough to circumvent the check by "git commit" to refrain from making an empty commit without "--allow-empty". The same logic prevented "git status" from showing such a path as "new file" in the "Changes not staged for commit" section. * nd/ita-empty-commit: commit: don't be fooled by ita entries when creating initial commit commit: fix empty commit creation when there's no changes but ita entries diff: add --ita-[in]visible-in-index diff-lib: allow ita entries treated as "not yet exist in index"	2016-10-27 14:58:50 -07:00
Junio C Hamano	0d9c527d59	Merge branch 'jk/no-looking-at-dotgit-outside-repo' Update "git diff --no-index" codepath not to try to peek into .git/ directory that happens to be under the current directory, when we know we are operating outside any repository. * jk/no-looking-at-dotgit-outside-repo: diff: handle sha1 abbreviations outside of repository diff_aligned_abbrev: use "struct oid" diff_unique_abbrev: rename to diff_aligned_abbrev find_unique_abbrev: use 4-buffer ring test-*-cache-tree: setup git dir read info/{attributes,exclude} only when in repository	2016-10-27 14:58:48 -07:00
Junio C Hamano	580d820ece	Merge branch 'lt/abbrev-auto' Allow the default abbreviation length, which has historically been 7, to scale as the repository grows. The logic suggests using 12 hexdigits for the Linux kernel, and 9 to 10 for Git itself. * lt/abbrev-auto: abbrev: auto size the default abbreviation abbrev: prepare for new world order abbrev: add FALLBACK_DEFAULT_ABBREV to prepare for auto sizing	2016-10-27 14:58:47 -07:00
Jeff King	4f03666ac6	diff: handle sha1 abbreviations outside of repository When generating diffs outside a repository (e.g., with "diff --no-index"), we may write abbreviated sha1s as part of "--raw" output or the "index" lines of "--patch" output. Since we have no object database, we never find any collisions, and these sha1s get whatever static abbreviation length is configured (typically 7). However, we do blindly look in ".git/objects" to see if any objects exist, even though we know we are not in a repository. This is usually harmless because such a directory is unlikely to exist, but could be wrong in rare circumstances. Let's instead notice when we are not in a repository and behave as if the object database is empty (i.e., just use the default abbrev length). It would perhaps make sense to be conservative and show full sha1s in that case, but showing the default abbreviation is what we've always done (and is certainly less ugly). Note that this does mean that: cd /not/a/repo GIT_OBJECT_DIRECTORY=/some/real/objdir git diff --no-index ... used to look for collisions in /some/real/objdir but now does not. This could be considered either a bugfix (we do not look at objects if we have no repository) or a regression, but it seems unlikely that anybody would care much either way. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-26 13:30:51 -07:00
Jeff King	d6cece51b8	diff_aligned_abbrev: use "struct oid" Since we're modifying this function anyway, it's a good time to update it to the more modern "struct oid". We can also drop some of the magic numbers in favor of GIT_SHA1_HEXSZ, along with some descriptive comments. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-26 13:30:51 -07:00
Jeff King	d5e3b01e5b	diff_unique_abbrev: rename to diff_aligned_abbrev The word "align" describes how the function actually differs from find_unique_abbrev, and will make it less confusing when we add more diff-specific abbreviation functions that do not do this alignment. Since this is a globally available function, let's also move its descriptive comment to the header file, where we typically document function interfaces. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-26 13:30:51 -07:00
Junio C Hamano	a5ed26702b	Merge branch 'va/i18n' More i18n. * va/i18n: i18n: diff: mark warnings for translation i18n: credential-cache--daemon: mark advice for translation i18n: convert mark error messages for translation i18n: apply: mark error message for translation i18n: apply: mark error messages for translation i18n: apply: mark info messages for translation i18n: apply: mark plural string for translation	2016-10-26 13:14:47 -07:00
Junio C Hamano	e5272d304a	Merge branch 'jc/ws-error-highlight' "git diff/log --ws-error-highlight=<kind>" lacked the corresponding configuration variable to set it by default. * jc/ws-error-highlight: diff: introduce diff.wsErrorHighlight option diff.c: move ws-error-highlight parsing helpers up diff.c: refactor parse_ws_error_highlight() t4015: split out the "setup" part of ws-error-highlight test	2016-10-26 13:14:43 -07:00
Junio C Hamano	c334effa23	Merge branch 'jc/diff-unique-abbrev-comments' A few more comments in tricky code. * jc/diff-unique-abbrev-comments: diff_unique_abbrev(): document its assumption and limitation	2016-10-26 13:14:42 -07:00
Nguyễn Thái Ngọc Duy	b42b451919	diff: add --ita-[in]visible-in-index The option --ita-invisible-in-index exposes the "ita_invisible_in_index" diff flag to outside to allow easier experimentation with this new mode. The "plan" is to make --ita-invisible-in-index default to keep consistent behavior with 'status' and 'commit', but a bunch of other commands like 'apply', 'merge', 'reset'.... need to be taken into consideration as well. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-24 10:47:51 -07:00
Vasco Almeida	db424979a8	i18n: diff: mark warnings for translation Mark rename_limit_warning and degrade_cc_to_c_warning for translation. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-17 14:51:48 -07:00
Junio C Hamano	b8688adb12	Merge branch 'rs/qsort' We call "qsort(array, nelem, sizeof(array[0]), fn)", and most of the time the third parameter is redundant. A new QSORT() macro lets us omit it. * rs/qsort: show-branch: use QSORT use QSORT, part 2 coccicheck: use --all-includes by default remove unnecessary check before QSORT use QSORT add QSORT	2016-10-10 14:03:46 -07:00
Junio C Hamano	f0798e6cdb	Merge branch 'rs/cocci' Code clean-up with help from coccinelle tool continues. * rs/cocci: coccicheck: make transformation for strbuf_addf(sb, "...") more precise use strbuf_add_unique_abbrev() for adding short hashes, part 2 use strbuf_addstr() instead of strbuf_addf() with "%s", part 2 gitignore: ignore output files of coccicheck make target	2016-10-06 14:53:12 -07:00
Junio C Hamano	a17505f262	diff: introduce diff.wsErrorHighlight option With the preparatory steps, it has become trivial to teach the system a new diff.wsErrorHighlight configuration that gives the default value for --ws-error-highlight command line option. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-04 15:49:05 -07:00
Junio C Hamano	0b4b42e7fe	diff.c: move ws-error-highlight parsing helpers up These need to be usable from git_diff_ui_config() code to help parsing a configuration variable, so move them up. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-04 15:49:05 -07:00
Junio C Hamano	077965f84a	diff.c: refactor parse_ws_error_highlight() Rename the function to parse_ws_error_highlight_opt(), because it is meant to parse a command line option, and then refactor the meat of the function into a helper function that reports the parsed result which is typically a small unsigned int (these are OR'ed bitmask after all), or a negative offset that indicates where in the input string a parse error happened. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-04 15:49:05 -07:00
Junio C Hamano	7b5b7721af	abbrev: prepare for new world order The code that sets custom abbreviation length, in response to command line argument, often does something like this: if (skip_prefix(arg, "--abbrev=", &arg)) abbrev = atoi(arg); else if (!strcmp("--abbrev", arg)) abbrev = DEFAULT_ABBREV; /* make the value sane */ if (abbrev < 0 \|\| 40 < abbrev) abbrev = ... some sane value ... However, it is pointless to sanity-check and tweak the value obtained from DEFAULT_ABBREV. We are going to allow it to be initially set to -1 to signal that the default abbreviation length must be auto sized upon the first request to abbreviate, based on the number of objects in the repository, and when that happens, rejecting or tweaking a negative value to a "saner" one will negatively interfere with the auto sizing. The codepaths for git rev-parse --short <object> git diff --raw --abbrev do exactly that; allow them to pass possibly negative abbrevs intact, that will come from DEFAULT_ABBREV in the future. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-03 12:54:22 -07:00
Junio C Hamano	d709f1fb9d	diff_unique_abbrev(): document its assumption and limitation This function is used to add "..." to displayed object names in "diff --raw --abbrev[=<n>]" output. It bases its behaviour on an untold assumption that the abbreviation length requested by the caller is "reasonable", i.e. most of the objects will abbreviate within the requested length and the resulting length would never exceed it by more than a few hexdigits (otherwise the resulting columns would not align). Explain that in a comment. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-30 18:06:50 -07:00
Junio C Hamano	300e95f7df	Merge branch 'js/regexec-buf' into maint Some codepaths in "git diff" used regexec(3) on a buffer that was mmap(2)ed, which may not have a terminating NUL, leading to a read beyond the end of the mapped region. This was fixed by introducing a regexec_buf() helper that takes a <ptr,len> pair with REG_STARTEND extension. * js/regexec-buf: regex: use regexec_buf() regex: add regexec_buf() that can work on a non NUL-terminated string regex: -G<pattern> feeds a non NUL-terminated string to regexec() and fails	2016-09-29 16:49:45 -07:00
René Scharfe	9ed0d8d6e6	use QSORT Apply the semantic patch contrib/coccinelle/qsort.cocci to the code base, replacing calls of qsort(3) with QSORT. The resulting code is shorter and supports empty arrays with NULL pointers. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-29 15:42:18 -07:00
René Scharfe	f937d78553	use strbuf_add_unique_abbrev() for adding short hashes, part 2 Call strbuf_add_unique_abbrev() to add abbreviated hashes to strbufs instead of taking detours through find_unique_abbrev() and its static buffer. This is shorter and a bit more efficient. `1eb47f167d` already converted six cases, this patch covers three more. A semantic patch for Coccinelle is included for easier checking for new cases that might be introduced in the future. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-27 14:02:40 -07:00
Junio C Hamano	6a67695268	Merge branch 'js/regexec-buf' Some codepaths in "git diff" used regexec(3) on a buffer that was mmap(2)ed, which may not have a terminating NUL, leading to a read beyond the end of the mapped region. This was fixed by introducing a regexec_buf() helper that takes a <ptr,len> pair with REG_STARTEND extension. * js/regexec-buf: regex: use regexec_buf() regex: add regexec_buf() that can work on a non NUL-terminated string regex: -G<pattern> feeds a non NUL-terminated string to regexec() and fails	2016-09-26 16:09:19 -07:00
Junio C Hamano	8969feac7e	Merge branch 'va/i18n-more' Even more i18n. * va/i18n-more: i18n: stash: mark messages for translation i18n: notes-merge: mark die messages for translation i18n: ident: mark hint for translation i18n: i18n: diff: mark die messages for translation i18n: connect: mark die messages for translation i18n: commit: mark message for translation	2016-09-26 16:09:18 -07:00
Junio C Hamano	b7af6ae5cf	Merge branch 'mh/diff-indent-heuristic' Output from "git diff" can be made easier to read by selecting which lines are common and which lines are added/deleted intelligently when the lines before and after the changed section are the same. A command line option is added to help with the experiment to find a good heuristic. * mh/diff-indent-heuristic: blame: honor the diff heuristic options and config parse-options: add parse_opt_unknown_cb() diff: improve positioning of add/delete blocks in diffs xdl_change_compact(): introduce the concept of a change group recs_match(): take two xrecord_t pointers as arguments is_blank_line(): take a single xrecord_t as argument xdl_change_compact(): only use heuristic if group can't be matched xdl_change_compact(): fix compaction heuristic to adjust ixo	2016-09-26 16:09:16 -07:00
Johannes Schindelin	b7d36ffca0	regex: use regexec_buf() The new regexec_buf() function operates on buffers with an explicitly specified length, rather than NUL-terminated strings. We need to use this function whenever the buffer we want to pass to regexec(3) may have been mmap(2)ed (and is hence not NUL-terminated). Note: the original motivation for this patch was to fix a bug where `git diff -G <regex>` would crash. This patch converts more callers, though, some of which allocated memory to construct NUL-terminated strings, or worse, modified buffers to temporarily insert NULs while calling regexec(3). By converting them to use regexec_buf(), the code has become much cleaner. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-21 13:56:15 -07:00
Jean-Noël AVILA	a2f05c9454	i18n: i18n: diff: mark die messages for translation While marking individual messages for translation, consolidate some messages "option 'foo' requires a value" that is used for many options into one by introducing a helper function to die with the message with the option name embedded in it, and ask the translators to localize that single message instead. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Jean-Noel Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-21 10:18:33 -07:00
Junio C Hamano	4af9a7d344	Merge branch 'bc/object-id' The "unsigned char sha1[20]" to "struct object_id" conversion continues. Notable changes in this round include that ce->sha1, i.e. the object name recorded in the cache_entry, turns into an object_id. It had merge conflicts with a few topics in flight (Christian's "apply.c split", Dscho's "cat-file --filters" and Jeff Hostetler's "status --porcelain-v2"). Extra sets of eyes double-checking for mismerges are highly appreciated. * bc/object-id: builtin/reset: convert to use struct object_id builtin/commit-tree: convert to struct object_id builtin/am: convert to struct object_id refs: add an update_ref_oid function. sha1_name: convert get_sha1_mb to struct object_id builtin/update-index: convert file to struct object_id notes: convert init_notes to use struct object_id builtin/rm: convert to use struct object_id builtin/blame: convert file to use struct object_id Convert read_mmblob to take struct object_id. notes-merge: convert struct notes_merge_pair to struct object_id builtin/checkout: convert some static functions to struct object_id streaming: make stream_blob_to_fd take struct object_id builtin: convert textconv_object to use struct object_id builtin/cat-file: convert some static functions to struct object_id builtin/cat-file: convert struct expand_data to use struct object_id builtin/log: convert some static functions to use struct object_id builtin/blame: convert struct origin to use struct object_id builtin/apply: convert static functions to struct object_id cache: convert struct cache_entry to use struct object_id	2016-09-19 13:47:19 -07:00
Michael Haggerty	5b162879e9	blame: honor the diff heuristic options and config Teach "git blame" and "git annotate" the --compaction-heuristic and --indent-heuristic options that are now supported by "git diff". Also teach them to honor the `diff.compactionHeuristic` and `diff.indentHeuristic` configuration options. It would be conceivable to introduce separate configuration options for "blame" and "annotate"; for example `blame.compactionHeuristic` and `blame.indentHeuristic`. But it would be confusing to users if blame output is inconsistent with diff output, so it makes more sense for them to respect the same configuration. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-19 10:25:11 -07:00
Michael Haggerty	433860f3d0	diff: improve positioning of add/delete blocks in diffs Some groups of added/deleted lines in diffs can be slid up or down, because lines at the edges of the group are not unique. Picking good shifts for such groups is not a matter of correctness but definitely has a big effect on aesthetics. For example, consider the following two diffs. The first is what standard Git emits: --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl @@ -231,6 +231,9 @@ if (!defined $initial_reply_to && $prompting) { } if (!$smtp_server) { + $smtp_server = $repo->config('sendemail.smtpserver'); +} +if (!$smtp_server) { foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) { if (-x $_) { $smtp_server = $_; The following diff is equivalent, but is obviously preferable from an aesthetic point of view: --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl @@ -230,6 +230,9 @@ if (!defined $initial_reply_to && $prompting) { $initial_reply_to =~ s/(^\s+\|\s+$)//g; } +if (!$smtp_server) { + $smtp_server = $repo->config('sendemail.smtpserver'); +} if (!$smtp_server) { foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) { if (-x $_) { This patch teaches Git to pick better positions for such "diff sliders" using heuristics that take the positions of nearby blank lines and the indentation of nearby lines into account. The existing Git code basically always shifts such "sliders" as far down in the file as possible. The only exception is when the slider can be aligned with a group of changed lines in the other file, in which case Git favors depicting the change as one add+delete block rather than one add and a slightly offset delete block. This naive algorithm often yields ugly diffs. 
Commit `d634d61ed6` improved the situation somewhat by preferring to position add/delete groups to make their last line a blank line, when that is possible. This heuristic does more good than harm, but (1) it can only help if there are blank lines in the right places, and (2) always picks the last blank line, even if there are others that might be better. The end result is that it makes perhaps 1/3 as many errors as the default Git algorithm, but that still leaves a lot of ugly diffs. This commit implements a new and much better heuristic for picking optimal "slider" positions using the following approach: First observe that each hypothetical positioning of a diff slider introduces two splits: one between the context lines preceding the group and the first added/deleted line, and the other between the last added/deleted line and the first line of context following it. It tries to find the positioning that creates the least bad splits. Splits are evaluated based only on the presence and locations of nearby blank lines, and the indentation of lines near the split. Basically, it prefers to introduce splits adjacent to blank lines, between lines that are indented less, and between lines with the same level of indentation. In more detail: 1. It measures the following characteristics of a proposed splitting position in a `struct split_measurement`: * the number of blank lines above the proposed split * whether the line directly after the split is blank * the number of blank lines following that line * the indentation of the nearest non-blank line above the split * the indentation of the line directly below the split * the indentation of the nearest non-blank line after that line 2. It combines the measured attributes using a bunch of empirically-optimized weighting factors to derive a `struct split_score` that measures the "badness" of splitting the text at that position. 3. 
It combines the `split_score` for the top and the bottom of the slider at each of its possible positions, and selects the position that has the best `split_score`. I determined the initial set of weighting factors by collecting a corpus of Git histories from 29 open-source software projects in various programming languages. I generated many diffs from this corpus, and determined the best positioning "by eye" for about 6600 diff sliders. I used about half of the repositories in the corpus (corresponding to about 2/3 of the sliders) as a training set, and optimized the weights against this corpus using a crude automated search of the parameter space to get the best agreement with the manually-determined values. Then I tested the resulting heuristic against the full corpus. The results are summarized in the following table, in column `indent-1`: \| repository \| count \| Git 2.9.0 \| compaction \| compaction-fixed \| indent-1 \| indent-2 \| \| --------------------- \| ----- \| -------------- \| -------------- \| ---------------- \| -------------- \| -------------- \| \| afnetworking \| 109 \| 89 (81.7%) \| 37 (33.9%) \| 37 (33.9%) \| 2 (1.8%) \| 2 (1.8%) \| \| alamofire \| 30 \| 18 (60.0%) \| 14 (46.7%) \| 15 (50.0%) \| 0 (0.0%) \| 0 (0.0%) \| \| angular \| 184 \| 127 (69.0%) \| 39 (21.2%) \| 23 (12.5%) \| 5 (2.7%) \| 5 (2.7%) \| \| animate \| 313 \| 2 (0.6%) \| 2 (0.6%) \| 2 (0.6%) \| 2 (0.6%) \| 2 (0.6%) \| \| ant \| 380 \| 356 (93.7%) \| 152 (40.0%) \| 148 (38.9%) \| 15 (3.9%) \| 15 (3.9%) \| * \| bugzilla \| 306 \| 263 (85.9%) \| 109 (35.6%) \| 99 (32.4%) \| 14 (4.6%) \| 15 (4.9%) \| * \| corefx \| 126 \| 91 (72.2%) \| 22 (17.5%) \| 21 (16.7%) \| 6 (4.8%) \| 6 (4.8%) \| \| couchdb \| 78 \| 44 (56.4%) \| 26 (33.3%) \| 28 (35.9%) \| 6 (7.7%) \| 6 (7.7%) \| * \| cpython \| 937 \| 158 (16.9%) \| 50 (5.3%) \| 49 (5.2%) \| 5 (0.5%) \| 5 (0.5%) \| * \| discourse \| 160 \| 95 (59.4%) \| 42 (26.2%) \| 36 (22.5%) \| 18 (11.2%) \| 13 (8.1%) \| \| docker \| 307 \| 194 (63.2%) 
\| 198 (64.5%) \| 253 (82.4%) \| 8 (2.6%) \| 8 (2.6%) \| * \| electron \| 163 \| 132 (81.0%) \| 38 (23.3%) \| 39 (23.9%) \| 6 (3.7%) \| 6 (3.7%) \| \| git \| 536 \| 470 (87.7%) \| 73 (13.6%) \| 78 (14.6%) \| 16 (3.0%) \| 16 (3.0%) \| * \| gitflow \| 127 \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| \| ionic \| 133 \| 89 (66.9%) \| 29 (21.8%) \| 38 (28.6%) \| 1 (0.8%) \| 1 (0.8%) \| \| ipython \| 482 \| 362 (75.1%) \| 167 (34.6%) \| 169 (35.1%) \| 11 (2.3%) \| 11 (2.3%) \| * \| junit \| 161 \| 147 (91.3%) \| 67 (41.6%) \| 66 (41.0%) \| 1 (0.6%) \| 1 (0.6%) \| * \| lighttable \| 15 \| 5 (33.3%) \| 0 (0.0%) \| 2 (13.3%) \| 0 (0.0%) \| 0 (0.0%) \| \| magit \| 88 \| 75 (85.2%) \| 11 (12.5%) \| 9 (10.2%) \| 1 (1.1%) \| 0 (0.0%) \| \| neural-style \| 28 \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| \| nodejs \| 781 \| 649 (83.1%) \| 118 (15.1%) \| 111 (14.2%) \| 4 (0.5%) \| 5 (0.6%) \| * \| phpmyadmin \| 491 \| 481 (98.0%) \| 75 (15.3%) \| 48 (9.8%) \| 2 (0.4%) \| 2 (0.4%) \| * \| react-native \| 168 \| 130 (77.4%) \| 79 (47.0%) \| 81 (48.2%) \| 0 (0.0%) \| 0 (0.0%) \| \| rust \| 171 \| 128 (74.9%) \| 30 (17.5%) \| 27 (15.8%) \| 16 (9.4%) \| 14 (8.2%) \| \| spark \| 186 \| 149 (80.1%) \| 52 (28.0%) \| 52 (28.0%) \| 2 (1.1%) \| 2 (1.1%) \| \| tensorflow \| 115 \| 66 (57.4%) \| 48 (41.7%) \| 48 (41.7%) \| 5 (4.3%) \| 5 (4.3%) \| \| test-more \| 19 \| 15 (78.9%) \| 2 (10.5%) \| 2 (10.5%) \| 1 (5.3%) \| 1 (5.3%) \| * \| test-unit \| 51 \| 34 (66.7%) \| 14 (27.5%) \| 8 (15.7%) \| 2 (3.9%) \| 2 (3.9%) \| * \| xmonad \| 23 \| 22 (95.7%) \| 2 (8.7%) \| 2 (8.7%) \| 1 (4.3%) \| 1 (4.3%) \| * \| --------------------- \| ----- \| -------------- \| -------------- \| ---------------- \| -------------- \| -------------- \| \| totals \| 6668 \| 4391 (65.9%) \| 1496 (22.4%) \| 1491 (22.4%) \| 150 (2.2%) \| 144 (2.2%) \| \| totals (training set) \| 4552 \| 3195 (70.2%) \| 1053 (23.1%) \| 1061 (23.3%) \| 86 (1.9%) \| 88 (1.9%) \| \| totals (test set) \| 
2116 \| 1196 (56.5%) \| 443 (20.9%) \| 430 (20.3%) \| 64 (3.0%) \| 56 (2.6%) \| In this table, the numbers are the count and percentage of human-rated sliders that the corresponding algorithm got wrong. The columns are * "repository" - the name of the repository used. I used the diffs between successive non-merge commits on the HEAD branch of the corresponding repository. * "count" - the number of sliders that were human-rated. I chose most, but not all, sliders to rate from those among which the various algorithms gave different answers. * "Git 2.9.0" - the default algorithm used by `git diff` in Git 2.9.0. * "compaction" - the heuristic used by `git diff --compaction-heuristic` in Git 2.9.0. * "compaction-fixed" - the heuristic used by `git diff --compaction-heuristic` after the fixes from earlier in this patch series. Note that the results are not dramatically different than those for "compaction". Both produce non-ideal diffs only about 1/3 as often as the default `git diff`. * "indent-1" - the new `--indent-heuristic` algorithm, using the first set of weighting factors, determined as described above. * "indent-2" - the new `--indent-heuristic` algorithm, using the final set of weighting factors, determined as described below. * `*` - indicates that repo was part of training set used to determine the first set of weighting factors. The fact that the heuristic performed nearly as well on the test set as on the training set in column "indent-1" is a good indication that the heuristic was not over-trained. Given that fact, I ran a second round of optimization, using the entire corpus as the training set. The resulting set of weights gave the results in column "indent-2". These are the weights included in this patch. The final result gives consistently and significantly better results across the whole corpus than either `git diff` or `git diff --compaction-heuristic`. It makes only about 1/30 as many errors as the former and about 1/10 as many errors as the latter. 
(And a good fraction of the remaining errors are for diffs that involve weirdly-formatted code, sometimes apparently machine-generated.) The tools that were used to do this optimization and analysis, along with the human-generated data values, are recorded in a separate project [1]. This patch adds a new command-line option `--indent-heuristic`, and a new configuration setting `diff.indentHeuristic`, that activate this heuristic. This interface is only meant for testing purposes, and should be finalized before including this change in any release. [1] https://github.com/mhagger/diff-slider-tools Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-19 10:25:11 -07:00
Junio C Hamano	a0d9b7f015	Merge branch 'sb/diff-cleanup' Code cleanup. * sb/diff-cleanup: diff: remove dead code diff: omit found pointer from emit_callback diff.c: use diff_options directly	2016-09-15 14:11:16 -07:00
Junio C Hamano	305d7f1339	Merge branch 'jk/diff-submodule-diff-inline' The "git diff --submodule={short,log}" mechanism has been enhanced to allow "--submodule=diff" to show the patch between the submodule commits bound to the superproject. * jk/diff-submodule-diff-inline: diff: teach diff to display submodule difference with an inline diff submodule: refactor show_submodule_summary with helper function submodule: convert show_submodule_summary to use struct object_id * allow do_submodule_path to work even if submodule isn't checked out diff: prepare for additional submodule formats graph: add support for --line-prefix on all graph-aware output diff.c: remove output_prefix_length field cache: add empty_tree_oid object and helper function	2016-09-12 15:34:31 -07:00
Stefan Beller	ca9b37e5a8	diff: remove dead code When `len < 1`, len has to be 0 or negative, emit_line will then remove the first character and by then `len` would be negative. As this doesn't happen, it is safe to assume it is dead code. This continues to simplify the code, which was started in `b8d9c1a66b` (2009-09-03, diff.c: the builtin_diff() deals with only two-file comparison). Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-08 13:54:37 -07:00
Stefan Beller	ba16233ccd	diff: omit found pointer from emit_callback We keep the actual data in the diff options, which are just as accessible. Remove the pointer stored in struct emit_callback for readability. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-08 13:54:23 -07:00
Stefan Beller	fb33b62ca6	diff.c: use diff_options directly The value of `ecbdata->opt` is accessible via the short variable `o` already, so let's use that instead. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-08 13:46:46 -07:00
brian m. carlson	99d1a9861a	cache: convert struct cache_entry to use struct object_id Convert struct cache_entry to use struct object_id by applying the following semantic patch and the object_id transforms from contrib, plus the actual change to the struct: @@ struct cache_entry E1; @@ - E1.sha1 + E1.oid.hash @@ struct cache_entry *E1; @@ - E1->sha1 + E1->oid.hash Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-07 12:59:42 -07:00
Jacob Keller	fd47ae6a5b	diff: teach diff to display submodule difference with an inline diff Teach git-diff and friends a new format for displaying the difference of a submodule. The new format is an inline diff of the contents of the submodule between the commit range of the update. This allows the user to see the actual code change caused by a submodule update. Add tests for the new format and option. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-08-31 18:07:10 -07:00
Jacob Keller	602a283afb	submodule: convert show_submodule_summary to use struct object_id * Since we're going to be changing this function in a future patch, let's go ahead and convert this to use object_id now. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-08-31 18:07:10 -07:00
Jacob Keller	61cfbc054d	diff: prepare for additional submodule formats A future patch will add a new format for displaying the difference of a submodule. Make it easier by changing how we store the current selected format. Replace the DIFF_OPT flag with an enumeration, as each format will be mutually exclusive. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-08-31 18:07:09 -07:00
Jacob Keller	660e113ce1	graph: add support for --line-prefix on all graph-aware output Add an extension to git-diff and git-log (and any other graph-aware displayable output) such that "--line-prefix=<string>" will print the additional line-prefix on every line of output. To make this work, we have to fix a few bugs in the graph API that force graph_show_commit_msg to be used only when you have a valid graph. Additionally, we extend the default_diff_output_prefix handler to work even when no graph is enabled. This is somewhat of a hack on top of the graph API, but I think it should be acceptable here. This will be used by a future extension of submodule display which displays the submodule diff as the actual diff between the pre and post commit in the submodule project. Add some tests for both git-log and git-diff to ensure that the prefix is honored correctly. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-08-31 18:07:09 -07:00
Junio C Hamano	cd48dadb8d	diff.c: remove output_prefix_length field "diff/log --stat" has a logic that determines the display columns available for the diffstat part of the output and apportions it for pathnames and diffstat graph automatically. `5e71a84a` (Add output_prefix_length to diff_options, 2012-04-16) added the output_prefix_length field to diff_options structure to allow this logic to subtract the display columns used for the history graph part from the total "terminal width"; this matters when the "git log --graph -p" option is in use. The field must be set to the number of display columns needed to show the output from the output_prefix() callback, which is error prone. As there is only one user of the field, and the user has the actual value of the prefix string, let's get rid of the field and have the user count the display width itself. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-08-31 18:07:08 -07:00
Junio C Hamano	dd610aeda6	Merge branch 'kw/patch-ids-optim' When "git rebase" tries to compare set of changes on the updated upstream and our own branch, it computes patch-id for all of these changes and attempts to find matches. This has been optimized by lazily computing the full patch-id (which is expensive) to be compared only for changes that touch the same set of paths. * kw/patch-ids-optim: rebase: avoid computing unnecessary patch IDs patch-ids: add flag to create the diff patch id using header only data patch-ids: replace the seen indicator with a commit pointer patch-ids: stop using a hand-rolled hashmap implementation	2016-08-12 09:47:39 -07:00
Junio C Hamano	cee6c5b47b	Merge branch 'jk/diff-do-not-reuse-wtf-needs-cleaning' into maint There is an optimization used in "git diff $treeA $treeB" to borrow an already checked-out copy in the working tree when it is known to be the same as the blob being compared, expecting that open/mmap of such a file is faster than reading it from the object store, which involves inflating and applying delta. This however kicked in even when the checked-out copy needs to go through the convert-to-git conversion (including the clean filter), which defeats the whole point of the optimization. The optimization has been disabled when the conversion is necessary. * jk/diff-do-not-reuse-wtf-needs-cleaning: diff: do not reuse worktree files that need "clean" conversion	2016-08-10 11:55:28 -07:00
Junio C Hamano	767da54bf8	Merge branch 'jk/diff-do-not-reuse-wtf-needs-cleaning' There is an optimization used in "git diff $treeA $treeB" to borrow an already checked-out copy in the working tree when it is known to be the same as the blob being compared, expecting that open/mmap of such a file is faster than reading it from the object store, which involves inflating and applying delta. This however kicked in even when the checked-out copy needs to go through the convert-to-git conversion (including the clean filter), which defeats the whole point of the optimization. The optimization has been disabled when the conversion is necessary. * jk/diff-do-not-reuse-wtf-needs-cleaning: diff: do not reuse worktree files that need "clean" conversion	2016-08-03 15:10:29 -07:00
Kevin Willford	3e8e32c32e	patch-ids: add flag to create the diff patch id using header only data This will allow a diff patch id to be created using only the header data so that the contents of the file will not have to be loaded. Signed-off-by: Kevin Willford <kcwillford@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-29 14:10:01 -07:00
Jeff King	06dec439a3	diff: do not reuse worktree files that need "clean" conversion When accessing a blob for a diff, we may try to reuse file contents in the working tree, under the theory that it is faster to mmap those file contents than it would be to extract the content from the object database. When we have to filter those contents, though, that assumption does not hold. Even for our internal conversions like CRLF, we have to allocate and fill a new buffer anyway. But much worse, for external clean filters we have to exec an arbitrary script, and we have no idea how expensive it may be to run. So let's skip this optimization when conversion into git's "clean" form is required. This applies whenever the "want_file" flag is false. When it's true, the caller actually wants the smudged worktree contents, which the reused file by definition already has (in fact, this is a key optimization going the other direction, since reusing the worktree file there lets us skip smudge filters). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-22 12:31:24 -07:00
Junio C Hamano	a63d31b4d3	Merge branch 'bc/cocci' Conversion from unsigned char sha1[20] to struct object_id continues. * bc/cocci: diff: convert prep_temp_blob() to struct object_id merge-recursive: convert merge_recursive_generic() to object_id merge-recursive: convert leaf functions to use struct object_id merge-recursive: convert struct merge_file_info to object_id merge-recursive: convert struct stage_data to use object_id diff: rename struct diff_filespec's sha1_valid member diff: convert struct diff_filespec to struct object_id coccinelle: apply object_id Coccinelle transformations coccinelle: convert hashcpy() with null_sha1 to hashclr() contrib/coccinelle: add basic Coccinelle transforms hex: add oid_to_hex_r()	2016-07-19 13:22:16 -07:00
brian m. carlson	09bdff29e1	diff: convert prep_temp_blob() to struct object_id All of the callers of this function use struct object_id, so convert it to use struct object_id in its arguments and internally. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-28 11:39:02 -07:00
brian m. carlson	41c9560ee5	diff: rename struct diff_filespec's sha1_valid member Now that this struct's sha1 member is called "oid", update the comment and the sha1_valid member to be called "oid_valid" instead. The following Coccinelle semantic patch was used to implement this, followed by the transformations in object_id.cocci: @@ struct diff_filespec o; @@ - o.sha1_valid + o.oid_valid @@ struct diff_filespec *p; @@ - p->sha1_valid + p->oid_valid Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-28 11:39:02 -07:00
brian m. carlson	a0d12c4433	diff: convert struct diff_filespec to struct object_id Convert struct diff_filespec's sha1 member to use a struct object_id called "oid" instead. The following Coccinelle semantic patch was used to implement this, followed by the transformations in object_id.cocci: @@ struct diff_filespec o; @@ - o.sha1 + o.oid.hash @@ struct diff_filespec *p; @@ - p->sha1 + p->oid.hash Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-28 11:39:02 -07:00
brian m. carlson	f449198e58	coccinelle: convert hashcpy() with null_sha1 to hashclr() hashcpy with null_sha1 as the source is equivalent to hashclr. In addition to being simpler, using hashclr may give the compiler a chance to optimize better. Convert instances of hashcpy with the source argument of null_sha1 to hashclr. This transformation was implemented using the following semantic patch: @@ expression E1; @@ -hashcpy(E1, null_sha1); +hashclr(E1); Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-28 11:39:02 -07:00
Johannes Schindelin	afc676f2c9	diff: do not color output when --color=auto and --output=<file> is given "git diff --output=<file> --color=auto" used to show the ANSI color sequence in the resulting file when the standard output is connected to a terminal, because --color=auto check always checks the standard output, not the actual file that receives the output. We could correct this by using freopen(3) to redirect the standard output to the specified file, which is in line with how format-patch used to match the world order, but following the same reasoning as the earlier "format-patch: explicitly switch off color when writing to files", let's be more strict by bypassing the "auto" check when the --output=<file> option is in use. Strictly speaking, this is a backwards-incompatible change, but it is highly unlikely that any user would want to see ANSI color sequences in a file. The reason this was not caught earlier is most likely that either --output=<file> is not used, or only when stdout is redirected anyway. Users can still give --color=always if they want a colored diff in the resulting file. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-28 11:26:47 -07:00
Junio C Hamano	e5f7675544	Merge branch 'jk/diff-compact-heuristic' It turns out that the earlier effort to update the heuristics may want to use a bit more time to mature. Turn it off by default. * jk/diff-compact-heuristic: diff: disable compaction heuristic for now	2016-06-10 15:26:06 -07:00
Junio C Hamano	5580b271af	diff: disable compaction heuristic for now http://lkml.kernel.org/g/20160610075043.GA13411@sigill.intra.peff.net reports that a change to add a new "function" with common ending with the existing one at the end of the file is shown like this: def foo do_foo_stuff() + common_ending() +end + +def bar + do_bar_stuff() + common_ending() end when the new heuristic is in use. In reality, the change is to add the blank line before "def bar" and everything below, which is what the code without the new heuristic shows. Disable the heuristics by default, and resurrect the documentation for the option and the configuration variables, while clearly marking the feature as still experimental. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-10 13:45:23 -07:00
Junio C Hamano	0018da1088	Merge branch 'jk/diff-compact-heuristic' Patch output from "git diff" and friends has been tweaked to be more readable by using a blank line as a strong hint that the contents before and after it belong to a logically separate unit. * jk/diff-compact-heuristic: diff: undocument the compaction heuristic knobs for experimentation xdiff: implement empty line chunk heuristic xdiff: add recs_match helper function	2016-05-06 14:45:46 -07:00
Stefan Beller	d634d61ed6	xdiff: implement empty line chunk heuristic In order to produce the smallest possible diff and combine several diff hunks together, we implement a heuristic from GNU Diff which moves diff hunks forward as far as possible when we find common context above and below a diff hunk. This sometimes produces less readable diffs when writing C, Shell, or other programming languages, ie: ... /* + * + * + / + +/ ... instead of the more readable equivalent of ... +/* + * + * + / + / ... Implement the following heuristic to (optionally) produce the desired output. If there are diff chunks which can be shifted around, shift each hunk such that the last common empty line is below the chunk with the rest of the context above. This heuristic appears to resolve the above example and several other common issues without producing significantly weird results. However, as with any heuristic it is not really known whether this will always be more optimal. Thus, it can be disabled via diff.compactionHeuristic. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-19 10:53:34 -07:00
Junio C Hamano	5d2a30d7d8	Merge branch 'mm/diff-renames-default' The end-user facing Porcelain level commands like "diff" and "log" now enable rename detection by default. * mm/diff-renames-default: diff: activate diff.renames by default log: introduce init_log_defaults() t: add tests for diff.renames (true/false/unset) t4001-diff-rename: wrap file creations in a test Documentation/diff-config: fix description of diff.renames	2016-04-03 10:29:22 -07:00
Junio C Hamano	11529ecec9	Merge branch 'jk/tighten-alloc' Update various codepaths to avoid manually-counted malloc(). * jk/tighten-alloc: (22 commits) ewah: convert to REALLOC_ARRAY, etc convert ewah/bitmap code to use xmalloc diff_populate_gitlink: use a strbuf transport_anonymize_url: use xstrfmt git-compat-util: drop mempcpy compat code sequencer: simplify memory allocation of get_message test-path-utils: fix normalize_path_copy output buffer size fetch-pack: simplify add_sought_entry fast-import: simplify allocation in start_packfile write_untracked_extension: use FLEX_ALLOC helper prepare_{git,shell}_cmd: use argv_array use st_add and st_mult for allocation size computation convert trivial cases to FLEX_ARRAY macros use xmallocz to avoid size arithmetic convert trivial cases to ALLOC_ARRAY convert manual allocations to argv_array argv-array: add detach function add helpers for allocating flex-array structs harden REALLOC_ARRAY and xcalloc against size_t overflow tree-diff: catch integer overflow in combine_diff_path allocation ...	2016-02-26 13:37:16 -08:00
Junio C Hamano	3ed26a44b3	Merge branch 'jk/more-comments-on-textconv' The memory ownership rule of fill_textconv() API, which was a bit tricky, has been documented a bit better. * jk/more-comments-on-textconv: diff: clarify textconv interface	2016-02-26 13:37:15 -08:00
Matthieu Moy	5404c116aa	diff: activate diff.renames by default Rename detection is a very convenient feature, and new users shouldn't have to dig in the documentation to benefit from it. Potential objections to activating rename detection are that it sometimes fails, and it is sometimes slow. But rename detection is already activated by default in several cases like "git status" and "git merge", so activating diff.renames does not fundamentally change the situation. When the rename detection fails, it now fails consistently between "git diff" and "git status". This setting does not affect plumbing commands, hence well-written scripts will not be affected. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-25 11:31:02 -08:00
Jeff King	b1ddfb9151	diff_populate_gitlink: use a strbuf We allocate 100 bytes to hold the "Submodule commit ..." text. This is enough, but it's not immediately obvious that this is the case, and we have to repeat the magic 100 twice. We could get away with xstrfmt here, but we want to know the size, as well, so let's use a real strbuf. And while we're here, we can clean up the logic around size_only. It currently sets and clears the "data" field pointlessly, and leaves the "should_free" flag on even after we have cleared the data. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-22 14:51:09 -08:00
Jeff King	96ffc06f72	convert trivial cases to FLEX_ARRAY macros Using FLEX_ARRAY macros reduces the amount of manual computation size we have to do. It also ensures we don't overflow size_t, and it makes sure we write the same number of bytes that we allocated. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-22 14:51:09 -08:00
Jeff King	a64e6a44c6	diff: clarify textconv interface The memory allocation scheme for the textconv interface is a bit tricky, and not well documented. It was originally designed as an internal part of diff.c (matching fill_mmfile), but gradually was made public. Refactoring it is difficult, but we can at least improve the situation by documenting the intended flow and enforcing it with an in-code assertion. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-22 10:40:35 -08:00
Junio C Hamano	02dab5d399	Merge branch 'nd/diff-with-path-params' into maint A few options of "git diff" did not work well when the command was run from a subdirectory. * nd/diff-with-path-params: diff: make -O and --output work in subdirectory diff-no-index: do not take a redundant prefix argument	2016-02-05 14:54:15 -08:00
Junio C Hamano	c167a96e68	Merge branch 'nd/diff-with-path-params' A few options of "git diff" did not work well when the command was run from a subdirectory. * nd/diff-with-path-params: diff: make -O and --output work in subdirectory diff-no-index: do not take a redundant prefix argument	2016-02-03 14:16:04 -08:00
Duy Nguyen	a97262c62f	diff: make -O and --output work in subdirectory Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-21 10:45:13 -08:00
Junio C Hamano	433cc7e3fb	Merge branch 'tk/sigchain-unnecessary-post-tempfile' Remove no-longer used #include. * tk/sigchain-unnecessary-post-tempfile: shallow: remove unused #include "sigchain.h" read-cache: remove unused #include "sigchain.h" diff: remove unused #include "sigchain.h" credential-cache--daemon: remove unused #include "sigchain.h"	2015-10-29 13:59:18 -07:00
Tobias Klauser	086ecab1a7	diff: remove unused #include "sigchain.h" After switching to use the tempfile module in commit `284098f1` (diff: use tempfile module), no declarations from sigchain.h are used in diff.c anymore. Thus, remove the #include. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-10-22 11:12:37 -07:00
Junio C Hamano	78891795df	Merge branch 'jk/war-on-sprintf' Many allocations that were manually counted (correctly) and followed by strcpy/sprintf have been replaced with less error-prone constructs such as xstrfmt. Macintosh-specific breakage was noticed and corrected in this reroll. * jk/war-on-sprintf: (70 commits) name-rev: use strip_suffix to avoid magic numbers use strbuf_complete to conditionally append slash fsck: use for_each_loose_file_in_objdir Makefile: drop D_INO_IN_DIRENT build knob fsck: drop inode-sorting code convert strncpy to memcpy notes: document length of fanout path with a constant color: add color_set helper for copying raw colors prefer memcpy to strcpy help: clean up kfmclient munging receive-pack: simplify keep_arg computation avoid sprintf and strcpy with flex arrays use alloc_ref rather than hand-allocating "struct ref" color: add overflow checks for parsing colors drop strcpy in favor of raw sha1_to_hex use sha1_to_hex_r() instead of strcpy daemon: use cld->env_array when re-spawning stat_tracking_info: convert to argv_array http-push: use an argv_array for setup_revisions fetch-pack: use argv_array for index-pack / unpack-objects ...	2015-10-20 15:24:01 -07:00
Jeff King	d59f765ac9	use sha1_to_hex_r() instead of strcpy Before sha1_to_hex_r() existed, a simple way to get hex sha1 into a buffer was with: strcpy(buf, sha1_to_hex(sha1)); This isn't wrong (assuming the buf is 41 characters), but it makes auditing the code base for bad strcpy() calls harder, as these become false positives. Let's convert them to sha1_to_hex_r(), and likewise for some calls to find_unique_abbrev(). While we're here, we'll double-check that all of the buffers are correctly sized, and use the more obvious GIT_SHA1_HEXSZ constant. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-10-05 11:08:05 -07:00
Junio C Hamano	3adc4ec7b9	Sync with v2.5.4	2015-09-28 19:16:54 -07:00
Junio C Hamano	11a458befc	Sync with 2.4.10	2015-09-28 15:33:56 -07:00
Junio C Hamano	6343e2f6f2	Sync with 2.3.10	2015-09-28 15:28:31 -07:00
Jeff King	3efb988098	react to errors in xdi_diff When we call into xdiff to perform a diff, we generally lose the return code completely, typically by ignoring the return of our xdi_diff wrapper, but sometimes we even propagate that return value up and then ignore it later. This can lead to us silently producing incorrect diffs (e.g., "git log" might produce no output at all, not even a diff header, for a content-level diff). In practice this does not happen very often, because the typical reason for xdiff to report failure is that its malloc() failed (it uses straight malloc, and not our xmalloc wrapper). But it could also happen when xdiff triggers one of our callbacks, which returns an error (e.g., outf() in builtin/rerere.c tries to report a write failure in this way). And the next patch also plans to add more failure modes. Let's notice an error return from xdiff and react appropriately. In most of the diff.c code, we can simply die(), which matches the surrounding code (e.g., that is what we do if we fail to load a file for diffing in the first place). This is not that elegant, but we are probably better off dying to let the user know there was a problem, rather than simply generating bogus output. We could also just die() directly in xdi_diff, but the callers typically have a bit more context, and can provide a better message (and if we do later decide to pass errors up, we're one step closer to doing so). There is one interesting case, which is in diff_grep(). Here if we cannot generate the diff, there is nothing to match, and we silently return "no hits". This is actually what the existing code does already, but we make it a little more explicit. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-09-28 14:57:10 -07:00
Jeff King	5096d4909f	convert trivial sprintf / strcpy calls to xsnprintf We sometimes sprintf into fixed-size buffers when we know that the buffer is large enough to fit the input (either because it's a constant, or because it's numeric input that is bounded in size). Likewise with strcpy of constant strings. However, these sites make it hard to audit sprintf and strcpy calls for buffer overflows, as a reader has to cross-reference the size of the array with the input. Let's use xsnprintf instead, which communicates to a reader that we don't expect this to overflow (and catches the mistake in case we do). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-09-25 10:18:18 -07:00
Junio C Hamano	5a4f07b322	Merge branch 'hv/submodule-config' The gitmodules API accessed from the C code learned to cache stuff lazily. * hv/submodule-config: submodule: allow erroneous values for the fetchRecurseSubmodules option submodule: use new config API for worktree configurations submodule: extract functions for config set and lookup submodule: implement a config API for lookup of .gitmodules values	2015-08-31 15:38:52 -07:00
Junio C Hamano	db86e61cbb	Merge branch 'mh/tempfile' The "lockfile" API has been rebuilt on top of a new "tempfile" API. * mh/tempfile: credential-cache--daemon: use tempfile module credential-cache--daemon: delete socket from main() gc: use tempfile module to handle gc.pid file lock_repo_for_gc(): compute the path to "gc.pid" only once diff: use tempfile module setup_temporary_shallow(): use tempfile module write_shared_index(): use tempfile module register_tempfile(): new function to handle an existing temporary file tempfile: add several functions for creating temporary files prepare_tempfile_object(): new function, extracted from create_tempfile() tempfile: a new module for handling temporary files commit_lock_file(): use get_locked_file_path() lockfile: add accessor get_lock_file_path() lockfile: add accessors get_lock_file_fd() and get_lock_file_fp() create_bundle(): duplicate file descriptor to avoid closing it twice lockfile: move documentation to lockfile.h and lockfile.c	2015-08-25 14:57:09 -07:00
Heiko Voigt	851e18c385	submodule: use new config API for worktree configurations We remove the extracted functions and directly parse into and read out of the cache. This allows us to have one unified way of accessing submodule configuration values specific to single submodules. Regardless whether we need to access a configuration from history or from the worktree. Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-08-19 11:43:10 -07:00
Michael Haggerty	284098f13f	diff: use tempfile module Also add some code comments explaining how the fields in "struct diff_tempfile" are used. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-08-12 14:49:43 -07:00
Junio C Hamano	2dded96052	Merge branch 'dt/log-follow-config' Add a new configuration variable to enable "--follow" automatically when "git log" is run with one pathspec argument. * dt/log-follow-config: log: add "log.follow" configuration variable	2015-08-03 11:01:20 -07:00
Junio C Hamano	abecddea25	Merge branch 'jc/diff-ws-error-highlight' A hotfix to a new feature in 2.5.0-rc. * jc/diff-ws-error-highlight: diff: parse ws-error-highlight option more strictly	2015-07-15 12:30:14 -07:00
René Scharfe	3f4f17b51b	diff: parse ws-error-highlight option more strictly Check if a matched token is followed by a delimiter before advancing the pointer arg. This avoids accepting composite words like "allnew" or "defaultcontext" and misparsing them as "new" or "context". Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-07-12 09:55:23 -07:00
David Turner	076c98372e	log: add "log.follow" configuration variable People who work on projects with mostly linear history with frequent whole file renames may want to always use "git log --follow" when inspecting the life of the content that lives in a single path. Teach the command to behave as if "--follow" was given from the command line when log.follow configuration variable is set and there is one (and only one) path on the command line. Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-07-09 10:24:23 -07:00
Junio C Hamano	6998d890c7	Merge branch 'jk/color-diff-plain-is-context' into maint "color.diff.plain" was a misnomer; give it 'color.diff.context' as a more logical synonym. * jk/color-diff-plain-is-context: diff.h: rename DIFF_PLAIN color slot to DIFF_CONTEXT diff: accept color.diff.context as a synonym for "plain"	2015-06-25 11:02:11 -07:00
Junio C Hamano	db65170ee5	Merge branch 'jk/color-diff-plain-is-context' "color.diff.plain" was a misnomer; give it 'color.diff.context' as a more logical synonym. * jk/color-diff-plain-is-context: diff.h: rename DIFF_PLAIN color slot to DIFF_CONTEXT diff: accept color.diff.context as a synonym for "plain"	2015-06-11 09:29:53 -07:00
Junio C Hamano	709cd912d4	Merge branch 'jc/diff-ws-error-highlight' Allow whitespace breakages in deleted and context lines to be also painted in the output. * jc/diff-ws-error-highlight: diff.c: --ws-error-highlight=<kind> option diff.c: add emit_del_line() and emit_context_line() t4015: separate common setup and per-test expectation t4015: modernise style	2015-06-11 09:29:51 -07:00
Jeff King	8dbf3eb685	diff.h: rename DIFF_PLAIN color slot to DIFF_CONTEXT The latter is a much more descriptive name (and we support "color.diff.context" now). This also updates the name of any local variables which were used to store the color. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-27 13:54:42 -07:00
Jeff King	74b15bfbf6	diff: accept color.diff.context as a synonym for "plain" The term "plain" is a bit ambiguous; let's allow the more specific "context", but keep "plain" around for compatibility. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-27 13:54:37 -07:00
Junio C Hamano	b8767f791c	diff.c: --ws-error-highlight=<kind> option Traditionally, we only cared about whitespace breakages introduced in new lines. Some people want to paint whitespace breakages on old lines, too. When they see a whitespace breakage on a new line, they can spot the same kind of whitespace breakage on the corresponding old line and want to say "Ah, those breakages are there but they were inherited from the original, so let's not touch them for now." Introduce `--ws-error-highlight=<kind>` option, that lets them pass a comma separated list of `old`, `new`, and `context` to specify what lines to highlight whitespace errors on. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-26 23:00:01 -07:00
Junio C Hamano	0e383e185a	diff.c: add emit_del_line() and emit_context_line() Traditionally, we only had emit_add_line() helper, which knows how to find and paint whitespace breakages on the given line, because we only care about whitespace breakages introduced in new lines. The context lines and old (i.e. deleted) lines are emitted with a simpler emit_line_0() that paints the entire line in plain or old colors. Identify callers of emit_line_0() that show deleted lines and context lines, have them call new helpers, emit_del_line() and emit_context_line(), so that we can later tweak what is done to these two classes of lines. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-26 22:13:02 -07:00
Junio C Hamano	a393c6bfd9	Merge branch 'rs/deflate-init-cleanup' into maint Code simplification. * rs/deflate-init-cleanup: zlib: initialize git_zstream in git_deflate_init{,_gzip,_raw}	2015-03-23 11:23:38 -07:00
Junio C Hamano	6902c4da58	Merge branch 'rs/deflate-init-cleanup' Code simplification. * rs/deflate-init-cleanup: zlib: initialize git_zstream in git_deflate_init{,_gzip,_raw}	2015-03-17 16:01:26 -07:00
Junio C Hamano	a4b4f9b8e3	Merge branch 'mk/diff-shortstat-dirstat-fix' into maint "git diff --shortstat --dirstat=changes" showed a dirstat based on lines that was never asked by the end user in addition to the dirstat that the user asked for. * mk/diff-shortstat-dirstat-fix: diff --shortstat --dirstat: remove duplicate output	2015-03-13 22:56:04 -07:00
Junio C Hamano	b6488fe191	Merge branch 'mk/diff-shortstat-dirstat-fix' "git diff --shortstat --dirstat=changes" showed a dirstat based on lines that was never asked by the end user in addition to the dirstat that the user asked for. * mk/diff-shortstat-dirstat-fix: diff --shortstat --dirstat: remove duplicate output	2015-03-06 15:02:29 -08:00
René Scharfe	9a6f1287fb	zlib: initialize git_zstream in git_deflate_init{,_gzip,_raw} Clear the git_zstream variable at the start of git_deflate_init() etc. so that callers don't have to do that. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-05 15:46:03 -08:00
Mårten Kongstad	ab27389aff	diff --shortstat --dirstat: remove duplicate output When --shortstat is used in conjunction with --dirstat=changes, git diff will output the dirstat information twice: first as calculated by the 'lines' algorithm, then as calculated by the 'changes' algorithm: $ git diff --dirstat=changes,10 --shortstat v2.2.0..v2.2.1 23 files changed, 453 insertions(+), 54 deletions(-) 33.5% Documentation/RelNotes/ 26.2% t/ 46.6% Documentation/RelNotes/ 16.6% t/ The same duplication happens for --shortstat together with --dirstat=files, but not for --shortstat together with --dirstat=lines. Limit output to only include one dirstat part, calculated as specified by the --dirstat parameter. Also, add test for this. Signed-off-by: Mårten Kongstad <marten.kongstad@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-02 11:31:27 -08:00
Junio C Hamano	b946576839	Merge branch 'jn/parse-config-slot' Code cleanup. * jn/parse-config-slot: color_parse: do not mention variable name in error message pass config slots as pointers instead of offsets	2014-10-20 12:23:48 -07:00
Jeff King	f6c5a2968c	color_parse: do not mention variable name in error message Originally the color-parsing function was used only for config variables. It made sense to pass the variable name so that the die() message could be something like: $ git -c color.branch.plain=bogus branch fatal: bad color value 'bogus' for variable 'color.branch.plain' These days we call it in other contexts, and the resulting error messages are a little confusing: $ git log --pretty='%C(bogus)' fatal: bad color value 'bogus' for variable '--pretty format' $ git config --get-color foo.bar bogus fatal: bad color value 'bogus' for variable 'command line' This patch teaches color_parse to complain only about the value, and then return an error code. Config callers can then propagate that up to the config parser, which mentions the variable name. Other callers can provide a custom message. After this patch these three cases now look like: $ git -c color.branch.plain=bogus branch error: invalid color value: bogus fatal: unable to parse 'color.branch.plain' from command-line config $ git log --pretty='%C(bogus)' error: invalid color value: bogus fatal: unable to parse --pretty format $ git config --get-color foo.bar bogus error: invalid color value: bogus fatal: unable to parse default color value Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-14 11:01:21 -07:00
Junio C Hamano	bedd3b4b7b	Merge branch 'nd/large-blobs' Teach a few codepaths to punt (instead of dying) when large blobs that would not fit in core are involved in the operation. * nd/large-blobs: diff: shortcut for diff'ing two binary SHA-1 objects diff --stat: mark any file larger than core.bigfilethreshold binary diff.c: allow to pass more flags to diff_populate_filespec sha1_file.c: do not die failing to malloc in unpack_compressed_entry wrapper.c: introduce gentle xmallocz that does not die()	2014-09-11 10:33:33 -07:00
René Scharfe	d318027932	run-command: introduce CHILD_PROCESS_INIT Most struct child_process variables are cleared using memset first after declaration. Provide a macro, CHILD_PROCESS_INIT, that can be used to initialize them statically instead. That's shorter, doesn't require a function call and is slightly more readable (especially given that we already have STRBUF_INIT, ARGV_ARRAY_INIT etc.). Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-20 09:53:37 -07:00
Nguyễn Thái Ngọc Duy	1aaf69e669	diff: shortcut for diff'ing two binary SHA-1 objects If we are given two SHA-1 and asked to determine if they are different (but not _what_ differences), we know right away by comparing SHA-1. A side effect of this patch is, because large files are marked binary, diff-tree will not need to unpack them. 'diff-index --cached' will not either. But 'diff-files' still does. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-18 10:16:55 -07:00
Nguyễn Thái Ngọc Duy	6bf3b81348	diff --stat: mark any file larger than core.bigfilethreshold binary Too large files may lead to failure to allocate memory. If it happens here, it could impact quite a few commands that involve diff. Moreover, too large files are inefficient to compare anyway (and most likely non-text), so mark them binary and skip looking at their content. Noticed-by: Dale R. Worley <worley@alum.mit.edu> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-18 10:16:45 -07:00
Nguyễn Thái Ngọc Duy	8e5dd3d654	diff.c: allow to pass more flags to diff_populate_filespec Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-18 10:16:35 -07:00
Junio C Hamano	cfececfe1f	Merge branch 'bg/xcalloc-nmemb-then-size' into maint * bg/xcalloc-nmemb-then-size: transport-helper.c: rearrange xcalloc arguments remote.c: rearrange xcalloc arguments reflog-walk.c: rearrange xcalloc arguments pack-revindex.c: rearrange xcalloc arguments notes.c: rearrange xcalloc arguments imap-send.c: rearrange xcalloc arguments http-push.c: rearrange xcalloc arguments diff.c: rearrange xcalloc arguments config.c: rearrange xcalloc arguments commit.c: rearrange xcalloc arguments builtin/remote.c: rearrange xcalloc arguments builtin/ls-remote.c: rearrange xcalloc arguments	2014-07-22 10:25:17 -07:00
René Scharfe	cedc61a998	strbuf: use strbuf_addstr() for adding C strings Avoid code duplication and let strbuf_addstr() call strlen() for us. Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-17 13:33:52 -07:00
Junio C Hamano	cb4575fb18	Merge branch 'jk/diff-follow-must-take-one-pathspec' into maint "git format-patch" did not enforce the rule that the "--follow" option from the log/diff family of commands must be used with exactly one pathspec. * jk/diff-follow-must-take-one-pathspec: move "--follow needs one pathspec" rule to diff_setup_done	2014-06-25 11:47:23 -07:00
Jeff King	0539cc0038	stat_opt: check extra strlen call As in earlier commits, the diff option parser uses starts_with to find that an argument starts with "--stat-", and then adds strlen("stat-") to find the rest of the option. However, in this case the starts_with and the strlen are separated across functions, making it easy to call the latter without the former. Let's use skip_prefix instead of raw pointer arithmetic to catch such a case. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-20 10:45:19 -07:00
Jeff King	95b567c7c3	use skip_prefix to avoid repeating strings It's a common idiom to match a prefix and then skip past it with strlen, like: if (starts_with(foo, "bar")) foo += strlen("bar"); This avoids magic numbers, but means we have to repeat the string (and there is no compiler check that we didn't make a typo in one of the strings). We can use skip_prefix to handle this case without repeating ourselves. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-20 10:44:45 -07:00
Jeff King	ae021d8791	use skip_prefix to avoid magic numbers It's a common idiom to match a prefix and then skip past it with a magic number, like: if (starts_with(foo, "bar")) foo += 3; This is easy to get wrong, since you have to count the prefix string yourself, and there's no compiler check if the string changes. We can use skip_prefix to avoid the magic numbers here. Note that some of these conversions could be much shorter. For example: if (starts_with(arg, "--foo=")) { bar = arg + 6; continue; } could become: if (skip_prefix(arg, "--foo=", &bar)) continue; However, I have left it as: if (skip_prefix(arg, "--foo=", &v)) { bar = v; continue; } to visually match nearby cases which need to actually process the string. Like: if (skip_prefix(arg, "--foo=", &v)) { bar = atoi(v); continue; } Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-20 10:44:45 -07:00
Jeff King	9e1a5ebe52	parse_diff_color_slot: drop ofs parameter This function originally took a whole config variable name ("var") and an offset ("ofs"). It checked "var+ofs" against each color slot, but reported errors using the whole "var". However, since `8b8e862` (ignore unknown color configuration, 2009-12-12), it returns -1 rather than printing its own error, and therefore only cares about var+ofs. We can drop the ofs parameter and teach its sole caller to derive the pointer itself. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-18 14:56:17 -07:00
Junio C Hamano	a634a6d209	Merge branch 'bg/xcalloc-nmemb-then-size' Like calloc(3), xcalloc() takes nmemb and then size. * bg/xcalloc-nmemb-then-size: transport-helper.c: rearrange xcalloc arguments remote.c: rearrange xcalloc arguments reflog-walk.c: rearrange xcalloc arguments pack-revindex.c: rearrange xcalloc arguments notes.c: rearrange xcalloc arguments imap-send.c: rearrange xcalloc arguments http-push.c: rearrange xcalloc arguments diff.c: rearrange xcalloc arguments config.c: rearrange xcalloc arguments commit.c: rearrange xcalloc arguments builtin/remote.c: rearrange xcalloc arguments builtin/ls-remote.c: rearrange xcalloc arguments	2014-06-16 12:17:50 -07:00
Junio C Hamano	b0e2c999af	Merge branch 'jk/diff-follow-must-take-one-pathspec' * jk/diff-follow-must-take-one-pathspec: move "--follow needs one pathspec" rule to diff_setup_done	2014-06-16 10:07:09 -07:00
Junio C Hamano	6779e43b0d	Merge branch 'jk/external-diff-use-argv-array' Code clean-up (and a bugfix which has been merged for 2.0). * jk/external-diff-use-argv-array: run_external_diff: refactor cmdline setup logic run_external_diff: hoist common bits out of conditional run_external_diff: drop fflush(NULL) run_external_diff: clean up error handling run_external_diff: use an argv_array for the environment	2014-06-03 12:06:43 -07:00
Junio C Hamano	8eaf517835	Merge branch 'ks/tree-diff-nway' Instead of running N pair-wise diff-trees when inspecting a N-parent merge, find the set of paths that were touched by walking N+1 trees in parallel. These set of paths can then be turned into N pair-wise diff-tree results to be processed through rename detections and such. And N=2 case nicely degenerates to the usual 2-way diff-tree, which is very nice. * ks/tree-diff-nway: mingw: activate alloca combine-diff: speed it up, by using multiparent diff tree-walker directly tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Portable alloca for Git tree-diff: reuse base str(buf) memory on sub-tree recursion tree-diff: no need to call "full" diff_tree_sha1 from show_path() tree-diff: rework diff_tree interface to be sha1 based tree-diff: diff_tree() should now be static tree-diff: remove special-case diff-emitting code for empty-tree cases tree-diff: simplify tree_entry_pathcmp tree-diff: show_path prototype is not needed anymore tree-diff: rename compare_tree_entry -> tree_entry_pathcmp tree-diff: move all action-taking code out of compare_tree_entry() tree-diff: don't assume compare_tree_entry() returns -1,0,1 tree-diff: consolidate code for emitting diffs and recursion in one place tree-diff: show_tree() is not needed tree-diff: no need to pass match to skip_uninteresting() tree-diff: no need to manually verify that there is no mode change for a path combine-diff: move changed-paths scanning logic into its own function combine-diff: move show_log_first logic/action out of paths scanning	2014-06-03 12:06:40 -07:00
Brian Gesiak	1a4927c5c5	diff.c: rearrange xcalloc arguments xcalloc() takes two arguments: the number of elements and their size. diffstat_add() passes the arguments in reverse order, passing the size of a diffstat_file, followed by the number of diffstat_file to be allocated. Rearrange them so they are in the correct order. Signed-off-by: Brian Gesiak <modocache@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-05-27 14:02:03 -07:00
Jeff King	dd63f169d9	move "--follow needs one pathspec" rule to diff_setup_done Because of the way "--follow" is implemented, we must have exactly one pathspec. "git log" enforces this restriction, but other users of the revision traversal code do not. For example, "git format-patch --follow" will segfault during try_to_follow_renames, as we have no pathspecs at all. We can push this check down into diff_setup_done, which is probably a better place anyway. It is the diff code that introduces this restriction, so other parts of the code should not need to care themselves. Reported-by: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-05-20 11:09:03 -07:00
Junio C Hamano	5f11a7aad0	Merge branch 'jk/external-diff-use-argv-array' (early part) Crash fix for codepath that miscounted the necessary size for an array when spawning an external diff program. * 'jk/external-diff-use-argv-array' (early part): run_external_diff: use an argv_array for the command line	2014-04-28 15:47:35 -07:00
Jeff King	f3efe78782	run_external_diff: refactor cmdline setup logic The current logic makes it hard to see what gets put onto the command line in which cases. Pulling out a helper function lets us see that we have two sets of file data, and the second set either uses the original name, or the "other" renamed/copy name. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-21 10:32:19 -07:00
Jeff King	0d4217d92e	run_external_diff: hoist common bits out of conditional Whether we have diff_filespecs to give to the diff command or not, we always are going to run the program and pass it the pathname. Let's pull that duplicated part out of the conditional to make it more obvious. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-21 10:32:07 -07:00
Jeff King	5b88caa417	run_external_diff: drop fflush(NULL) This fflush was added in `d5535ec` (Use run_command() to spawn external diff programs instead of fork/exec., 2007-10-19), because flushing buffers before forking is a good habit. But later, `7d0b18a` (Add output flushing before fork(), 2008-08-04) added it to the generic run-command interface, meaning that our flush here is redundant. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-21 10:31:51 -07:00
Jeff King	89294d143d	run_external_diff: clean up error handling When the external diff reports an error, we try to clean up and die. However, we can make this process a bit simpler: 1. We do not need to bother freeing memory, since we are about to exit. Nor do we need to clean up our tempfiles, since the atexit() handler will do it for us. So we can die as soon as we see the error. 3. We can just call die() rather than fprintf/exit. This does technically change our exit code, but the exit code of "1" is not meaningful here. In fact, it is probably wrong, since "1" from diff usually means "completed successfully, but there were differences". And while we're there, we can mark the error message for translation, and drop the full stop at the end to make it more like our other messages. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-21 10:31:36 -07:00
Jeff King	ae049c955c	run_external_diff: use an argv_array for the environment We currently use static buffers and a static array for formatting the environment passed to the external diff. There's nothing wrong in the code, but it is much easier to verify that it is correct if we use a dynamic argv_array. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-21 10:30:33 -07:00
Jeff King	82fbf269b9	run_external_diff: use an argv_array for the command line We currently generate the command-line for the external command using a fixed-length array of size 10. But if there is a rename, we actually need 11 elements (10 items, plus a NULL), and end up writing a random NULL onto the stack. Rather than bump the limit, let's just use an argv_array, which makes this sort of error impossible. Noticed-by: Max L <infthi.inbox@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-21 10:29:50 -07:00
Jiang Xin	d1d96a82bb	i18n: remove obsolete comments for translators in diffstat generation Since we do not translate diffstat any more, remove the obsolete comments. Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-17 11:09:56 -07:00
Junio C Hamano	d59c12d7ad	Merge branch 'jl/nor-or-nand-and' Eradicate mistaken use of "nor" (that is, essentially "nor" used not in "neither A nor B" ;-)) from in-code comments, command output strings, and documentations. * jl/nor-or-nand-and: code and test: fix misuses of "nor" comments: fix misuses of "nor" contrib: fix misuses of "nor" Documentation: fix misuses of "nor"	2014-04-08 12:00:28 -07:00
Kirill Smelkov	7195fbfaf5	combine-diff: speed it up, by using multiparent diff tree-walker directly As was recently shown in "combine-diff: optimize combine_diff_path sets intersection", combine-diff runs very slowly. In that commit we optimized paths sets intersection, but that accounted only for ~ 25% of the slowness, and as my tracing showed, for linux.git v3.10..v3.11, for merges a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). In previous commit, we described the problem in more details, and reworked the diff tree-walker to be general one - i.e. to work in multiple parent case too. Now is the time to take advantage of it for finding paths for combine diff. The implementation is straightforward - if we know, we can get generated diff paths directly, and at present that means no diff filtering or rename/copy detection was requested(*), we can call multiparent tree-walker directly and get ready paths. (*) because e.g. at present, all diffcore transformations work on diff_filepair queues, but in the future, that limitation can be lifted, if filters would operate directly on combine_diff_paths. Timings for `git log --raw --no-abbrev --no-renames` without `-c` ("git log") and with `-c` ("git log -c") and with `-c --merges` ("git log -c --merges") before and after the patch are as follows: linux.git v3.10..v3.11 log log -c log -c --merges before 1.9s 16.4s 15.2s after 1.9s 2.4s 1.1s The result stayed the same. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-07 14:41:49 -07:00
Kirill Smelkov	72441af7c4	tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Previously diff_tree(), which is now named ll_diff_tree_sha1(), was generating diff_filepair(s) for two trees t1 and t2, and that was usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes a commit introduces. In Git, however, we have fundamentally built flexibility in that a commit can have many parents - 1 for a plain commit, 2 for a simple merge, but also more than 2 for merging several heads at once. For merges there is a so called combine-diff, which shows diff, a merge introduces by itself, omitting changes done by any parent. That works through first finding paths, that are different to all parents, and then showing generalized diff, with separate columns for +/- for each parent. The code lives in combine-diff.c . There is an impedance mismatch, however, in that a commit could generally have any number of parents, and that while diffing trees, we divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there is no special casing for multiple parents commits in e.g. revision-walker . That impedance mismatch hurts performance badly for generating combined diffs - in "combine-diff: optimize combine_diff_path sets intersection" I've already removed some slowness from it, but from the timings provided there, it could be seen, that combined diffs still cost more than an order of magnitude more cpu time, compared to diff for usual commits, and that would only be an optimistic estimate, if we take into account that for e.g. linux.git there is only one merge for several dozens of plain commits. That slowness comes from the fact that currently, while generating combined diff, a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). 
That's because at present, to compute combine-diff, for first finding paths, that "every parent touches", we use the following combine-diff property/definition: D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn) (w.r.t. paths) where D(A,P1...Pn) is combined diff between commit A, and parents Pi and D(A,Pi) is usual two-tree diff Pi..A So if any of that D(A,Pi) is huge, tracting 1 n-parent combine-diff as n 1-parent diffs and intersecting results will be slow. And usually, for linux.git and other topic-based workflows, that D(A,P2) is huge, because, if merge-base of A and P2, is several dozens of merges (from A, via first parent) below, that D(A,P2) will be diffing sum of merges from several subsystems to 1 subsystem. The solution is to avoid computing n 1-parent diffs, and to find changed-to-all-parents paths via scanning A's and all Pi's trees simultaneously, at each step comparing their entries, and based on that comparison, populate paths result, and deduce we could skip recursing into subdirectories, if at least for 1 parent, sha1 of that dir tree is the same as in A. That would save us from doing significant amount of needless work. Such approach is very similar to what diff_tree() does, only there we deal with scanning only 2 trees simultaneously, and for n+1 tree, the logic is a bit more complex: D(T,P1...Pn) calculation scheme ------------------------------- D(T,P1...Pn) = D(T,P1) ^ ... ^ D(T,Pn) (regarding resulting paths set) D(T,Pj) - diff between T..Pj D(T,P1...Pn) - combined diff from T to parents P1,...,Pn We start from all trees, which are sorted, and compare their entries in lock-step: T P1 Pn - - - \|t\| \|p1\| \|pn\| \|-\| \|--\| ... \|--\| imin = argmin(p1...pn) \| \| \| \| \| \| \|-\| \|--\| \|--\| \|.\| \|. \| \|. \| . . . . . . at any time there could be 3 cases: 1) t < p[imin]; 2) t > p[imin]; 3) t = p[imin]. 
Schematic deduction of what every case means, and what to do, follows: 1) t < p[imin] -> ∀j t ∉ Pj -> "+t" ∈ D(T,Pj) -> D += "+t"; t↓ 2) t > p[imin] 2.1) ∃j: pj > p[imin] -> "-p[imin]" ∉ D(T,Pj) -> D += ø; ∀ pi=p[imin] pi↓ 2.2) ∀i pi = p[imin] -> pi ∉ T -> "-pi" ∈ D(T,Pi) -> D += "-p[imin]"; ∀i pi↓ 3) t = p[imin] 3.1) ∃j: pj > p[imin] -> "+t" ∈ D(T,Pj) -> only pi=p[imin] remains to investigate 3.2) pi = p[imin] -> investigate δ(t,pi) \| \| v 3.1+3.2) looking at δ(t,pi) ∀i: pi=p[imin] - if all != ø -> ⎧δ(t,pi) - if pi=p[imin] -> D += ⎨ ⎩"+t" - if pi>p[imin] in any case t↓ ∀ pi=p[imin] pi↓ ~ For comparison, here is how diff_tree() works: D(A,B) calculation scheme ------------------------- A B - - \|a\| \|b\| a < b -> a ∉ B -> D(A,B) += +a a↓ \|-\| \|-\| a > b -> b ∉ A -> D(A,B) += -b b↓ \| \| \| \| a = b -> investigate δ(a,b) a↓ b↓ \|-\| \|-\| \|.\| \|.\| . . . . ~~~~~~~~ This patch generalizes diff tree-walker to work with arbitrary number of parents as described above - i.e. now there is a resulting tree t, and some parents trees tp[i] i=[0..nparent). The generalization builds on the fact that usual diff D(A,B) is by definition the same as combined diff D(A,[B]), so if we could rework the code for common case and make it be not slower for nparent=1 case, usual diff(t1,t2) generation will not be slower, and multiparent diff tree-walker would greatly benefit generating combine-diff. What we do is as follows: 1) diff tree-walker ll_diff_tree_sha1() is internally reworked to be a paths generator (new name diff_tree_paths()), with each generated path being `struct combine_diff_path` with info for path, new sha1,mode and for every parent which sha1,mode it was in it. 2) From that info, we can still generate usual diff queue with struct diff_filepairs, via "exporting" generated combine_diff_path, if we know we run for nparent=1 case. 
(see emit_diff() which is now named emit_diff_first_parent_only()) 3) In order for diff_can_quit_early(), which checks DIFF_OPT_TST(opt, HAS_CHANGES)) to work, that exporting have to be happening not in bulk, but incrementally, one diff path at a time. For such consumers, there is a new callback in diff_options introduced: ->pathchange(opt, struct combine_diff_path *) which, if set to !NULL, is called for every generated path. (see new compat ll_diff_tree_sha1() wrapper around new paths generator for setup) 4) The paths generation itself, is reworked from previous ll_diff_tree_sha1() code according to "D(A,P1...Pn) calculation scheme" provided above: On the start we allocate [nparent] arrays in place what was earlier just for one parent tree. then we just generalize loops, and comparison according to the algorithm. Some notes(*): 1) alloca(), for small arrays, is used for "runs not slower for nparent=1 case than before" goal - if we change it to xmalloc()/free() the timings get ~1% worse. For alloca() we use just-introduced xalloca/xalloca_free compatibility wrappers, so it should not be a portability problem. 2) For every parent tree, we need to keep a tag, whether entry from that parent equals to entry from minimal parent. For performance reasons I'm keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ. Not doing so, we'd need to alloca another [nparent] array, which hurts performance. 3) For emitted paths, memory could be reused, if we know the path was processed via callback and will not be needed later. We use efficient hand-made realloc-style path_appendnew(), that saves us from ~1-1.5% of potential additional slowdown. 4) goto(s) are used in several places, as the code executes a little bit faster with lowered register pressure. Also - we should now check for FIND_COPIES_HARDER not only when two entries names are the same, and their hashes are equal, but also for a case, when a path was removed from some of all parents having it. 
The reason is, if we don't, that path won't be emitted at all (see "a > xi" case), and we'll just skip it, and FIND_COPIES_HARDER wants all paths - with diff or without - to be emitted, to be later analyzed for being copies sources. The new check is only necessary for nparent >1, as for nparent=1 case xmin_eqtotal always =1 =nparent, and a path is always added to diff as removal. ~~~~~~~~ Timings for # without -c, i.e. testing only nparent=1 case `git log --raw --no-abbrev --no-renames` before and after the patch are as follows: navy.git linux.git v3.10..v3.11 before 0.611s 1.889s after 0.619s 1.907s slowdown 1.3% 0.9% This timings show we did no harm to usual diff(tree1,tree2) generation. From the table we can see that we actually did ~1% slowdown, but I think I've "earned" that 1% in the previous patch ("tree-diff: reuse base str(buf) memory on sub-tree recursion", HEAD~~) so for nparent=1 case, net timings stays approximately the same. The output also stayed the same. (*) If we revert 1)-4) to more usual techniques, for nparent=1 case, we'll get ~2-2.5% of additional slowdown, which I've tried to avoid, as "do no harm for nparent=1 case" rule. For linux.git, combined diff will run an order of magnitude faster and appropriate timings will be provided in the next commit, as we'll be taking advantage of the new diff tree-walker for combined-diff generation there. P.S. and combined diff is not some exotic/for-play-only stuff - for example for a program I write to represent Git archives as readonly filesystem, there is initial scan with `git log --reverse --raw --no-abbrev --no-renames -c` to extract log of what was created/changed when, as a result building a map {} sha1 -> in which commit (and date) a content was added that `-c` means also show combined diff for merges, and without them, if a merge is non-trivial (merges changes from two parents with both having separate changes to a file), or an evil one, the map will not be full, i.e. 
some valid sha1 would be absent from it. That case was my initial motivation for combined diffs speedup. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-07 14:40:46 -07:00
Justin Lebar	01689909eb	comments: fix misuses of "nor" Signed-off-by: Justin Lebar <jlebar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-03-31 15:29:27 -07:00
Junio C Hamano	a5aca6e883	Merge branch 'tr/diff-submodule-no-reuse-worktree' into maint "git diff --external-diff" incorrectly fed the submodule directory in the working tree to the external diff driver when it knew it is the same as one of the versions being compared. * tr/diff-submodule-no-reuse-worktree: diff: do not reuse_worktree_file for submodules	2014-03-18 14:03:41 -07:00
Junio C Hamano	34120a5fb5	Merge branch 'nd/diff-quiet-stat-dirty' into maint "git diff --quiet -- pathspec1 pathspec2" sometimes did not return correct status value. * nd/diff-quiet-stat-dirty: diff: do not quit early on stat-dirty files diff.c: move diffcore_skip_stat_unmatch core logic out for reuse later	2014-03-18 13:59:56 -07:00
Junio C Hamano	6f75e48323	Merge branch 'rm/strchrnul-not-strlen' * rm/strchrnul-not-strlen: use strchrnul() in place of strchr() and strlen()	2014-03-18 13:51:18 -07:00
Junio C Hamano	fe9122a352	Merge branch 'dd/use-alloc-grow' Replace open-coded reallocation with ALLOC_GROW() macro. * dd/use-alloc-grow: sha1_file.c: use ALLOC_GROW() in pretend_sha1_file() read-cache.c: use ALLOC_GROW() in add_index_entry() builtin/mktree.c: use ALLOC_GROW() in append_to_tree() attr.c: use ALLOC_GROW() in handle_attr_line() dir.c: use ALLOC_GROW() in create_simplify() reflog-walk.c: use ALLOC_GROW() replace_object.c: use ALLOC_GROW() in register_replace_object() patch-ids.c: use ALLOC_GROW() in add_commit() diffcore-rename.c: use ALLOC_GROW() diff.c: use ALLOC_GROW() commit.c: use ALLOC_GROW() in register_commit_graft() cache-tree.c: use ALLOC_GROW() in find_subtree() bundle.c: use ALLOC_GROW() in add_to_ref_list() builtin/pack-objects.c: use ALLOC_GROW() in check_pbase_path()	2014-03-18 13:50:21 -07:00
Junio C Hamano	481e6aaacc	Merge branch 'tr/diff-submodule-no-reuse-worktree' "git diff --external-diff" incorrectly fed the submodule directory in the working tree to the external diff driver when it knew it is the same as one of the versions being compared. * tr/diff-submodule-no-reuse-worktree: diff: do not reuse_worktree_file for submodules	2014-03-14 14:25:20 -07:00
Rohit Mani	2c5495f7b6	use strchrnul() in place of strchr() and strlen() Avoid scanning strings twice, once with strchr() and then with strlen(), by using strchrnul(). Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Rohit Mani <rohit.mani@outlook.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-03-10 08:35:30 -07:00
Junio C Hamano	2687ffdeb7	Merge branch 'jc/hold-diff-remove-q-synonym-for-no-deletion' Remove a confusing and deprecated "-q" option from "git diff-files"; "git diff-files --diff-filter=d" can be used instead.	2014-03-07 15:17:41 -08:00
Dmitry S. Dolzhenko	4c960a432c	diff.c: use ALLOC_GROW() Use ALLOC_GROW() instead of open-coding it in diffstat_add() and diff_q(). Signed-off-by: Dmitry S. Dolzhenko <dmitrys.dolzhenko@yandex.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-03-03 14:48:39 -08:00
Junio C Hamano	1e745453fe	Merge branch 'nd/diff-quiet-stat-dirty' "git diff --quiet -- pathspec1 pathspec2" sometimes did not return correct status value. * nd/diff-quiet-stat-dirty: diff: do not quit early on stat-dirty files diff.c: move diffcore_skip_stat_unmatch core logic out for reuse later	2014-02-27 14:01:21 -08:00
Nguyễn Thái Ngọc Duy	f34b205f6c	diff: do not quit early on stat-dirty files When QUICK is set (i.e. with --quiet) we try to do as little work as possible, stopping after seeing the first change. stat-dirty is considered a "change" but it may turn out not, if no actual content is changed. The actual content test is performed too late in the process and the shortcut may be taken prematurely, leading to incorrect return code. Assume we do "git diff --quiet". If we have a stat-dirty file "a" and a really dirty file "b". We break the loop in run_diff_files() and stop after "a" because we have got a "change". Later in diffcore_skip_stat_unmatch() we find out "a" is actually not changed. But there's nothing else in the diff queue, we incorrectly declare "no change", ignoring the fact that "b" is changed. This also happens to "git diff --quiet HEAD" when it hits diff_can_quit_early() in oneway_diff(). This patch does the content test earlier in order to keep going if "a" is unchanged. The test result is cached so that when diffcore_skip_stat_unmatch() is done in the end, we spend no cycles on re-testing "a". Reported-by: IWAMOTO Toshihiro <iwamoto@valinux.co.jp> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-24 14:50:14 -08:00
Nguyễn Thái Ngọc Duy	fceb907225	diff.c: move diffcore_skip_stat_unmatch core logic out for reuse later Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-24 14:50:03 -08:00
Thomas Rast	aba4727281	diff: do not reuse_worktree_file for submodules The GIT_EXTERNAL_DIFF calling code attempts to reuse existing worktree files for the worktree side of diffs, for performance reasons. However, that code also tries to do the same with submodules. This results in calls to $GIT_EXTERNAL_DIFF where the old-file is a file of the form "Submodule commit $sha1", but the new-file is a directory in the worktree. Fix it by never reusing a worktree "file" in the submodule case. Reported-by: Grégory Pakosz <gregory.pakosz@gmail.com> Signed-off-by: Thomas Rast <tr@thomasrast.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-18 12:06:08 -08:00
Junio C Hamano	e049109ef1	Merge branch 'jk/diff-filespec-cleanup' * jk/diff-filespec-cleanup: diff_filespec: use only 2 bits for is_binary flag diff_filespec: reorder is_binary field diff_filespec: drop xfrm_flags field diff_filespec: drop funcname_pattern_ident field diff_filespec: reorder dirty_submodule macro definitions	2014-01-27 10:45:03 -08:00
Jeff King	428d52a5a5	diff_filespec: drop xfrm_flags field The only mention of this field in the code is by some debugging code which prints it out (and it will always be zero, since we never touch it otherwise). It was obsoleted very early on by `25d5ea4` ([PATCH] Redo rename/copy detection logic., 2005-05-24). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-17 10:50:11 -08:00
Junio C Hamano	2da5cbd651	Merge branch 'sb/diff-orderfile-config' Allow "git diff -O<file>" to be configured with a new configuration variable. * sb/diff-orderfile-config: diff: add diff.orderfile configuration variable diff: let "git diff -O" read orderfile from any file and fail properly t4056: add new tests for "git diff -O"	2014-01-10 10:32:42 -08:00
Junio C Hamano	6904f9aa5b	Merge branch 'zk/difftool-counts' Show the total number of paths and the number of paths shown so far when "git difftool" prompts to launch an external diff tool, which would give users some sense of progress. * zk/difftool-counts: diff.c: fix some recent whitespace style violations difftool: display the number of files in the diff queue in the prompt	2013-12-27 14:58:13 -08:00
Samuel Bronson	6d8940b562	diff: add diff.orderfile configuration variable diff.orderfile acts as a default for the -O command line option. [sb: split up aw's original patch; rework tests and docs, treat option as pathname] Signed-off-by: Anders Waldenborg <anders@0x63.nu> Signed-off-by: Samuel Bronson <naesten@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-18 16:39:00 -08:00
Junio C Hamano	ad70448576	Merge branch 'cc/starts-n-ends-with' Remove a few duplicate implementations of prefix/suffix comparison functions, and rename them to starts_with and ends_with. * cc/starts-n-ends-with: replace {pre,suf}fixcmp() with {starts,ends}_with() strbuf: introduce starts_with() and ends_with() builtin/remote: remove postfixcmp() and use suffixcmp() instead environment: normalize use of prefixcmp() by removing " != 0"	2013-12-17 12:02:44 -08:00
Jeff King	0ea7d5b6f8	diff.c: fix some recent whitespace style violations These were introduced by `ee7fb0b`. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-16 13:04:47 -08:00
Zoltan Klinger	ee7fb0b1d4	difftool: display the number of files in the diff queue in the prompt When --prompt option is set, git-difftool displays a prompt for each modified file to be viewed in an external diff program. At that point, it could be useful to display a counter and the total number of files in the diff queue. Below is the current difftool prompt for the first of 5 modified files: Viewing: 'diff.c' Launch 'vimdiff' [Y/n]: Consider the modified prompt: Viewing (1/5): 'diff.c' Launch 'vimdiff' [Y/n]: The current GIT_EXTERNAL_DIFF mechanism does not tell the number of paths in the diff queue nor the current counter. To make this "counter/total" info available for GIT_EXTERNAL_DIFF programs without breaking existing ones by doing the following: - Keep track of the number of paths shown so far in diff_options; - Export two new environment variables from run_external_diff() to show the total number of paths (from diff_queue_struct) and the current value of the counter (from diff_options); and - Update git-difftool--helper to use these two environment variables. Signed-off-by: Zoltan Klinger <zoltan.klinger@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-06 14:00:27 -08:00
Christian Couder	5955654823	replace {pre,suf}fixcmp() with {starts,ends}_with() Leaving only the function definitions and declarations so that any new topic in flight can still make use of the old functions, replace existing uses of the prefixcmp() and suffixcmp() with new API functions. The change can be recreated by mechanically applying this: $ git grep -l -e prefixcmp -e suffixcmp -- \*.c | grep -v strbuf\\.c | xargs perl -pi -e ' s|!prefixcmp\(|starts_with\(|g; s|prefixcmp\(|!starts_with\(|g; s|!suffixcmp\(|ends_with\(|g; s|suffixcmp\(|!ends_with\(|g; ' on the result of preparatory changes in this series. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-05 14:13:21 -08:00
Nicolas Vigier	b0d12fc9b2	Use the word 'stuck' instead of 'sticked' The past participle of 'stick' is 'stuck'. Signed-off-by: Nicolas Vigier <boklm@mars-attacks.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-10-31 15:47:38 -07:00
Junio C Hamano	4197361e39	Merge branch 'mg/more-textconv' Make "git grep" and "git show" pay attention to --textconv when dealing with blob objects. * mg/more-textconv: grep: honor --textconv for the case rev:path grep: allow to use textconv filters t7008: demonstrate behavior of grep with textconv cat-file: do not die on --textconv without textconv filters show: honor --textconv for blobs diff_opt: track whether flags have been set explicitly t4030: demonstrate behavior of show with textconv	2013-10-23 13:21:31 -07:00
Junio C Hamano	01a2a03c56	Merge branch 'jc/diff-filter-negation' Teach "git diff --diff-filter" to express "I do not want to see these classes of changes" more directly by listing only the unwanted ones in lowercase (e.g. "--diff-filter=d" will show everything but deletion) and deprecate "diff-files -q" which did the same thing as "--diff-filter=d". * jc/diff-filter-negation: diff: deprecate -q option to diff-files diff: allow lowercase letter to specify what change class to exclude diff: reject unknown change class given to --diff-filter diff: preparse --diff-filter string argument diff: factor out match_filter() diff: pass the whole diff_options to diffcore_apply_filter()	2013-09-09 14:28:35 -07:00
Stefan Beller	3b0c18af5c	diff: fix a possible null pointer dereference The condition in the ternary operator was wrong, hence the wrong char pointer could be used as the parameter for show_submodule_summary. one->path may be null, but we definitely need a non null path given to the function. Signed-off-by: Stefan Beller <stefanbeller@googlemail.com> Acked-By: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-08-09 12:07:36 -07:00
Stefan Beller	c189c4f2c4	diff: remove ternary operator evaluating always to true The line being changed is deep inside the function builtin_diff. The variable name_b, which is used to evaluate the ternary expression must evaluate to true at that position, hence the replacement with just name_b. The name_b variable only occurs a few times in that lengthy function: As a parameter to the function itself: static void builtin_diff(const char *name_a, const char *name_b, ... The next occurrences are at: /* Never use a non-valid filename anywhere if at all possible */ name_a = DIFF_FILE_VALID(one) ? name_a : name_b; name_b = DIFF_FILE_VALID(two) ? name_b : name_a; a_one = quote_two(a_prefix, name_a + (*name_a == '/')); b_two = quote_two(b_prefix, name_b + (*name_b == '/')); In the last line of this block 'name_b' is dereferenced and compared to '/'. This would crash if name_b was NULL. Hence in the following code we can assume name_b being non-null. The next occurrence is just as a function argument, which doesn't change the memory, which name_b points to, so the assumption name_b being not null still holds: emit_rewrite_diff(name_a, name_b, one, two, textconv_one, textconv_two, o); The next occurrence would be the line of this patch. As name_b still must be not null, we can remove the ternary operator. Inside the emit_rewrite_diff function there is also a line ecbdata.ws_rule = whitespace_rule(name_b ? name_b : name_a); which was also simplified as there is also a dereference before the ternary operator. Signed-off-by: Stefan Beller <stefanbeller@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-08-09 12:05:16 -07:00
Junio C Hamano	0def7126fd	Merge branch 'ob/typofixes' * ob/typofixes: typofix: in-code comments typofix: documentation typofix: release notes	2013-07-24 19:23:01 -07:00
Junio C Hamano	0c544a22f9	Merge branch 'sb/misc-fixes' Assorted code cleanups and a minor fix. * sb/misc-fixes: diff.c: Do not initialize a variable, which gets reassigned anyway. commit: Fix a memory leak in determine_author_info daemon.c:handle: Remove unneeded check for null pointer.	2013-07-24 19:20:59 -07:00
Ondřej Bílka	749f763dbb	typofix: in-code comments Signed-off-by: Ondřej Bílka <neleai@seznam.cz> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-22 16:06:49 -07:00
Junio C Hamano	d3aeb31dc4	Merge branch 'nd/const-struct-cache-entry' * nd/const-struct-cache-entry: Convert "struct cache_entry *" to "const ..." wherever possible	2013-07-22 11:24:01 -07:00
Junio C Hamano	e2ecd252b5	Merge branch 'mm/diff-no-patch-synonym-to-s' "git show -s" was less discoverable than it should be. * mm/diff-no-patch-synonym-to-s: Documentation/git-log.txt: capitalize section names Documentation: move description of -s, --no-patch to diff-options.txt Documentation/git-show.txt: include common diff options, like git-log.txt diff: allow --patch & cie to override -s/--no-patch diff: allow --no-patch as synonym for -s t4000-diff-format.sh: modernize style	2013-07-22 11:23:27 -07:00
Junio C Hamano	c48f6816f0	diff: remove "diff-files -q" in a version of Git in a distant future This was inherited from "show-diff -q" that was invented to tell comparison between the index and the working tree to ignore only removals in 2005. These days, it is spelled as "--diff-filter=d". Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-19 15:22:29 -07:00
Junio C Hamano	95a7c546b0	diff: deprecate -q option to diff-files This reimplements the ancient "-q" option to "git diff-files" that was inherited from "show-diff -q" in terms of "--diff-filter=d". We will be deprecating the "-q" option, so let's issue a warning when we do so. Incidentally this also tentatively fixes "git diff --no-index" to honor "-q" and hide deletions; the user will get the same warning. We should remove the support for "-q" in a future version but it is not that urgent. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-19 15:20:47 -07:00
Matthieu Moy	71482d389d	diff: allow --patch & cie to override -s/--no-patch All options that trigger a patch output now override --no-patch. The case of --binary deserves extra attention: the name may suggest that it turns a normal patch into a binary patch, but it actually already enables patch output when normally disabled (e.g. "git log --binary" displays a patch), hence it makes sense for "git show --no-patch --binary" to display the binary patch. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-17 17:50:56 -07:00
Matthieu Moy	d09cd15d19	diff: allow --no-patch as synonym for -s This follows the usual convention of having a --no-foo option to negate --foo. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-17 17:50:56 -07:00
Junio C Hamano	7f2ea5f0f2	diff: allow lowercase letter to specify what change class to exclude In order to express "we do not care about deletions", we had to say "--diff-filter=ACMRTXUB", giving all the possible change class except for the one we do not want, "D". This is cumbersome. As all the change classes are in uppercase, allow their lowercase counterpart to selectively exclude the class from the output. When such a negated change class is in the input, start the filter option with the full bits set. This would allow us to express the old "show-diff -q" with "git diff-files --diff-filter=d". Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-17 17:17:39 -07:00
Junio C Hamano	bf142ec434	diff: reject unknown change class given to --diff-filter We used to accept "git diff --diff-filter=Q" (note that there is no such change class 'Q') silently and showed no output (because there is no such change class 'Q'). Error out when such an input is given. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-17 16:24:14 -07:00
Junio C Hamano	1ecc1cbd3a	diff: preparse --diff-filter string argument Instead of running strchr() on the list of status characters over and over again, parse the --diff-filter option into bitfields and use the bits to see if the change to the filepair matches the status requested. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-17 16:23:34 -07:00
Junio C Hamano	08578fa13e	diff: factor out match_filter() diffcore_apply_filter() checks if a filepair matches the filter given with the "--diff-filter" option for each input filepairs with a fairly complex expression in two places. Create a helper function and call it. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-17 15:09:34 -07:00
Junio C Hamano	949226fe77	diff: pass the whole diff_options to diffcore_apply_filter() The --diff-filter=<arg> option given by the user is kept as a string, and passed to the underlying diffcore_apply_filter() function as a string for each resulting path we run number of strchr() to see if each class of change among ACDMRTXUB is meant to be given. Change the function signature to pass the whole diff_options, so that we can pre-parse this string in the next patch. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-17 14:19:24 -07:00
Stefan Beller	d3c9cf32ca	diff.c: Do not initialize a variable, which gets reassigned anyway. Signed-off-by: Stefan Beller <stefanbeller@googlemail.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 09:45:21 -07:00
Junio C Hamano	77f3c3f174	Merge branch 'jc/maint-diff-core-safecrlf' "git diff" refused to even show difference when core.safecrlf is set to true (i.e. error out) and there are offending lines in the working tree files. * jc/maint-diff-core-safecrlf: diff: demote core.safecrlf=true to core.safecrlf=warn	2013-07-11 13:05:45 -07:00
Nguyễn Thái Ngọc Duy	9c5e6c802c	Convert "struct cache_entry *" to "const ..." wherever possible I attempted to make index_state->cache[] a "const struct cache_entry **" to find out how existing entries in index are modified and where. The question I have is what do we do if we really need to keep track of on-disk changes in the index. The result is - diff-lib.c: setting CE_UPTODATE - name-hash.c: setting CE_HASHED - preload-index.c, read-cache.c, unpack-trees.c and builtin/update-index: obvious - entry.c: write_entry() may refresh the checked out entry via fill_stat_cache_info(). This causes "non-const struct cache_entry *" in builtin/apply.c, builtin/checkout-index.c and builtin/checkout.c - builtin/ls-files.c: --with-tree changes stagemask and may set CE_UPDATE Of these, write_entry() and its call sites are probably most interesting because it modifies on-disk info. But this is stat info and can be retrieved via refresh, at least for porcelain commands. Other just uses ce_flags for local purposes. So, keeping track of "dirty" entries is just a matter of setting a flag in index modification functions exposed by read-cache.c. Except unpack-trees, the rest of the code base does not do anything funny behind read-cache's back. The actual patch is less valuable than the summary above. But if anyone wants to re-identify the above sites. Applying this patch, then this: diff --git a/cache.h b/cache.h index 430d021..1692891 100644 --- a/cache.h +++ b/cache.h @@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode) #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1) struct index_state { - struct cache_entry **cache; + const struct cache_entry **cache; unsigned int version; unsigned int cache_nr, cache_alloc, cache_changed; struct string_list *resolve_undo; will help quickly identify them without bogus warnings. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-09 09:12:48 -07:00
Junio C Hamano	5430bb283b	diff: demote core.safecrlf=true to core.safecrlf=warn Otherwise the user will not be able to start to guess where in the contents in the working tree the offending unsafe CR lies. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-06-25 13:55:03 -07:00
Antoine Pelisse	36617af7ed	diff: add --ignore-blank-lines option The goal of the patch is to introduce the GNU diff -B/--ignore-blank-lines as closely as possible. The short option is not available because it's already used for "break-rewrites". When this option is used, git-diff will not create hunks that simply add or remove empty lines, but will still show empty lines addition/suppression if they are close enough to "valuable" changes. There are two differences between this option and GNU diff -B option: - GNU diff doesn't have "--inter-hunk-context", so this must be handled - The following sequence looks like a bug (context is displayed twice): $ seq 5 >file1 $ cat <<EOF >file2 change 1 2 3 4 5 change EOF $ diff -u -B file1 file2 --- file1 2013-06-08 22:13:04.471517834 +0200 +++ file2 2013-06-08 22:13:23.275517855 +0200 @@ -1,5 +1,7 @@ +change 1 2 + 3 4 5 @@ -3,3 +5,4 @@ 3 4 5 +change So here is a more thorough description of the option: - real changes are interesting - blank lines that are close enough (less than context size) to interesting changes are considered interesting (recursive definition) - "context" lines are used around each hunk of interesting changes - If two hunks are separated by less than "inter-hunk-context", they will be merged into one. The implementation does the "interesting changes selection" in a single pass. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-06-19 15:17:45 -07:00
Junio C Hamano	6c374008b1	diff_opt: track whether flags have been set explicitly The diff_opt infrastructure sets flags based on defaults and command line options. It is impossible to tell whether a flag has been set as a default or on explicit request. Update the structure so that this detection is possible: * Add an extra "opt->touched_flags" that keeps track of all the fields that have been touched by DIFF_OPT_SET and DIFF_OPT_CLR. * You may continue setting the default values to the flags, like commands in the "log" family do in cmd_log_init_defaults(), but after you finished setting the defaults, you clear the touched_flags field; * And then you let the usual callchain call diff_opt_parse(), allowing the opt->flags be set or unset, while keeping track of which bits the user touched; * There is an optional callback "opt->set_default" that is called at the very beginning to let you inspect touched_flags and update opt->flags appropriately, before the remainder of the diffcore machinery is set up, taking the opt->flags value into account. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-05-10 10:24:17 -07:00
Junio C Hamano	e4d15959d4	Merge branch 'jk/diff-algo-finishing-touches' into maint "git diff --diff-algorithm=algo" was understood by the command line parser, but "git diff --diff-algorithm algo" was not. * jk/diff-algo-finishing-touches: diff: allow unstuck arguments with --diff-algorithm git-merge(1): document diff-algorithm option to merge-recursive	2013-04-24 16:19:42 -07:00
Junio C Hamano	f678d9b592	Merge branch 'jk/diff-graph-submodule-summary' Make "git diff --graph" work better with submodule log output. * jk/diff-graph-submodule-summary: submodule: print graph output next to submodule log	2013-04-15 12:41:01 -07:00
Junio C Hamano	825ccfc23c	Merge branch 'jk/diff-algo-finishing-touches' "git diff --diff-algorithm algo" is also understood as "git diff --diff-algorithm=algo". * jk/diff-algo-finishing-touches: diff: allow unstuck arguments with --diff-algorithm git-merge(1): document diff-algorithm option to merge-recursive	2013-04-15 12:40:58 -07:00
Stefano Lattarini	41ccfdd9c9	Correct common spelling mistakes in comments and tests Most of these were found using Lucas De Marchi's codespell tool. Signed-off-by: Stefano Lattarini <stefano.lattarini@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-12 13:38:40 -07:00
John Keeping	0f33a0677d	submodule: print graph output next to submodule log When running "git log -p --submodule=log", the submodule log is not indented by the graph output, although all other lines are. Fix this by prepending the current line prefix to each line of the submodule log. Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-05 11:28:10 -07:00
John Keeping	0895c6d4c0	diff: allow unstuck arguments with --diff-algorithm The argument to --diff-algorithm is mandatory, so there is no reason to require the argument to be stuck to the option with '='. Change this for consistency with other Git commands. Note that this does not change the handling of diff-algorithm in merge-recursive.c since the primary interface to that is via the -X option to 'git merge' where the unstuck form does not make sense. Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-05 11:01:08 -07:00
Junio C Hamano	b76a9e1648	Merge branch 'ap/maint-diff-rename-avoid-overlap' into maint * ap/maint-diff-rename-avoid-overlap: tests: make sure rename pretty print works diff: prevent pprint_rename from underrunning input diff: Fix rename pretty-print when suffix and prefix overlap	2013-04-01 09:19:47 -07:00
Junio C Hamano	caf217a3b8	Merge branch 'ap/maint-diff-rename-avoid-overlap' The logic used by "git diff -M --stat" to shorten the names of files before and after a rename did not work correctly when the common prefix and suffix between the two filenames overlapped. * ap/maint-diff-rename-avoid-overlap: tests: make sure rename pretty print works diff: prevent pprint_rename from underrunning input diff: Fix rename pretty-print when suffix and prefix overlap	2013-03-25 14:00:37 -07:00
Max Nanasy	c9fc4415e2	diff.c: diff.renamelimit => diff.renameLimit in message In the warning message printed when rename or unmodified copy detection was skipped due to too many files, change "diff.renamelimit" to "diff.renameLimit", in order to make it consistent with git documentation, which consistently uses "diff.renameLimit". Signed-off-by: Max Nanasy <max.nanasy@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-03-21 14:06:49 -07:00
Thomas Rast	dd281f09b7	diff: prevent pprint_rename from underrunning input The logic described in `d020e27` (diff: Fix rename pretty-print when suffix and prefix overlap, 2013-02-23) is wrong: The proof in the comment is valid only if both strings are the same length. One of old/new can reach a-1 (b-1, resp.) if 'a' is a suffix of 'b' (or vice versa). Since the intent was to let the loop run down to the '/' at the end of the common prefix, fix it by making that distinction explicit: if there is no prefix, allow no underrun. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-02-26 13:01:34 -08:00
Antoine Pelisse	d020e27fda	diff: Fix rename pretty-print when suffix and prefix overlap When considering a rename for two files that have a suffix and a prefix that can overlap, a confusing line is shown. As an example, renaming "a/b/b/c" to "a/b/c" shows "a/b/{ => }/b/c". Currently, what we do is calculate the common prefix ("a/b/"), and the common suffix ("/b/c"), but the same "/b/" is actually counted both in prefix and suffix. Then when calculating the size of the non-common part, we end-up with a negative value which is reset to 0, thus the "{ => }". Do not allow the common suffix to overlap the common prefix and stop when reaching a "/" that would be in both. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-02-23 23:52:39 -08:00
Junio C Hamano	abea4dc76a	Merge branch 'mp/diff-algo-config' Add diff.algorithm configuration so that the user does not type "diff --histogram". * mp/diff-algo-config: diff: Introduce --diff-algorithm command line option config: Introduce diff.algorithm variable git-completion.bash: Autocomplete --minimal and --histogram for git-diff	2013-02-17 15:25:52 -08:00
Junio C Hamano	a1d68bea89	Merge branch 'jk/diff-graph-cleanup' Refactors a lot of repetitive code sequence from the graph drawing code and adds it to the combined diff output. * jk/diff-graph-cleanup: combine-diff.c: teach combined diffs about line prefix diff.c: use diff_line_prefix() where applicable diff: add diff_line_prefix function diff.c: make constant string arguments const diff: write prefix to the correct file graph: output padding for merge subsequent parents	2013-02-14 10:29:59 -08:00
John Keeping	30997bb8f1	diff.c: use diff_line_prefix() where applicable Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-02-12 11:42:07 -08:00
John Keeping	f192223447	diff: add diff_line_prefix function This is a helper function to call the diff output_prefix function and return its value as a C string, allowing us to greatly simplify everywhere that needs to get the output prefix. Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-02-12 11:42:07 -08:00
John Keeping	32b367e444	diff.c: make constant string arguments const Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-02-12 11:42:07 -08:00
John Keeping	3bf25c23cd	diff: write prefix to the correct file Write the prefix for an output line to the same file as the actual content. Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-02-12 11:42:07 -08:00
Michal Privoznik	07924d4d50	diff: Introduce --diff-algorithm command line option Since command line options have higher priority than config file variables and taking previous commit into account, we need a way how to specify myers algorithm on command line. However, inventing `--myers` is not the right answer. We need far more general option, and that is `--diff-algorithm`. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-01-16 09:41:18 -08:00
Michal Privoznik	07ab4dec80	config: Introduce diff.algorithm variable Some users or projects prefer different algorithms over others, e.g. patience over myers or similar. However, specifying appropriate argument every time diff is to be used is impractical. Moreover, creating an alias doesn't play nicely with other tools based on diff (git-show for instance). Hence, a configuration variable which is able to set specific algorithm is needed. For now, these four values are accepted: 'myers' (which has the same effect as not setting the config variable at all), 'minimal', 'patience' and 'histogram'. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-01-16 09:37:45 -08:00
Junio C Hamano	90d0b8a9f0	Merge branch 'jc/blame-no-follow' Teaches "--no-follow" option to "git blame" to disable its whole-file rename detection. * jc/blame-no-follow: blame: pay attention to --no-follow diff: accept --no-follow option	2013-01-14 08:15:51 -08:00
Junio C Hamano	a4eab8f38e	Merge branch 'lt/diff-stat-show-0-lines' "git diff --stat" miscounted the total number of changed lines when binary files were involved and hidden beyond --stat-count. It also miscounted the total number of changed files when there were unmerged paths. * lt/diff-stat-show-0-lines: t4049: refocus tests diff --shortstat: do not count "unmerged" entries diff --stat: do not count "unmerged" entries diff --stat: move the "total count" logic to the last loop diff --stat: use "file" temporary variable to refer to data->files[i] diff --stat: status of unmodified pair in diff-q is not zero test: add failing tests for "diff --stat" to t4049	2012-11-29 12:53:54 -08:00
Junio C Hamano	20c8cde456	diff --shortstat: do not count "unmerged" entries Fix the same issue as the previous one for "git diff --stat"; unmerged entries was doubly-counted. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-11-27 14:19:36 -08:00
Junio C Hamano	82dfc2c44e	diff --stat: do not count "unmerged" entries Even though we show a separate UNMERGED entry in the patch and diffstat output (or in the --raw format, for that matter) in addition to and separately from the diff against the specified stage (defaulting to #2) for unmerged paths, they should not be counted in the total number of files affected---that would lead to counting the same path twice. The separation done by the previous step makes this fix simple and straightforward. Among the filepairs in diff_queue, paths that weren't modified, and the extra "unmerged" entries do not count as total number of files. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-11-27 13:21:15 -08:00
Junio C Hamano	a20d3c0de1	diff --stat: move the "total count" logic to the last loop The diffstat generation logic, with --stat-count limit, is implemented as three loops. - The first counts the width necessary to show stats up to specified number of entries, and notes up to how many entries in the data we need to iterate to show the graph; - The second iterates that many times to draw the graph, adjusts the number of "total modified files", and counts the total added/deleted lines for the part that was shown in the graph; - The third iterates over the remainder and only does the part to count "total added/deleted lines" and to adjust "total modified files" without drawing anything. Move the logic to count added/deleted lines and modified files from the second loop to the third loop. This incidentally fixes a bug. The third loop was not filtering binary changes (counted in bytes) from the total added/deleted as it should. The second loop implemented this correctly, so if a binary change appeared earlier than the --stat-count cutoff, the code counted number of added/deleted lines correctly, but if it appeared beyond the cutoff, the number of lines would have mixed with the byte count in the buggy third loop. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-11-27 13:21:15 -08:00
Junio C Hamano	af0ed819c5	diff --stat: use "file" temporary variable to refer to data->files[i] The generated code shouldn't change but it is easier to read. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-11-27 13:21:15 -08:00
Junio C Hamano	99bfd40700	diff --stat: status of unmodified pair in diff-q is not zero It is spelled DIFF_STATUS_UNKNOWN these days, and is different from zero. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-11-27 13:21:15 -08:00
Junio C Hamano	be95387af2	Merge branch 'rr/submodule-diff-config' Allow "git diff --submodule=log" to set to be the default via configuration. * rr/submodule-diff-config: submodule: display summary header in bold diff: rename "set" variable diff: introduce diff.submodule configuration variable Documentation: move diff.wordRegex from config.txt to diff-config.txt	2012-11-25 18:44:50 -08:00
Junio C Hamano	76c39289ba	Merge branch 'lt/diff-stat-show-0-lines' We failed to mention a file without any content change but whose permission bit was modified, or (worse yet) a new file without any content in the "git diff --stat" output. * lt/diff-stat-show-0-lines: Fix "git diff --stat" for interesting - but empty - file changes	2012-11-25 18:44:06 -08:00
Ramkumar Ramachandra	4e215131d2	submodule: display summary header in bold Currently, 'git diff --submodule' displays output with a bold diff header for non-submodules. So this part is in bold: diff --git a/file1 b/file1 index 30b2f6c..2638038 100644 --- a/file1 +++ b/file1 For submodules, the header looks like this: Submodule submodule1 012b072..248d0fd: Unfortunately, it's easy to miss in the output because it's not bold. Change this. Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-11-18 19:18:13 -08:00
Jeff King	d9c552f17a	diff: rename "set" variable Once upon a time the builtin_diff function used one color, and the color variables were called "set" and "reset". Nowadays it is a much longer function and we use several colors (e.g., "add", "del"). Rename "set" to "meta" to show that it is the color for showing diff meta-info (it still does not indicate that it is a "color", but at least it matches the scheme of the other color variables). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-11-18 19:18:13 -08:00
Ramkumar Ramachandra	c47ef57caa	diff: introduce diff.submodule configuration variable Introduce a diff.submodule configuration variable corresponding to the '--submodule' command-line option of 'git diff'. Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-11-18 19:18:13 -08:00
Jeff King	19fb613695	Merge branch 'nd/builtin-to-libgit' Code cleanups so that libgit.a does not depend on anything in the builtin/ directory. * nd/builtin-to-libgit: fetch-pack: move core code to libgit.a fetch-pack: remove global (static) configuration variable "args" send-pack: move core code to libgit.a Move setup_diff_pager to libgit.a Move print_commit_list to libgit.a Move estimate_bisect_steps to libgit.a Move try_merge_command and checkout_fast_forward to libgit.a	2012-11-09 12:51:06 -05:00

1460 Commits