git-commit-vandalism

Author	SHA1	Message	Date
Stefan Beller	c1ddc4610c	diff: migrate diff_flags.pickaxe_ignore_case to a pickaxe_opts bit Currently flags for pickaxing are found in different places. Unify the flags into the `pickaxe_opts` field, which will contain any pickaxe related flags. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-04 15:02:40 -08:00
Stefan Beller	929ed70a72	diff.h: make pickaxe_opts an unsigned bit field This variable is used as a bit field[1], and as we are about to add more fields, indicate its usage as a bit field by making it unsigned. [1] containing the bits #define DIFF_PICKAXE_ALL 1 #define DIFF_PICKAXE_REGEX 2 #define DIFF_PICKAXE_KIND_S 4 #define DIFF_PICKAXE_KIND_G 8 Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-04 15:02:40 -08:00
Jonathan Tan	2477ab2ea8	diff: support anchoring line(s) Teach diff a new algorithm, one that attempts to prevent user-specified lines from appearing as a deletion or addition in the end result. The end user can use this by specifying "--anchored=<text>" one or more times when using Git commands like "diff" and "show". Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-28 10:40:04 +09:00
Junio C Hamano	8cc633286a	Merge branch 'bw/diff-opt-impl-to-bitfields' A single-word "unsigned flags" in the diff options is being split into a structure with many bitfields. * bw/diff-opt-impl-to-bitfields: diff: make struct diff_flags members lowercase diff: remove DIFF_OPT_CLR macro diff: remove DIFF_OPT_SET macro diff: remove DIFF_OPT_TST macro diff: remove touched flags diff: add flag to indicate textconv was set via cmdline diff: convert flags to be stored in bitfields add, reset: use DIFF_OPT_SET macro to set a diff flag	2017-11-09 14:31:27 +09:00
Junio C Hamano	f4c214b529	Merge branch 'jk/revision-pruning-optim' Pathspec-limited revision traversal was taught not to keep finding unneeded differences once it knows two trees are different inside given pathspec. * jk/revision-pruning-optim: revision: quit pruning diff more quickly when possible	2017-11-06 14:24:26 +09:00
Brandon Williams	0d1e0e7801	diff: make struct diff_flags members lowercase Now that the flags stored in struct diff_flags are being accessed directly and not through macros, change all struct members from being uppercase to lowercase. This conversion is done using the following semantic patch: @@ expression E; @@ - E.RECURSIVE + E.recursive @@ expression E; @@ - E.TREE_IN_RECURSIVE + E.tree_in_recursive @@ expression E; @@ - E.BINARY + E.binary @@ expression E; @@ - E.TEXT + E.text @@ expression E; @@ - E.FULL_INDEX + E.full_index @@ expression E; @@ - E.SILENT_ON_REMOVE + E.silent_on_remove @@ expression E; @@ - E.FIND_COPIES_HARDER + E.find_copies_harder @@ expression E; @@ - E.FOLLOW_RENAMES + E.follow_renames @@ expression E; @@ - E.RENAME_EMPTY + E.rename_empty @@ expression E; @@ - E.HAS_CHANGES + E.has_changes @@ expression E; @@ - E.QUICK + E.quick @@ expression E; @@ - E.NO_INDEX + E.no_index @@ expression E; @@ - E.ALLOW_EXTERNAL + E.allow_external @@ expression E; @@ - E.EXIT_WITH_STATUS + E.exit_with_status @@ expression E; @@ - E.REVERSE_DIFF + E.reverse_diff @@ expression E; @@ - E.CHECK_FAILED + E.check_failed @@ expression E; @@ - E.RELATIVE_NAME + E.relative_name @@ expression E; @@ - E.IGNORE_SUBMODULES + E.ignore_submodules @@ expression E; @@ - E.DIRSTAT_CUMULATIVE + E.dirstat_cumulative @@ expression E; @@ - E.DIRSTAT_BY_FILE + E.dirstat_by_file @@ expression E; @@ - E.ALLOW_TEXTCONV + E.allow_textconv @@ expression E; @@ - E.TEXTCONV_SET_VIA_CMDLINE + E.textconv_set_via_cmdline @@ expression E; @@ - E.DIFF_FROM_CONTENTS + E.diff_from_contents @@ expression E; @@ - E.DIRTY_SUBMODULES + E.dirty_submodules @@ expression E; @@ - E.IGNORE_UNTRACKED_IN_SUBMODULES + E.ignore_untracked_in_submodules @@ expression E; @@ - E.IGNORE_DIRTY_SUBMODULES + E.ignore_dirty_submodules @@ expression E; @@ - E.OVERRIDE_SUBMODULE_CONFIG + E.override_submodule_config @@ expression E; @@ - E.DIRSTAT_BY_LINE + E.dirstat_by_line @@ expression E; @@ - E.FUNCCONTEXT + E.funccontext @@ expression E; @@ - E.PICKAXE_IGNORE_CASE + E.pickaxe_ignore_case @@ expression E; @@ - E.DEFAULT_FOLLOW_RENAMES + E.default_follow_renames Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:51:40 +09:00
Brandon Williams	b2100e5291	diff: remove DIFF_OPT_CLR macro Remove the `DIFF_OPT_CLR` macro and instead set the flags directly. This conversion is done using the following semantic patch: @@ expression E; identifier fld; @@ - DIFF_OPT_CLR(&E, fld) + E.flags.fld = 0 @@ type T; T *ptr; identifier fld; @@ - DIFF_OPT_CLR(ptr, fld) + ptr->flags.fld = 0 Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:51:30 +09:00
Brandon Williams	23dcf77f48	diff: remove DIFF_OPT_SET macro Remove the `DIFF_OPT_SET` macro and instead set the flags directly. This conversion is done using the following semantic patch: @@ expression E; identifier fld; @@ - DIFF_OPT_SET(&E, fld) + E.flags.fld = 1 @@ type T; T *ptr; identifier fld; @@ - DIFF_OPT_SET(ptr, fld) + ptr->flags.fld = 1 Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:50:03 +09:00
Brandon Williams	3b69daed86	diff: remove DIFF_OPT_TST macro Remove the `DIFF_OPT_TST` macro and instead access the flags directly. This conversion is done using the following semantic patch: @@ expression E; identifier fld; @@ - DIFF_OPT_TST(&E, fld) + E.flags.fld @@ type T; T *ptr; identifier fld; @@ - DIFF_OPT_TST(ptr, fld) + ptr->flags.fld Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:50:03 +09:00
Brandon Williams	25567af805	diff: remove touched flags Now that the set of parallel touched flags are no longer being used, remove them. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:50:02 +09:00
Brandon Williams	afa73c5384	diff: add flag to indicate textconv was set via cmdline git-show is unique in that it wants to use textconv by default except for when it is showing blobs. When asked to show a blob, show doesn't want to use textconv unless the user explicitly requested that it be used by providing the command line flag '--textconv'. Currently this is done by using a parallel set of 'touched' flags which get set every time a particular flag is set or cleared. In a future patch we want to eliminate this parallel set of flags so instead of relying on if the textconv flag has been touched, add a new flag 'TEXTCONV_SET_VIA_CMDLINE' which is only set if textconv is set to true via the command line. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:50:02 +09:00
Brandon Williams	02f2f56bc3	diff: convert flags to be stored in bitfields We cannot add many more flags to the diff machinery due to the limitations of the number of flags that can be stored in a single unsigned int. In order to allow for more flags to be added to the diff machinery in the future this patch converts the flags to be stored in bitfields in 'struct diff_flags'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:50:02 +09:00
Jeff King	a937b37e76	revision: quit pruning diff more quickly when possible When the revision traversal machinery is given a pathspec, we must compute the parent-diff for each commit to determine which ones are TREESAME. We set the QUICK diff flag to avoid looking at more entries than we need; we really just care whether there are any changes at all. But there is one case where we want to know a bit more: if --remove-empty is set, we care about finding cases where the change consists only of added entries (in which case we may prune the parent in try_to_simplify_commit()). To cover that case, our file_add_remove() callback does not quit the diff upon seeing an added entry; it keeps looking for other types of entries. But this means when --remove-empty is not set (and it is not by default), we compute more of the diff than is necessary. You can see this in a pathological case where a commit adds a very large number of entries, and we limit based on a broad pathspec. E.g.: perl -e ' chomp(my $blob = `git hash-object -w --stdin </dev/null`); for my $a (1..1000) { for my $b (1..1000) { print "100644 $blob\t$a/$b\n"; } } ' \| git update-index --index-info git commit -qm add git rev-list HEAD -- . This case takes about 100ms now, but after this patch only needs 6ms. That's not a huge improvement, but it's easy to get and it protects us against even more pathological cases (e.g., going from 1 million to 10 million files would take ten times as long with the current code, but not increase at all after this patch). This is reported to minorly speed-up pathspec limiting in real world repositories (like the 100-million-file Windows repository), but probably won't make a noticeable difference outside of pathological setups. This patch actually covers the case without --remove-empty, and the case where we see only deletions. See the in-code comment for details. Note that we have to add a new member to the diff_options struct so that our callback can see the value of revs->remove_empty_trees. This callback parameter could be passed to the "add_remove" and "change" callbacks, but there's not much point. They already receive the diff_options struct, and doing it this way avoids having to update the function signature of the other callbacks (arguably the format_callback and output_prefix functions could benefit from the same simplification). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-14 11:43:49 +09:00
Jonathan Tan	f0b8fb6e59	diff: define block by number of alphanumeric chars The existing behavior of diff --color-moved=zebra does not define the minimum size of a block at all, instead relying on a heuristic applied later to filter out sets of adjacent moved lines that are shorter than 3 lines long. This can be confusing, because a block could thus be colored as moved at the source but not at the destination (or vice versa), depending on its neighbors. Instead, teach diff that the minimum size of a block is 20 alphanumeric characters, the same heuristic used by "git blame". This allows diff to still exclude uninteresting lines appearing on their own (such as those solely consisting of one or a few closing braces), as was the intention of the adjacent-moved-line heuristic. This requires a change in some tests in that some of their lines are no longer considered to be part of a block, because they are too short. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-16 11:44:00 -07:00
Stefan Beller	86b452e276	diff.c: add dimming to moved line detection Any lines inside a moved block of code are not interesting. Boundaries of blocks are only interesting if they are next to another block of moved code. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:59:42 -07:00
Stefan Beller	176841f0c9	diff.c: color moved lines differently, plain mode Add the 'plain' mode for move detection of code. This omits the checking for adjacent blocks, so it is not as useful. If you have a lot of the same blocks moved in the same patch, the 'Zebra' would end up slow as it is O(n^2) (n is number of same blocks). So this may be useful there and is generally easy to add. Instead be very literal at the move detection, do not skip over short blocks here. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:59:42 -07:00
Stefan Beller	2e2d5ac184	diff.c: color moved lines differently When a patch consists mostly of moving blocks of code around, it can be quite tedious to ensure that the blocks are moved verbatim, and not undesirably modified in the move. To that end, color blocks that are moved within the same patch differently. For example (OM, del, add, and NM are different colors): [OM] -void sensitive_stuff(void) [OM] -{ [OM] - if (!is_authorized_user()) [OM] - die("unauthorized"); [OM] - sensitive_stuff(spanning, [OM] - multiple, [OM] - lines); [OM] -} void another_function() { [del] - printf("foo"); [add] + printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NM] + if (!is_authorized_user()) [NM] + die("unauthorized"); [NM] + sensitive_stuff(spanning, [NM] + multiple, [NM] + lines); [NM] +} However adjacent blocks may be problematic. For example, in this potentially malicious patch, the swapping of blocks can be spotted: [OM] -void sensitive_stuff(void) [OM] -{ [OMA] - if (!is_authorized_user()) [OMA] - die("unauthorized"); [OM] - sensitive_stuff(spanning, [OM] - multiple, [OM] - lines); [OMA] -} void another_function() { [del] - printf("foo"); [add] + printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NMA] + sensitive_stuff(spanning, [NMA] + multiple, [NMA] + lines); [NM] + if (!is_authorized_user()) [NM] + die("unauthorized"); [NMA] +} If the moved code is larger, it is easier to hide some permutation in the code, which is why some alternative coloring is needed. This patch implements the first mode: * basic alternating 'Zebra' mode This conveys all information needed to the user. Defer customization to later patches. First I implemented an alternative design, which would try to fingerprint a line by its neighbors to detect if we are in a block or at the boundary. This idea iss error prone as it inspected each line and its neighboring lines to determine if the line was (a) moved and (b) if was deep inside a hunk by having matching neighboring lines. This is unreliable as the we can construct hunks which have equal neighbors that just exceed the number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter as a line, that is permutated to AXYZCXYZBXYZD..'). Instead this provides a dynamic programming greedy algorithm that finds the largest moved hunk and then has several modes on highlighting bounds. A note on the options '--submodule=diff' and '--color-words/--word-diff': In the conversion to use emit_line in the prior patches both submodules as well as word diff output carefully chose to call emit_line with sign=0. All output with sign=0 is ignored for move detection purposes in this patch, such that no weird looking output will be generated for these cases. This leads to another thought: We could pass on '--color-moved' to submodules such that they color up moved lines for themselves. If we'd do so only line moves within a repository boundary are marked up. It is useful to have moved lines colored, but there are annoying corner cases, such as a single line moved, that is very common. For example in a typical patch of C code, we have closing braces that end statement blocks or functions. While it is technically true that these lines are moved as they show up elsewhere, it is harmful for the review as the reviewers attention is drawn to such a minor side annoyance. For now let's have a simple solution of hardcoding the number of moved lines to be at least 3 before coloring them. Note, that the length is applied across all blocks to find the 'lonely' blocks that pollute new code, but do not interfere with a permutated block where each permutation has less lines than 3. Helped-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:59:42 -07:00
Stefan Beller	e6e045f803	diff.c: buffer all output if asked to Introduce a new option 'emitted_symbols' in the struct diff_options which controls whether all output is buffered up until all output is available. It is set internally in diff.c when necessary. We'll have a new struct 'emitted_string' in diff.c which will be used to buffer each line. The emitted_string will duplicate the memory of the line to buffer as that is easiest to reason about for now. In a future patch we may want to decrease the memory usage by not duplicating all output for buffering but rather we may want to store offsets into the file or in case of hunk descriptions such as the similarity score, we could just store the relevant number and reproduce the text later on. This approach was chosen as a first step because it is quite simple compared to the alternative with less memory footprint. emit_diff_symbol factors out the emission part and depending on the diff_options->emitted_symbols the emission will be performed directly when calling emit_diff_symbol or after the whole process is done, i.e. by buffering we have add the possibility for a second pass over the whole output before doing the actual output. In `6440d34` (2012-03-14, diff: tweak a _copy_ of diff_options with word-diff) we introduced a duplicate diff options struct for word emissions as we may have different regex settings in there. When buffering the output, we need to operate on just one buffer, so we have to copy back the emissions of the word buffer into the main buffer. Unconditionally enable output via buffer in this patch as it yields a great opportunity for testing, i.e. all the diff tests from the test suite pass without having reordering issues (i.e. only parts of the output got buffered, and we forgot to buffer other parts). The test suite passes, which gives confidence that we converted all functions to use emit_string for output. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	0911c475c8	diff.c: convert show_stats to use emit_diff_symbol We call print_stat_summary from builtin/apply, so we still need the version with a file pointer, so introduce print_stat_summary_0 that uses emit_string machinery and keep print_stat_summary with the same arguments around. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	f3597138df	submodule.c: migrate diff output to use emit_diff_symbol As the submodule process is no longer attached to the same file pointer 'o->file' as the superprojects process, there is a different result in color.c::check_auto_color. That is why we need to pass coloring explicitly, such that the submodule coloring decision will be made by the child process processing the submodule. Only DIFF_SYMBOL_SUBMODULE_PIPETHROUGH contains color, the other symbols are for embedding the submodule output into the superprojects output. Remove the colors from the function signatures, as all the coloring decisions will be made either inside the child process or the final emit_diff_symbol, but not in the functions driving the submodule diff. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	091f8e28b4	diff.c: migrate emit_line_checked to use emit_diff_symbol Add a new flags field to emit_diff_symbol, that will be used by context lines for: * white space rules that are applicable (The first 12 bits) Take a note in cahe.c as well, when this ws rules are extended we have to fix the bits in the flags field. * how the rules are evaluated (actually this double encodes the sign of the line, but the code is easier to keep this way, bits 13,14,15) * if the line a blank line at EOF (bit 16) The check if new lines need to be marked up as extra lines at the end of file, is now done unconditionally. That should be ok, as 'new_blank_line_at_eof' has a quick early return. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Junio C Hamano	a6f38c109b	Merge branch 'bw/object-id' Conversion from uchar[20] to struct object_id continues. * bw/object-id: (33 commits) diff: rename diff_fill_sha1_info to diff_fill_oid_info diffcore-rename: use is_empty_blob_oid tree-diff: convert path_appendnew to object_id tree-diff: convert diff_tree_paths to struct object_id tree-diff: convert try_to_follow_renames to struct object_id builtin/diff-tree: cleanup references to sha1 diff-tree: convert diff_tree_sha1 to struct object_id notes-merge: convert write_note_to_worktree to struct object_id notes-merge: convert verify_notes_filepair to struct object_id notes-merge: convert find_notes_merge_pair_ps to struct object_id notes-merge: convert merge_from_diffs to struct object_id notes-merge: convert notes_merge* to struct object_id tree-diff: convert diff_root_tree_sha1 to struct object_id combine-diff: convert find_paths_* to struct object_id combine-diff: convert diff_tree_combined to struct object_id diff: convert diff_flush_patch_id to struct object_id patch-ids: convert to struct object_id diff: finish conversion for prepare_temp_file to struct object_id diff: convert reuse_worktree_file to struct object_id diff: convert fill_filespec to struct object_id ...	2017-06-19 12:38:44 -07:00
Brandon Williams	fda94b416e	tree-diff: convert diff_tree_paths to struct object_id Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-05 11:23:58 +09:00
Brandon Williams	66f414f885	diff-tree: convert diff_tree_sha1 to struct object_id Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-05 11:23:58 +09:00
Junio C Hamano	583c6a2295	Merge branch 'js/blame-lib' The internal logic used in "git blame" has been libified to make it easier to use by cgit. * js/blame-lib: (29 commits) blame: move entry prepend to libgit blame: move scoreboard setup to libgit blame: move scoreboard-related methods to libgit blame: move fake-commit-related methods to libgit blame: move origin-related methods to libgit blame: move core structures to header blame: create entry prepend function blame: create scoreboard setup function blame: create scoreboard init function blame: rework methods that determine 'final' commit blame: wrap blame_sort and compare_blame_final blame: move progress updates to a scoreboard callback blame: make sanity_check use a callback in scoreboard blame: move no_whole_file_rename flag to scoreboard blame: move xdl_opts flags to scoreboard blame: move show_root flag to scoreboard blame: move reverse flag to scoreboard blame: move contents_from to scoreboard blame: move copy/move thresholds to scoreboard blame: move stat counters to scoreboard ...	2017-06-05 09:18:12 +09:00
Brandon Williams	7b8dea0c75	tree-diff: convert diff_root_tree_sha1 to struct object_id Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:30 +09:00
Brandon Williams	b9acf54dbd	combine-diff: convert diff_tree_combined to struct object_id Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Brandon Williams	bd25f28876	diff: convert diff_flush_patch_id to struct object_id Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Brandon Williams	94a0097a41	diff: convert diff_change to struct object_id Convert diff_change to take a struct object_id. In addition convert the function pointer type 'change_fn_t' to also take a struct object_id. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Brandon Williams	c26022ea8f	diff: convert diff_addremove to struct object_id Convert diff_addremove to take a struct object_id. In addtion convert the function pointer type 'add_remove_fn_t' to also take a struct object_id. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Jeff Smith	3a35cb2ea8	blame: move textconv_object with related functions textconv_object is used in places other than blame.c and should be moved to a more appropriate location. Other textconv related functions are located in diff.c so that seems as good a place as any. Signed-off-by: Jeff Smith <whydoubt@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-24 15:41:50 +09:00
brian m. carlson	944cffbd18	diff-lib: convert do_diff_cache to struct object_id This is needed to convert parse_tree_indirect. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:58 +09:00
brian m. carlson	910650d2f8	Rename sha1_array to oid_array Since this structure handles an array of object IDs, rename it to struct oid_array. Also rename the accessor functions and the initialization constant. This commit was produced mechanically by providing non-Documentation files to the following Perl one-liners: perl -pi -E 's/struct sha1_array/struct oid_array/g' perl -pi -E 's/\bsha1_array_/oid_array_/g' perl -pi -E 's/SHA1_ARRAY_INIT/OID_ARRAY_INIT/g' Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-31 08:33:56 -07:00
Junio C Hamano	650360210a	Merge branch 'nd/ita-empty-commit' When new paths were added by "git add -N" to the index, it was enough to circumvent the check by "git commit" to refrain from making an empty commit without "--allow-empty". The same logic prevented "git status" to show such a path as "new file" in the "Changes not staged for commit" section. * nd/ita-empty-commit: commit: don't be fooled by ita entries when creating initial commit commit: fix empty commit creation when there's no changes but ita entries diff: add --ita-[in]visible-in-index diff-lib: allow ita entries treated as "not yet exist in index"	2016-10-27 14:58:50 -07:00
Jeff King	d6cece51b8	diff_aligned_abbrev: use "struct oid" Since we're modifying this function anyway, it's a good time to update it to the more modern "struct oid". We can also drop some of the magic numbers in favor of GIT_SHA1_HEXSZ, along with some descriptive comments. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-26 13:30:51 -07:00
Jeff King	d5e3b01e5b	diff_unique_abbrev: rename to diff_aligned_abbrev The word "align" describes how the function actually differs from find_unique_abbrev, and will make it less confusing when we add more diff-specific abbrevation functions that do not do this alignment. Since this is a globally available function, let's also move its descriptive comment to the header file, where we typically document function interfaces. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-26 13:30:51 -07:00
Nguyễn Thái Ngọc Duy	018ec3c820	commit: fix empty commit creation when there's no changes but ita entries If i-t-a entries are present and there is no change between the index and HEAD i-t-a entries, index_differs_from() still returns "dirty, new entries" (aka, the resulting commit is not empty), but cache-tree will skip i-t-a entries and produce the exact same tree of current commit. index_differs_from() is supposed to catch this so we can abort git-commit (unless --no-empty is specified). Update it to optionally ignore i-t-a entries when doing a diff between the index and HEAD so that it would return "no change" in this case and abort commit. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-24 10:48:23 -07:00
Nguyễn Thái Ngọc Duy	425a28e0a4	diff-lib: allow ita entries treated as "not yet exist in index" When comparing the index and the working tree to show which paths are new, and comparing the tree recorded in the HEAD and the index to see if committing the contents recorded in the index would result in an empty commit, we would want the former comparison to say "these are new paths" and the latter to say "there is no change" for paths that are marked as intent-to-add. We made a similar attempt at `d95d728a` ("diff-lib.c: adjust position of i-t-a entries in diff", 2015-03-16), which redefined the semantics of these two comparison modes globally, which was a disaster and had to be reverted at `78cc1a54` ("Revert "diff-lib.c: adjust position of i-t-a entries in diff"", 2015-06-23). To make sure we do not repeat the same mistake, introduce a new internal diffopt option so that this different semantics can be asked for only by callers that ask it, while making sure other unaudited callers will get the same comparison result. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-24 10:47:28 -07:00
Junio C Hamano	b7af6ae5cf	Merge branch 'mh/diff-indent-heuristic' Output from "git diff" can be made easier to read by selecting which lines are common and which lines are added/deleted intelligently when the lines before and after the changed section are the same. A command line option is added to help with the experiment to find a good heuristics. * mh/diff-indent-heuristic: blame: honor the diff heuristic options and config parse-options: add parse_opt_unknown_cb() diff: improve positioning of add/delete blocks in diffs xdl_change_compact(): introduce the concept of a change group recs_match(): take two xrecord_t pointers as arguments is_blank_line(): take a single xrecord_t as argument xdl_change_compact(): only use heuristic if group can't be matched xdl_change_compact(): fix compaction heuristic to adjust ixo	2016-09-26 16:09:16 -07:00
Michael Haggerty	5b162879e9	blame: honor the diff heuristic options and config Teach "git blame" and "git annotate" the --compaction-heuristic and --indent-heuristic options that are now supported by "git diff". Also teach them to honor the `diff.compactionHeuristic` and `diff.indentHeuristic` configuration options. It would be conceivable to introduce separate configuration options for "blame" and "annotate"; for example `blame.compactionHeuristic` and `blame.indentHeuristic`. But it would be confusing to users if blame output is inconsistent with diff output, so it makes more sense for them to respect the same configuration. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-19 10:25:11 -07:00
Junio C Hamano	305d7f1339	Merge branch 'jk/diff-submodule-diff-inline' The "git diff --submodule={short,log}" mechanism has been enhanced to allow "--submodule=diff" to show the patch between the submodule commits bound to the superproject. * jk/diff-submodule-diff-inline: diff: teach diff to display submodule difference with an inline diff submodule: refactor show_submodule_summary with helper function submodule: convert show_submodule_summary to use struct object_id * allow do_submodule_path to work even if submodule isn't checked out diff: prepare for additional submodule formats graph: add support for --line-prefix on all graph-aware output diff.c: remove output_prefix_length field cache: add empty_tree_oid object and helper function	2016-09-12 15:34:31 -07:00
Jacob Keller	fd47ae6a5b	diff: teach diff to display submodule difference with an inline diff Teach git-diff and friends a new format for displaying the difference of a submodule. The new format is an inline diff of the contents of the submodule between the commit range of the update. This allows the user to see the actual code change caused by a submodule update. Add tests for the new format and option. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-08-31 18:07:10 -07:00
Jacob Keller	61cfbc054d	diff: prepare for additional submodule formats A future patch will add a new format for displaying the difference of a submodule. Make it easier by changing how we store the current selected format. Replace the DIFF_OPT flag with an enumeration, as each format will be mutually exclusive. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-08-31 18:07:09 -07:00
Jacob Keller	660e113ce1	graph: add support for --line-prefix on all graph-aware output Add an extension to git-diff and git-log (and any other graph-aware displayable output) such that "--line-prefix=<string>" will print the additional line-prefix on every line of output. To make this work, we have to fix a few bugs in the graph API that force graph_show_commit_msg to be used only when you have a valid graph. Additionally, we extend the default_diff_output_prefix handler to work even when no graph is enabled. This is somewhat of a hack on top of the graph API, but I think it should be acceptable here. This will be used by a future extension of submodule display which displays the submodule diff as the actual diff between the pre and post commit in the submodule project. Add some tests for both git-log and git-diff to ensure that the prefix is honored correctly. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-08-31 18:07:09 -07:00
Junio C Hamano	cd48dadb8d	diff.c: remove output_prefix_length field "diff/log --stat" has a logic that determines the display columns available for the diffstat part of the output and apportions it for pathnames and diffstat graph automatically. `5e71a84a` (Add output_prefix_length to diff_options, 2012-04-16) added the output_prefix_length field to diff_options structure to allow this logic to subtract the display columns used for the history graph part from the total "terminal width"; this matters when the "git log --graph -p" option is in use. The field must be set to the number of display columns needed to show the output from the output_prefix() callback, which is error prone. As there is only one user of the field, and the user has the actual value of the prefix string, let's get rid of the field and have the user count the display width itself. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-08-31 18:07:08 -07:00
Kevin Willford	3e8e32c32e	patch-ids: add flag to create the diff patch id using header only data This will allow a diff patch id to be created using only the header data so that the contents of the file will not have to be loaded. Signed-off-by: Kevin Willford <kcwillford@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-29 14:10:01 -07:00
Junio C Hamano	5d2a30d7d8	Merge branch 'mm/diff-renames-default' The end-user facing Porcelain level commands like "diff" and "log" now enables the rename detection by default. * mm/diff-renames-default: diff: activate diff.renames by default log: introduce init_log_defaults() t: add tests for diff.renames (true/false/unset) t4001-diff-rename: wrap file creations in a test Documentation/diff-config: fix description of diff.renames	2016-04-03 10:29:22 -07:00
Junio C Hamano	11529ecec9	Merge branch 'jk/tighten-alloc' Update various codepaths to avoid manually-counted malloc(). * jk/tighten-alloc: (22 commits) ewah: convert to REALLOC_ARRAY, etc convert ewah/bitmap code to use xmalloc diff_populate_gitlink: use a strbuf transport_anonymize_url: use xstrfmt git-compat-util: drop mempcpy compat code sequencer: simplify memory allocation of get_message test-path-utils: fix normalize_path_copy output buffer size fetch-pack: simplify add_sought_entry fast-import: simplify allocation in start_packfile write_untracked_extension: use FLEX_ALLOC helper prepare_{git,shell}_cmd: use argv_array use st_add and st_mult for allocation size computation convert trivial cases to FLEX_ARRAY macros use xmallocz to avoid size arithmetic convert trivial cases to ALLOC_ARRAY convert manual allocations to argv_array argv-array: add detach function add helpers for allocating flex-array structs harden REALLOC_ARRAY and xcalloc against size_t overflow tree-diff: catch integer overflow in combine_diff_path allocation ...	2016-02-26 13:37:16 -08:00
Junio C Hamano	3ed26a44b3	Merge branch 'jk/more-comments-on-textconv' The memory ownership rule of fill_textconv() API, which was a bit tricky, has been documented a bit better. * jk/more-comments-on-textconv: diff: clarify textconv interface	2016-02-26 13:37:15 -08:00
Matthieu Moy	5404c116aa	diff: activate diff.renames by default Rename detection is a very convenient feature, and new users shouldn't have to dig in the documentation to benefit from it. Potential objections to activating rename detection are that it sometimes fail, and it is sometimes slow. But rename detection is already activated by default in several cases like "git status" and "git merge", so activating diff.renames does not fundamentally change the situation. When the rename detection fails, it now fails consistently between "git diff" and "git status". This setting does not affect plumbing commands, hence well-written scripts will not be affected. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-25 11:31:02 -08:00
Jeff King	a64e6a44c6	diff: clarify textconv interface The memory allocation scheme for the textconv interface is a bit tricky, and not well documented. It was originally designed as an internal part of diff.c (matching fill_mmfile), but gradually was made public. Refactoring it is difficult, but we can at least improve the situation by documenting the intended flow and enforcing it with an in-code assertion. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-22 10:40:35 -08:00
Jeff King	5b442c4f27	tree-diff: catch integer overflow in combine_diff_path allocation A combine_diff_path struct has two "flex" members allocated alongside the struct: a string to hold the pathname, and an array of parent pointers. We use an "int" to compute this, meaning we may easily overflow it if the pathname is extremely long. We can fix this by using size_t, and checking for overflow with the st_add helper. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-19 09:40:37 -08:00
Junio C Hamano	02dab5d399	Merge branch 'nd/diff-with-path-params' into maint A few options of "git diff" did not work well when the command was run from a subdirectory. * nd/diff-with-path-params: diff: make -O and --output work in subdirectory diff-no-index: do not take a redundant prefix argument	2016-02-05 14:54:15 -08:00
Junio C Hamano	c167a96e68	Merge branch 'nd/diff-with-path-params' A few options of "git diff" did not work well when the command was run from a subdirectory. * nd/diff-with-path-params: diff: make -O and --output work in subdirectory diff-no-index: do not take a redundant prefix argument	2016-02-03 14:16:04 -08:00
Duy Nguyen	a97262c62f	diff: make -O and --output work in subdirectory Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-21 10:45:13 -08:00
Nguyễn Thái Ngọc Duy	e5f7a5d16f	diff-no-index: do not take a redundant prefix argument Prefix is already set up in "revs". The same prefix should be used for all options parsing. So kill the last argument. This patch does not actually change anything because the only caller does use the same prefix for init_revisions() and diff_no_index(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-21 10:45:11 -08:00
Jeff King	9a93c6686f	avoid shifting signed integers 31 bits We sometimes use 32-bit unsigned integers as bit-fields. It's fine to access the MSB, because it's unsigned. However, doing so as "1 << 31" is wrong, because the constant "1" is a signed int, and we shift into the sign bit, causing undefined behavior. We can fix this by using "1U" as the constant. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-04 09:51:16 -08:00
David Turner	076c98372e	log: add "log.follow" configuration variable People who work on projects with mostly linear history with frequent whole file renames may want to always use "git log --follow" when inspecting the life of the content that live in a single path. Teach the command to behave as if "--follow" was given from the command line when log.follow configuration variable is set and there is one (and only one) path on the command line. Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-07-09 10:24:23 -07:00
Junio C Hamano	db65170ee5	Merge branch 'jk/color-diff-plain-is-context' "color.diff.plain" was a misnomer; give it 'color.diff.context' as a more logical synonym. * jk/color-diff-plain-is-context: diff.h: rename DIFF_PLAIN color slot to DIFF_CONTEXT diff: accept color.diff.context as a synonym for "plain"	2015-06-11 09:29:53 -07:00
Junio C Hamano	709cd912d4	Merge branch 'jc/diff-ws-error-highlight' Allow whitespace breakages in deleted and context lines to be also painted in the output. * jc/diff-ws-error-highlight: diff.c: --ws-error-highlight=<kind> option diff.c: add emit_del_line() and emit_context_line() t4015: separate common setup and per-test expectation t4015: modernise style	2015-06-11 09:29:51 -07:00
Jeff King	8dbf3eb685	diff.h: rename DIFF_PLAIN color slot to DIFF_CONTEXT The latter is a much more descriptive name (and we support "color.diff.context" now). This also updates the name of any local variables which were used to store the color. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-27 13:54:42 -07:00
Junio C Hamano	b8767f791c	diff.c: --ws-error-highlight=<kind> option Traditionally, we only cared about whitespace breakages introduced in new lines. Some people want to paint whitespace breakages on old lines, too. When they see a whitespace breakage on a new line, they can spot the same kind of whitespace breakage on the corresponding old line and want to say "Ah, those breakages are there but they were inherited from the original, so let's not touch them for now." Introduce `--ws-error-highlight=<kind>` option, that lets them pass a comma separated list of `old`, `new`, and `context` to specify what lines to highlight whitespace errors on. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-26 23:00:01 -07:00
brian m. carlson	1ff57c13c5	diff: convert struct combine_diff_path to object_id Also, convert a constant to GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-13 22:43:13 -07:00
Junio C Hamano	8eaf517835	Merge branch 'ks/tree-diff-nway' Instead of running N pair-wise diff-trees when inspecting a N-parent merge, find the set of paths that were touched by walking N+1 trees in parallel. These set of paths can then be turned into N pair-wise diff-tree results to be processed through rename detections and such. And N=2 case nicely degenerates to the usual 2-way diff-tree, which is very nice. * ks/tree-diff-nway: mingw: activate alloca combine-diff: speed it up, by using multiparent diff tree-walker directly tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Portable alloca for Git tree-diff: reuse base str(buf) memory on sub-tree recursion tree-diff: no need to call "full" diff_tree_sha1 from show_path() tree-diff: rework diff_tree interface to be sha1 based tree-diff: diff_tree() should now be static tree-diff: remove special-case diff-emitting code for empty-tree cases tree-diff: simplify tree_entry_pathcmp tree-diff: show_path prototype is not needed anymore tree-diff: rename compare_tree_entry -> tree_entry_pathcmp tree-diff: move all action-taking code out of compare_tree_entry() tree-diff: don't assume compare_tree_entry() returns -1,0,1 tree-diff: consolidate code for emitting diffs and recursion in one place tree-diff: show_tree() is not needed tree-diff: no need to pass match to skip_uninteresting() tree-diff: no need to manually verify that there is no mode change for a path combine-diff: move changed-paths scanning logic into its own function combine-diff: move show_log_first logic/action out of paths scanning	2014-06-03 12:06:40 -07:00
Kirill Smelkov	72441af7c4	tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Previously diff_tree(), which is now named ll_diff_tree_sha1(), was generating diff_filepair(s) for two trees t1 and t2, and that was usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes a commit introduces. In Git, however, we have fundamentally built flexibility in that a commit can have many parents - 1 for a plain commit, 2 for a simple merge, but also more than 2 for merging several heads at once. For merges there is a so called combine-diff, which shows diff, a merge introduces by itself, omitting changes done by any parent. That works through first finding paths, that are different to all parents, and then showing generalized diff, with separate columns for +/- for each parent. The code lives in combine-diff.c . There is an impedance mismatch, however, in that a commit could generally have any number of parents, and that while diffing trees, we divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there is no special casing for multiple parents commits in e.g. revision-walker . That impedance mismatch hurts performance badly for generating combined diffs - in "combine-diff: optimize combine_diff_path sets intersection" I've already removed some slowness from it, but from the timings provided there, it could be seen, that combined diffs still cost more than an order of magnitude more cpu time, compared to diff for usual commits, and that would only be an optimistic estimate, if we take into account that for e.g. linux.git there is only one merge for several dozens of plain commits. That slowness comes from the fact that currently, while generating combined diff, a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). That's because at present, to compute combine-diff, for first finding paths, that "every parent touches", we use the following combine-diff property/definition: D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn) (w.r.t. paths) where D(A,P1...Pn) is combined diff between commit A, and parents Pi and D(A,Pi) is usual two-tree diff Pi..A So if any of that D(A,Pi) is huge, tracting 1 n-parent combine-diff as n 1-parent diffs and intersecting results will be slow. And usually, for linux.git and other topic-based workflows, that D(A,P2) is huge, because, if merge-base of A and P2, is several dozens of merges (from A, via first parent) below, that D(A,P2) will be diffing sum of merges from several subsystems to 1 subsystem. The solution is to avoid computing n 1-parent diffs, and to find changed-to-all-parents paths via scanning A's and all Pi's trees simultaneously, at each step comparing their entries, and based on that comparison, populate paths result, and deduce we could skip recursing into subdirectories, if at least for 1 parent, sha1 of that dir tree is the same as in A. That would save us from doing significant amount of needless work. Such approach is very similar to what diff_tree() does, only there we deal with scanning only 2 trees simultaneously, and for n+1 tree, the logic is a bit more complex: D(T,P1...Pn) calculation scheme ------------------------------- D(T,P1...Pn) = D(T,P1) ^ ... ^ D(T,Pn) (regarding resulting paths set) D(T,Pj) - diff between T..Pj D(T,P1...Pn) - combined diff from T to parents P1,...,Pn We start from all trees, which are sorted, and compare their entries in lock-step: T P1 Pn - - - \|t\| \|p1\| \|pn\| \|-\| \|--\| ... \|--\| imin = argmin(p1...pn) \| \| \| \| \| \| \|-\| \|--\| \|--\| \|.\| \|. \| \|. \| . . . . . . at any time there could be 3 cases: 1) t < p[imin]; 2) t > p[imin]; 3) t = p[imin]. Schematic deduction of what every case means, and what to do, follows: 1) t < p[imin] -> ∀j t ∉ Pj -> "+t" ∈ D(T,Pj) -> D += "+t"; t↓ 2) t > p[imin] 2.1) ∃j: pj > p[imin] -> "-p[imin]" ∉ D(T,Pj) -> D += ø; ∀ pi=p[imin] pi↓ 2.2) ∀i pi = p[imin] -> pi ∉ T -> "-pi" ∈ D(T,Pi) -> D += "-p[imin]"; ∀i pi↓ 3) t = p[imin] 3.1) ∃j: pj > p[imin] -> "+t" ∈ D(T,Pj) -> only pi=p[imin] remains to investigate 3.2) pi = p[imin] -> investigate δ(t,pi) \| \| v 3.1+3.2) looking at δ(t,pi) ∀i: pi=p[imin] - if all != ø -> ⎧δ(t,pi) - if pi=p[imin] -> D += ⎨ ⎩"+t" - if pi>p[imin] in any case t↓ ∀ pi=p[imin] pi↓ ~ For comparison, here is how diff_tree() works: D(A,B) calculation scheme ------------------------- A B - - \|a\| \|b\| a < b -> a ∉ B -> D(A,B) += +a a↓ \|-\| \|-\| a > b -> b ∉ A -> D(A,B) += -b b↓ \| \| \| \| a = b -> investigate δ(a,b) a↓ b↓ \|-\| \|-\| \|.\| \|.\| . . . . ~~~~~~~~ This patch generalizes diff tree-walker to work with arbitrary number of parents as described above - i.e. now there is a resulting tree t, and some parents trees tp[i] i=[0..nparent). The generalization builds on the fact that usual diff D(A,B) is by definition the same as combined diff D(A,[B]), so if we could rework the code for common case and make it be not slower for nparent=1 case, usual diff(t1,t2) generation will not be slower, and multiparent diff tree-walker would greatly benefit generating combine-diff. What we do is as follows: 1) diff tree-walker ll_diff_tree_sha1() is internally reworked to be a paths generator (new name diff_tree_paths()), with each generated path being `struct combine_diff_path` with info for path, new sha1,mode and for every parent which sha1,mode it was in it. 2) From that info, we can still generate usual diff queue with struct diff_filepairs, via "exporting" generated combine_diff_path, if we know we run for nparent=1 case. (see emit_diff() which is now named emit_diff_first_parent_only()) 3) In order for diff_can_quit_early(), which checks DIFF_OPT_TST(opt, HAS_CHANGES)) to work, that exporting have to be happening not in bulk, but incrementally, one diff path at a time. For such consumers, there is a new callback in diff_options introduced: ->pathchange(opt, struct combine_diff_path ) which, if set to !NULL, is called for every generated path. (see new compat ll_diff_tree_sha1() wrapper around new paths generator for setup) 4) The paths generation itself, is reworked from previous ll_diff_tree_sha1() code according to "D(A,P1...Pn) calculation scheme" provided above: On the start we allocate [nparent] arrays in place what was earlier just for one parent tree. then we just generalize loops, and comparison according to the algorithm. Some notes(): 1) alloca(), for small arrays, is used for "runs not slower for nparent=1 case than before" goal - if we change it to xmalloc()/free() the timings get ~1% worse. For alloca() we use just-introduced xalloca/xalloca_free compatibility wrappers, so it should not be a portability problem. 2) For every parent tree, we need to keep a tag, whether entry from that parent equals to entry from minimal parent. For performance reasons I'm keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ. Not doing so, we'd need to alloca another [nparent] array, which hurts performance. 3) For emitted paths, memory could be reused, if we know the path was processed via callback and will not be needed later. We use efficient hand-made realloc-style path_appendnew(), that saves us from ~1-1.5% of potential additional slowdown. 4) goto(s) are used in several places, as the code executes a little bit faster with lowered register pressure. Also - we should now check for FIND_COPIES_HARDER not only when two entries names are the same, and their hashes are equal, but also for a case, when a path was removed from some of all parents having it. The reason is, if we don't, that path won't be emitted at all (see "a > xi" case), and we'll just skip it, and FIND_COPIES_HARDER wants all paths - with diff or without - to be emitted, to be later analyzed for being copies sources. The new check is only necessary for nparent >1, as for nparent=1 case xmin_eqtotal always =1 =nparent, and a path is always added to diff as removal. ~~~~~~~~ Timings for # without -c, i.e. testing only nparent=1 case `git log --raw --no-abbrev --no-renames` before and after the patch are as follows: navy.git linux.git v3.10..v3.11 before 0.611s 1.889s after 0.619s 1.907s slowdown 1.3% 0.9% This timings show we did no harm to usual diff(tree1,tree2) generation. From the table we can see that we actually did ~1% slowdown, but I think I've "earned" that 1% in the previous patch ("tree-diff: reuse base str(buf) memory on sub-tree recursion", HEAD~~) so for nparent=1 case, net timings stays approximately the same. The output also stayed the same. (*) If we revert 1)-4) to more usual techniques, for nparent=1 case, we'll get ~2-2.5% of additional slowdown, which I've tried to avoid, as "do no harm for nparent=1 case" rule. For linux.git, combined diff will run an order of magnitude faster and appropriate timings will be provided in the next commit, as we'll be taking advantage of the new diff tree-walker for combined-diff generation there. P.S. and combined diff is not some exotic/for-play-only stuff - for example for a program I write to represent Git archives as readonly filesystem, there is initial scan with `git log --reverse --raw --no-abbrev --no-renames -c` to extract log of what was created/changed when, as a result building a map {} sha1 -> in which commit (and date) a content was added that `-c` means also show combined diff for merges, and without them, if a merge is non-trivial (merges changes from two parents with both having separate changes to a file), or an evil one, the map will not be full, i.e. some valid sha1 would be absent from it. That case was my initial motivation for combined diffs speedup. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-07 14:40:46 -07:00
Kirill Smelkov	ad6f3cc7d2	tree-diff: diff_tree() should now be static We reworked all its users to use the functionality through diff_tree_sha1 variant in recent patches (see "tree-diff: allow diff_tree_sha1 to accept NULL sha1" and what comes next). diff_tree() is now not used outside tree-diff.c - make it static. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-03-26 14:30:47 -07:00
Junio C Hamano	2687ffdeb7	Merge branch 'jc/hold-diff-remove-q-synonym-for-no-deletion' Remove a confusing and deprecated "-q" option from "git diff-files"; "git diff-files --diff-filter=d" can be used instead.	2014-03-07 15:17:41 -08:00
Kirill Smelkov	af82c7880f	combine-diff: combine_diff_path.len is not needed anymore The field was used in order to speed-up name comparison and also to mark removed paths by setting it to 0. Because the updated code does significantly less strcmp and also just removes paths from the list and free right after we know a path will not be needed, it is not needed anymore. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-24 14:44:57 -08:00
Junio C Hamano	73b063130b	Merge branch 'tg/diff-no-index-refactor' "git diff ../else/where/A ../else/where/B" when ../else/where is clearly outside the repository, and "git diff --no-index A B", do not have to look at the index at all, but we used to read the index unconditionally. * tg/diff-no-index-refactor: diff: avoid some nesting diff: add test for --no-index executed outside repo diff: don't read index when --no-index is given diff: move no-index detection to builtin/diff.c	2013-12-27 14:58:17 -08:00
Junio C Hamano	6904f9aa5b	Merge branch 'zk/difftool-counts' Show the total number of paths and the number of paths shown so far when "git difftool" prompts to launch an external diff tool, which would give users some sense of progress. * zk/difftool-counts: diff.c: fix some recent whitespace style violations difftool: display the number of files in the diff queue in the prompt	2013-12-27 14:58:13 -08:00
Thomas Gummerer	470faf9654	diff: move no-index detection to builtin/diff.c Currently the --no-index option is parsed in diff_no_index(). Move the detection if a no-index diff should be executed to builtin/diff.c, where we can use it for executing diff_no_index() conditionally. This will also allow us to execute other operations conditionally, which will be done in the next patch. There are no functional changes. Helped-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-12 12:23:02 -08:00
Zoltan Klinger	ee7fb0b1d4	difftool: display the number of files in the diff queue in the prompt When --prompt option is set, git-difftool displays a prompt for each modified file to be viewed in an external diff program. At that point, it could be useful to display a counter and the total number of files in the diff queue. Below is the current difftool prompt for the first of 5 modified files: Viewing: 'diff.c' Launch 'vimdiff' [Y/n]: Consider the modified prompt: Viewing (1/5): 'diff.c' Launch 'vimdiff' [Y/n]: The current GIT_EXTERNAL_DIFF mechanism does not tell the number of paths in the diff queue nor the current counter. To make this "counter/total" info available for GIT_EXTERNAL_DIFF programs without breaking existing ones by doing the following: - Keep track of the number of paths shown so far in diff_options; - Export two new environment variables from run_external_diff() to show the total number of paths (from diff_queue_struct) and the current value of the counter (from diff_options); and - Update git-difftool--helper to use these two environment variables. Signed-off-by: Zoltan Klinger <zoltan.klinger@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-06 14:00:27 -08:00
Nicolas Vigier	b0d12fc9b2	Use the word 'stuck' instead of 'sticked' The past participle of 'stick' is 'stuck'. Signed-off-by: Nicolas Vigier <boklm@mars-attacks.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-10-31 15:47:38 -07:00
Junio C Hamano	4197361e39	Merge branch 'mg/more-textconv' Make "git grep" and "git show" pay attention to --textconv when dealing with blob objects. * mg/more-textconv: grep: honor --textconv for the case rev:path grep: allow to use textconv filters t7008: demonstrate behavior of grep with textconv cat-file: do not die on --textconv without textconv filters show: honor --textconv for blobs diff_opt: track whether flags have been set explicitly t4030: demonstrate behavior of show with textconv	2013-10-23 13:21:31 -07:00
Junio C Hamano	b02f5aeda6	Merge branch 'jl/submodule-mv' "git mv A B" when moving a submodule A does "the right thing", inclusing relocating its working tree and adjusting the paths in the .gitmodules file. * jl/submodule-mv: (53 commits) rm: delete .gitmodules entry of submodules removed from the work tree mv: update the path entry in .gitmodules for moved submodules submodule.c: add .gitmodules staging helper functions mv: move submodules using a gitfile mv: move submodules together with their work trees rm: do not set a variable twice without intermediate reading. t6131 - skip tests if on case-insensitive file system parse_pathspec: accept :(icase)path syntax pathspec: support :(glob) syntax pathspec: make --literal-pathspecs disable pathspec magic pathspec: support :(literal) syntax for noglob pathspec kill limit_pathspec_to_literal() as it's only used by parse_pathspec() parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN parse_pathspec: make sure the prefix part is wildcard-free rename field "raw" to "_raw" in struct pathspec tree-diff: remove the use of pathspec's raw[] in follow-rename codepath remove match_pathspec() in favor of match_pathspec_depth() remove init_pathspec() in favor of parse_pathspec() remove diff_tree_{setup,release}_paths convert common_prefix() to use struct pathspec ...	2013-09-09 14:36:15 -07:00
Junio C Hamano	c48f6816f0	diff: remove "diff-files -q" in a version of Git in a distant future This was inherited from "show-diff -q" that was invented to tell comparison between the index and the working tree to ignore only removals in 2005. These days, it is spelled as "--diff-filter=d". Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-19 15:22:29 -07:00
Junio C Hamano	95a7c546b0	diff: deprecate -q option to diff-files This reimplements the ancient "-q" option to "git diff-files" that was inherited from "show-diff -q" in terms of "--diff-filter=d". We will be deprecating the "-q" option, so let's issue a warning when we do so. Incidentally this also tentatively fixes "git diff --no-index" to honor "-q" and hide deletions; the use will get the same warning. We should remove the support for "-q" in a future version but it is not that urgent. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-19 15:20:47 -07:00
Junio C Hamano	1ecc1cbd3a	diff: preparse --diff-filter string argument Instead of running strchr() on the list of status characters over and over again, parse the --diff-filter option into bitfields and use the bits to see if the change to the filepair matches the status requested. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-17 16:23:34 -07:00
Nguyễn Thái Ngọc Duy	bd1928df1d	remove diff_tree_{setup,release}_paths Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy	64acde94ef	move struct pathspec and related functions to pathspec.[ch] Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:06 -07:00
Junio C Hamano	6c374008b1	diff_opt: track whether flags have been set explicitly The diff_opt infrastructure sets flags based on defaults and command line options. It is impossible to tell whether a flag has been set as a default or on explicit request. Update the structure so that this detection is possible: * Add an extra "opt->touched_flags" that keeps track of all the fields that have been touched by DIFF_OPT_SET and DIFF_OPT_CLR. * You may continue setting the default values to the flags, like commands in the "log" family do in cmd_log_init_defaults(), but after you finished setting the defaults, you clear the touched_flags field; * And then you let the usual callchain call diff_opt_parse(), allowing the opt->flags be set or unset, while keeping track of which bits the user touched; * There is an optional callback "opt->set_default" that is called at the very beginning to let you inspect touched_flags and update opt->flags appropriately, before the remainder of the diffcore machinery is set up, taking the opt->flags value into account. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-05-10 10:24:17 -07:00
Junio C Hamano	abea4dc76a	Merge branch 'mp/diff-algo-config' Add diff.algorithm configuration so that the user does not type "diff --histogram". * mp/diff-algo-config: diff: Introduce --diff-algorithm command line option config: Introduce diff.algorithm variable git-completion.bash: Autocomplete --minimal and --histogram for git-diff	2013-02-17 15:25:52 -08:00
John Keeping	f192223447	diff: add diff_line_prefix function This is a helper function to call the diff output_prefix function and return its value as a C string, allowing us to greatly simplify everywhere that needs to get the output prefix. Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-02-12 11:42:07 -08:00
Michal Privoznik	07924d4d50	diff: Introduce --diff-algorithm command line option Since command line options have higher priority than config file variables and taking previous commit into account, we need a way how to specify myers algorithm on command line. However, inventing `--myers` is not the right answer. We need far more general option, and that is `--diff-algorithm`. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-01-16 09:41:18 -08:00
Nguyễn Thái Ngọc Duy	4914c9629c	Move setup_diff_pager to libgit.a This is used by diff-no-index.c, part of libgit.a while it stays in builtin/diff.c. Move it to diff.c so that we won't get undefined reference if a program that uses libgit.a happens to pull it in. While at it, move check_pager from git.c to pager.c. It makes more sense there and pager.c is also part of libgit.a Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Jeff King <peff@peff.net>	2012-10-29 03:08:30 -04:00
Junio C Hamano	d2aea1371b	diff.c: mark a private file-scope symbol as static Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-09-15 22:58:20 -07:00
Junio C Hamano	3b753148b6	Merge branch 'jk/maint-null-in-trees' We do not want a link to 0{40} object stored anywhere in our objects. * jk/maint-null-in-trees: fsck: detect null sha1 in tree entries do not write null sha1s to on-disk index diff: do not use null sha1 as a sentinel value	2012-08-27 11:54:28 -07:00
Junio C Hamano	9cd33bbc52	Merge branch 'tr/void-diff-setup-done' Remove unnecessary code. * tr/void-diff-setup-done: diff_setup_done(): return void	2012-08-22 11:52:27 -07:00
Thomas Rast	28452655af	diff_setup_done(): return void diff_setup_done() has historically returned an error code, but lost the last nonzero return in `943d5b7` (allow diff.renamelimit to be set regardless of -M/-C, 2006-08-09). The callers were in a pretty confused state: some actually checked for the return code, and some did not. Let it return void, and patch all callers to take this into account. This conveniently also gets rid of a handful of different(!) error messages that could never be triggered anyway. Note that the function can still die(). Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-08-03 12:11:07 -07:00
Jeff King	e54501004a	diff: do not use null sha1 as a sentinel value The diff code represents paths using the diff_filespec struct. This struct has a sha1 to represent the sha1 of the content at that path, as well as a sha1_valid member which indicates whether its sha1 field is actually useful. If sha1_valid is not true, then the filespec represents a working tree file (e.g., for the no-index case, or for when the index is not up-to-date). The diff_filespec is only used internally, though. At the interfaces to the diff subsystem, callers feed the sha1 directly, and we create a diff_filespec from it. It's at that point that we look at the sha1 and decide whether it is valid or not; callers may pass the null sha1 as a sentinel value to indicate that it is not. We should not typically see the null sha1 coming from any other source (e.g., in the index itself, or from a tree). However, a corrupt tree might have a null sha1, which would cause "diff --patch" to accidentally diff the working tree version of a file instead of treating it as a blob. This patch extends the edges of the diff interface to accept a "sha1_valid" flag whenever we accept a sha1, and to use that flag when creating a filespec. In some cases, this means passing the flag through several layers, making the code change larger than would be desirable. One alternative would be to simply die() upon seeing corrupted trees with null sha1s. However, this fix more directly addresses the problem (while bogus sha1s in a tree are probably a bad thing, it is really the sentinel confusion sending us down the wrong code path that is what makes it devastating). And it means that git is more capable of examining and debugging these corrupted trees. For example, you can still "diff --raw" such a tree to find out when the bogus entry was introduced; you just cannot do a "--patch" diff (just as you could not with any other corrupted tree, as we do not have any content to diff). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-07-29 15:04:32 -07:00
Junio C Hamano	73ff8cf784	Merge branch 'lp/diffstat-with-graph' "log --graph" was not very friendly with "--stat" option and its output had line breaks at wrong places. By Lucian Poston (5) and Zbigniew Jędrzejewski-Szmek (2) * lp/diffstat-with-graph: t4052: work around shells unable to set COLUMNS to 1 Prevent graph_width of stat width from falling below min t4052: Test diff-stat output with minimum columns t4052: Adjust --graph --stat output for prefixes Adjust stat width calculations to take --graph output into account Add output_prefix_length to diff_options t4052: test --stat output with --graph	2012-05-02 13:51:59 -07:00
Junio C Hamano	c0599f6993	Merge branch 'jk/diff-no-rename-empty' Forbids rename detection logic from matching two empty files as renames during merge-recursive to prevent mismerges. By Jeff King * jk/diff-no-rename-empty: merge-recursive: don't detect renames of empty files teach diffcore-rename to optionally ignore empty content make is_empty_blob_sha1 available everywhere drop casts from users EMPTY_TREE_SHA1_BIN	2012-04-16 12:41:49 -07:00
Lucian Poston	5e71a84a2d	Add output_prefix_length to diff_options Add output_prefix_length to diff_options. Initialize the value to 0 and only set it when graph.c:diff_output_prefix_callback() is called. Signed-off-by: Lucian Poston <lucian.poston@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-04-16 11:28:30 -07:00
Junio C Hamano	86c340e082	Merge branch 'jc/diff-algo-cleanup' Resurrects the preparatory clean-up patches from another topic that was discarded, as this would give a saner foundation to build on diff.algo configuration option series. * jc/diff-algo-cleanup: xdiff: PATIENCE/HISTOGRAM are not independent option bits xdiff: remove XDL_PATCH_* macros	2012-04-15 22:51:15 -07:00
Jeff King	90d43b0768	teach diffcore-rename to optionally ignore empty content Our rename detection is a heuristic, matching pairs of removed and added files with similar or identical content. It's unlikely to be wrong when there is actual content to compare, and we already take care not to do inexact rename detection when there is not enough content to produce good results. However, we always do exact rename detection, even when the blob is tiny or empty. It's easy to get false positives with an empty blob, simply because it is an obvious content to use as a boilerplate (e.g., when telling git that an empty directory is worth tracking via an empty .gitignore). This patch lets callers specify whether or not they are interested in using empty files as rename sources and destinations. The default is "yes", keeping the original behavior. It works by detecting the empty-blob sha1 for rename sources and destinations. One more flexible alternative would be to allow the caller to specify a minimum size for a blob to be "interesting" for rename detection. But that would catch small boilerplate files, not large ones (e.g., if you had the GPL COPYING file in many directories). A better alternative would be to allow a "-rename" gitattribute to allow boilerplate files to be marked as such. I'll leave the complexity of that solution until such time as somebody actually wants it. The complaints we've seen so far revolve around empty files, so let's start with the simple thing. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-03-23 13:52:49 -07:00
Junio C Hamano	1e4d0875ac	Merge branch 'jc/pickaxe-ignore-case' By Junio C Hamano (2) and Ramsay Jones (1) * jc/pickaxe-ignore-case: ctype.c: Fix a sparse warning pickaxe: allow -i to search in patch case-insensitively grep: use static trans-case table	2012-03-07 12:12:59 -08:00
Junio C Hamano	af050219e4	Merge branch 'zj/diff-stat-dyncol' By Zbigniew Jędrzejewski-Szmek (8) and Junio C Hamano (1) * zj/diff-stat-dyncol: : This breaks tests. Perhaps it is not worth using the decimal-width stuff : for this series, at least initially. diff --stat: add config option to limit graph width diff --stat: enable limiting of the graph part diff --stat: add a test for output with COLUMNS=40 diff --stat: use a maximum of 5/8 for the filename part merge --stat: use the full terminal width log --stat: use the full terminal width show --stat: use the full terminal width diff --stat: use the full terminal width diff --stat: tests for long filenames and big change counts	2012-03-06 14:53:06 -08:00
Zbigniew Jędrzejewski-Szmek	969fe57b84	diff --stat: enable limiting of the graph part A new option --stat-graph-width=<width> can be used to limit the width of the graph part even is more space is available. Up to <width> columns will be used for the graph. If commits changing a lot of lines are displayed in a wide terminal window (200 or more columns), and the +- graph uses the full width, the output can be hard to comfortably scan with a horizontal movement of human eyes. Messages wrapped to about 80 columns would be interspersed with very long +- lines. It makes sense to limit the width of the graph part to a fixed value (e.g. 70 columns), even if more columns are available. Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-03-01 09:15:47 -08:00
Junio C Hamano	accccde483	pickaxe: allow -i to search in patch case-insensitively "git log -S<string>" is a useful way to find the last commit in the codebase that touched the <string>. As it was designed to be used by a porcelain script to dig the history starting from a block of text that appear in the starting commit, it never had to look for anything but an exact match. When used by an end user who wants to look for the last commit that removed a string (e.g. name of a variable) that he vaguely remembers, however, it is useful to support case insensitive match. When given the "--regexp-ignore-case" (or "-i") option, which originally was designed to affect case sensitivity of the search done in the commit log part, e.g. "log --grep", the matches made with -S/-G pickaxe search is done case insensitively now. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-02-28 16:15:29 -08:00
Junio C Hamano	307ab20b33	xdiff: PATIENCE/HISTOGRAM are not independent option bits Because the default Myers, patience and histogram algorithms cannot be in effect at the same time, XDL_PATIENCE_DIFF and XDL_HISTOGRAM_DIFF are not independent bits. Instead of wasting one bit per algorithm, define a few macros to access the few bits they occupy and update the code that access them. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-02-19 15:36:55 -08:00

1 2 3 4 5 ...

334 Commits