git-commit-vandalism

Author	SHA1	Message	Date
Stefan Beller	dfb7728f63	diff.c: move line ending check into emit_hunk_header The emit_hunk_header() function is responsible for assembling a hunk header and calling emit_line() to send the hunk header to the output file. Its only caller fn_out_consume() needs to prepare for a case where the function emits an incomplete line and add the terminating LF. Instead make sure emit_hunk_header() to always send a completed line to emit_line(). Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	f2d2a5def0	diff.c: readability fix We already have dereferenced 'p->two' into a local variable 'two'. Use that. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Junio C Hamano	50f03c6676	Merge branch 'ab/free-and-null' A common pattern to free a piece of memory and assign NULL to the pointer that used to point at it has been replaced with a new FREE_AND_NULL() macro. * ab/free-and-null: *.[ch] refactoring: make use of the FREE_AND_NULL() macro coccinelle: make use of the "expression" FREE_AND_NULL() rule coccinelle: add a rule to make "expression" code use FREE_AND_NULL() coccinelle: make use of the "type" FREE_AND_NULL() rule coccinelle: add a rule to make "type" code use FREE_AND_NULL() git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL	2017-06-24 14:28:41 -07:00
Junio C Hamano	f31d23a399	Merge branch 'bw/config-h' Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file. * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h	2017-06-24 14:28:41 -07:00
Junio C Hamano	5812b3f73b	Merge branch 'bw/ls-files-sans-the-index' Code clean-up. * bw/ls-files-sans-the-index: ls-files: factor out tag calculation ls-files: factor out debug info into a function ls-files: convert show_files to take an index ls-files: convert show_ce_entry to take an index ls-files: convert prune_cache to take an index ls-files: convert ce_excluded to take an index ls-files: convert show_ru_info to take an index ls-files: convert show_other_files to take an index ls-files: convert show_killed_files to take an index ls-files: convert write_eolinfo to take an index ls-files: convert overlay_tree_on_cache to take an index tree: convert read_tree to take an index parameter convert: convert renormalize_buffer to take an index convert: convert convert_to_git to take an index convert: convert convert_to_git_filter_fd to take an index convert: convert crlf_to_git to take an index convert: convert get_cached_convert_stats_ascii to take an index	2017-06-24 14:28:40 -07:00
Junio C Hamano	a6f38c109b	Merge branch 'bw/object-id' Conversion from uchar[20] to struct object_id continues. * bw/object-id: (33 commits) diff: rename diff_fill_sha1_info to diff_fill_oid_info diffcore-rename: use is_empty_blob_oid tree-diff: convert path_appendnew to object_id tree-diff: convert diff_tree_paths to struct object_id tree-diff: convert try_to_follow_renames to struct object_id builtin/diff-tree: cleanup references to sha1 diff-tree: convert diff_tree_sha1 to struct object_id notes-merge: convert write_note_to_worktree to struct object_id notes-merge: convert verify_notes_filepair to struct object_id notes-merge: convert find_notes_merge_pair_ps to struct object_id notes-merge: convert merge_from_diffs to struct object_id notes-merge: convert notes_merge* to struct object_id tree-diff: convert diff_root_tree_sha1 to struct object_id combine-diff: convert find_paths_* to struct object_id combine-diff: convert diff_tree_combined to struct object_id diff: convert diff_flush_patch_id to struct object_id patch-ids: convert to struct object_id diff: finish conversion for prepare_temp_file to struct object_id diff: convert reuse_worktree_file to struct object_id diff: convert fill_filespec to struct object_id ...	2017-06-19 12:38:44 -07:00
Ævar Arnfjörð Bjarmason	6a83d90207	coccinelle: make use of the "type" FREE_AND_NULL() rule Apply the result of the just-added coccinelle rule. This manually excludes a few occurrences, mostly things that resulted in many FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent change. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-16 12:44:03 -07:00
Brandon Williams	b2141fc1d2	config: don't include config.h by default Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-15 12:56:22 -07:00
Junio C Hamano	b9a7d55d93	Merge branch 'nd/fopen-errors' We often try to open a file for reading whose existence is optional, and silently ignore errors from open/fopen; report such errors if they are not due to missing files. * nd/fopen-errors: mingw_fopen: report ENOENT for invalid file names mingw: verify that paths are not mistaken for remote nicknames log: fix memory leak in open_next_file() rerere.c: move error_errno() closer to the source system call print errno when reporting a system call error wrapper.c: make warn_on_inaccessible() static wrapper.c: add and use fopen_or_warn() wrapper.c: add and use warn_on_fopen_errors() config.mak.uname: set FREAD_READS_DIRECTORIES for Darwin, too config.mak.uname: set FREAD_READS_DIRECTORIES for Linux and FreeBSD clone: use xfopen() instead of fopen() use xfopen() in more places git_fopen: fix a sparse 'not declared' warning	2017-06-13 13:47:09 -07:00
Brandon Williams	82b474e025	convert: convert convert_to_git to take an index Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-13 11:40:51 -07:00
Brandon Williams	94e327e973	diff: rename diff_fill_sha1_info to diff_fill_oid_info Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-05 11:23:58 +09:00
Junio C Hamano	583c6a2295	Merge branch 'js/blame-lib' The internal logic used in "git blame" has been libified to make it easier to use by cgit. * js/blame-lib: (29 commits) blame: move entry prepend to libgit blame: move scoreboard setup to libgit blame: move scoreboard-related methods to libgit blame: move fake-commit-related methods to libgit blame: move origin-related methods to libgit blame: move core structures to header blame: create entry prepend function blame: create scoreboard setup function blame: create scoreboard init function blame: rework methods that determine 'final' commit blame: wrap blame_sort and compare_blame_final blame: move progress updates to a scoreboard callback blame: make sanity_check use a callback in scoreboard blame: move no_whole_file_rename flag to scoreboard blame: move xdl_opts flags to scoreboard blame: move show_root flag to scoreboard blame: move reverse flag to scoreboard blame: move contents_from to scoreboard blame: move copy/move thresholds to scoreboard blame: move stat counters to scoreboard ...	2017-06-05 09:18:12 +09:00
Junio C Hamano	53083f8547	Merge branch 'mb/diff-default-to-indent-heuristics' Make the "indent" heuristics the default in "diff" and diff.indentHeuristics configuration variable an escape hatch for those who do no want it. * mb/diff-default-to-indent-heuristics: add--interactive: drop diff.indentHeuristic handling diff: enable indent heuristic by default diff: have the diff-* builtins configure diff before initializing revisions diff: make the indent heuristic part of diff's basic configuration	2017-06-05 09:18:10 +09:00
Brandon Williams	bd25f28876	diff: convert diff_flush_patch_id to struct object_id Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Brandon Williams	74014152be	diff: finish conversion for prepare_temp_file to struct object_id Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Brandon Williams	fb4a1c0dc8	diff: convert reuse_worktree_file to struct object_id Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Brandon Williams	f9704c2d82	diff: convert fill_filespec to struct object_id Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Brandon Williams	94a0097a41	diff: convert diff_change to struct object_id Convert diff_change to take a struct object_id. In addition convert the function pointer type 'change_fn_t' to also take a struct object_id. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Brandon Williams	c26022ea8f	diff: convert diff_addremove to struct object_id Convert diff_addremove to take a struct object_id. In addtion convert the function pointer type 'add_remove_fn_t' to also take a struct object_id. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Junio C Hamano	6b526ced6f	Merge branch 'bc/object-id' Conversion from uchar[20] to struct object_id continues. * bc/object-id: (53 commits) object: convert parse_object* to take struct object_id tree: convert parse_tree_indirect to struct object_id sequencer: convert do_recursive_merge to struct object_id diff-lib: convert do_diff_cache to struct object_id builtin/ls-tree: convert to struct object_id merge: convert checkout_fast_forward to struct object_id sequencer: convert fast_forward_to to struct object_id builtin/ls-files: convert overlay_tree_on_cache to object_id builtin/read-tree: convert to struct object_id sha1_name: convert internals of peel_onion to object_id upload-pack: convert remaining parse_object callers to object_id revision: convert remaining parse_object callers to object_id revision: rename add_pending_sha1 to add_pending_oid http-push: convert process_ls_object and descendants to object_id refs/files-backend: convert many internals to struct object_id refs: convert struct ref_update to use struct object_id ref-filter: convert some static functions to struct object_id Convert struct ref_array_item to struct object_id Convert the verify_pack callback to struct object_id Convert lookup_tag to struct object_id ...	2017-05-29 12:34:43 +09:00
Nguyễn Thái Ngọc Duy	23a9e0712d	use xfopen() in more places xfopen() - provides error details - explains error on reading, or writing, or whatever operation - has l10n support - prints file name in the error Some of these are missing in the places that are replaced with xfopen(), which is a clear win. In some other places, it's just less code (not as clearly a win as the previous case but still is). The only slight regresssion is in remote-testsvn, where we don't report the file class (marks files) in the error messages anymore. But since this is a _test_ svn remote transport, I'm not too concerned. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-26 12:33:55 +09:00
Jeff Smith	3a35cb2ea8	blame: move textconv_object with related functions textconv_object is used in places other than blame.c and should be moved to a more appropriate location. Other textconv related functions are located in diff.c so that seems as good a place as any. Signed-off-by: Jeff Smith <whydoubt@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-24 15:41:50 +09:00
Stefan Beller	33de716387	diff: enable indent heuristic by default The feature was included in v2.11 (released 2016-11-29) and we got no negative feedback. Quite the opposite, all feedback we got was positive. Turn it on by default. Users who dislike the feature can turn it off by setting diff.indentHeuristic (which also configures plumbing commands, see prior patches). The change to t/t4051-diff-function-context.sh is needed because the heuristic shifts the changed hunk in the patch. To get the same result regardless of the heuristic configuration, we modify the test file differently: We insert a completely new line after line 2, instead of simply duplicating it. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Marc Branchaud <marcnarc@xiplink.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-09 12:24:35 +09:00
Marc Branchaud	cf5e77223a	diff: make the indent heuristic part of diff's basic configuration This heuristic was originally introduced as an experimental feature, and therefore part of the UI configuration. But the user often sees diffs generated by plumbing commands like diff-tree. Moving the indent heuristic into diff's basic configuration prepares the way for diff plumbing commands to respect the setting. The heuristic itself merely makes the diffs more aesthetically pleasing, without changing their correctness. Scripts that rely on the diff plumbing commands should not care whether or not the heuristic is employed. Signed-off-by: Marc Branchaud <marcnarc@xiplink.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-09 12:24:34 +09:00
brian m. carlson	569aa376ea	notes-cache: convert to struct object_id Convert as many instances of unsigned char [20] as possible. Update the callers of notes_cache_get and notes_cache_put to use the new interface. Among the functions updated are callers of lookup_commit_reference_gently, which we will soon convert. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
René Genz	5621760f59	fix minor typos Helped-by: Stefan Beller <sbeller@google.com> Signed-off-by: René Genz <liebundartig@freenet.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-01 11:01:52 +09:00
Junio C Hamano	b1081e4004	Merge branch 'bc/object-id' Conversion from unsigned char [40] to struct object_id continues. * bc/object-id: Documentation: update and rename api-sha1-array.txt Rename sha1_array to oid_array Convert sha1_array_for_each_unique and for_each_abbrev to object_id Convert sha1_array_lookup to take struct object_id Convert remaining callers of sha1_array_lookup to object_id Make sha1_array_append take a struct object_id * sha1-array: convert internal storage for struct sha1_array to object_id builtin/pull: convert to struct object_id submodule: convert check_for_new_submodule_commits to object_id sha1_name: convert disambiguate_hint_fn to take object_id sha1_name: convert struct disambiguate_state to object_id test-sha1-array: convert most code to struct object_id parse-options-cb: convert sha1_array_append caller to struct object_id fsck: convert init_skiplist to struct object_id builtin/receive-pack: convert portions to struct object_id builtin/pull: convert portions to struct object_id builtin/diff: convert to struct object_id Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZ Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ Define new hash-size constants for allocating memory	2017-04-19 21:37:13 -07:00
Jeff King	977db6b4bf	diff: avoid fixed-size buffer for patch-ids To generate a patch id, we format the diff header into a fixed-size buffer, and then feed the result to our sha1 computation. The fixed buffer has size '4*PATH_MAX + 20', which in theory accommodates the four filenames plus some extra data. Except: 1. The filenames may not be constrained to PATH_MAX. The static value may not be a real limit on the current filesystem. Moreover, we may compute patch-ids for names stored only in git, without touching the current filesystem at all. 2. The 20 bytes is not nearly enough to cover the extra content we put in the buffer. As a result, the data we feed to the sha1 computation may be truncated, and it's possible that a commit with a very long filename could erroneously collide in the patch-id space with another commit. For instance, if one commit modified "really-long-filename/foo" and another modified "bar" in the same directory. In practice this is unlikely. Because the filenames are repeated, and because there's a single cutoff at the end of the buffer, the offending filename would have to be on the order of four times larger than PATH_MAX. We could fix this by moving to a strbuf. However, we can observe that the purpose of formatting this in the first place is to feed it to git_SHA1_Update(). So instead, let's just feed each part of the formatted string directly. This actually ends up more readable, and we can even factor out some duplicated bits from the various conditional branches. Technically this may change the output of patch-id for very long filenames, but it's not worth making an exception for this in the --stable output. It was a bug, and one that only affected an unlikely set of paths. And anyway, the exact value would have varied from platform to platform depending on the value of PATH_MAX, so there is no "stable" value. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-30 14:58:29 -07:00
brian m. carlson	dc01505f7f	Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ Since we will likely be introducing a new hash function at some point, and that hash function might be longer than 40 hex characters, use the constant GIT_MAX_HEXSZ, which is designed to be suitable for allocations, instead of GIT_SHA1_HEXSZ. This will ease the transition down the line by distinguishing between places where we need to allocate memory suitable for the largest hash from those where we need to handle the current hash. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-26 22:08:21 -07:00
Jeff King	e4da43b1f0	prefix_filename: return newly allocated string The prefix_filename() function returns a pointer to static storage, which makes it easy to use dangerously. We already fixed one buggy caller in hash-object recently, and the calls in apply.c are suspicious (I didn't dig in enough to confirm that there is a bug, but we call the function once in apply_all_patches() and then again indirectly from parse_chunk()). Let's make it harder to get wrong by allocating the return value. For simplicity, we'll do this even when the prefix is empty (and we could just return the original file pointer). That will cause us to allocate sometimes when we wouldn't otherwise need to, but this function isn't called in performance critical code-paths (and it already _might_ allocate on any given call, so a caller that cares about performance is questionable anyway). The downside is that the callers need to remember to free() the result to avoid leaking. Most of them already used xstrdup() on the result, so we know they are OK. The remainder have been converted to use free() as appropriate. I considered retaining a prefix_filename_unsafe() for cases where we know the static lifetime is OK (and handling the cleanup is awkward). This is only a handful of cases, though, and it's not worth the mental energy in worrying about whether the "unsafe" variant is OK to use in any situation. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-21 11:18:41 -07:00
Jeff King	116fb64e43	prefix_filename: drop length parameter This function takes the prefix as a ptr/len pair, but in every caller the length is exactly strlen(ptr). Let's simplify the interface and just take the string. This saves callers specifying it (and in some cases handling a NULL prefix). In a handful of cases we had the length already without calling strlen, so this is technically slower. But it's not likely to matter (after all, if the prefix is non-empty we'll allocate and copy it into a buffer anyway). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-21 11:12:53 -07:00
Junio C Hamano	60f335b87f	Merge branch 'jc/diff-populate-filespec-size-only-fix' "git diff --quiet" relies on the size field in diff_filespec to be correctly populated, but diff_populate_filespec() helper function made an incorrect short-cut when asked only to populate the size field for paths that need to go through convert_to_git() (e.g. CRLF conversion). * jc/diff-populate-filespec-size-only-fix: diff: do not short-cut CHECK_SIZE_ONLY check in diff_populate_filespec()	2017-03-12 23:21:36 -07:00
Junio C Hamano	12426e114b	diff: do not short-cut CHECK_SIZE_ONLY check in diff_populate_filespec() Callers of diff_populate_filespec() can choose to ask only for the size of the blob without grabbing the blob data, and the function, after running lstat() when the filespec points at a working tree file, returns by copying the value in size field of the stat structure into the size field of the filespec when this is the case. However, this short-cut cannot be taken if the contents from the path needs to go through convert_to_git(), whose resulting real blob data may be different from what is in the working tree file. As "git diff --quiet" compares the .size fields of filespec structures to skip content comparison, this bug manifests as a false "there are differences" for a file that needs eol conversion, for example. Reported-by: Mike Crowe <mac@mcrowe.com> Helped-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-02 10:48:06 -08:00
Junio C Hamano	cbf1860d73	Merge branch 'rs/swap' Code clean-up. * rs/swap: graph: use SWAP macro diff: use SWAP macro use SWAP macro apply: use SWAP macro add SWAP macro	2017-02-15 12:54:19 -08:00
Junio C Hamano	e53c7f8731	Merge branch 'jk/log-graph-name-only' "git log --graph" did not work well with "--name-only", even though other forms of "diff" output were handled correctly. * jk/log-graph-name-only: diff: print line prefix for --name-only output	2017-02-10 12:52:27 -08:00
Jeff King	f5022b5fed	diff: print line prefix for --name-only output If you run "git log --graph --name-only", the pathnames are not indented to go along with their matching commits (unlike all of the other diff formats). We need to output the line prefix for each item before writing it. The tests cover both --name-status and --name-only. The former actually gets this right already, because it builds on the --raw format functions. It's only --name-only which uses its own code (and this fix mirrors the code in diff_flush_raw()). Note that the tests don't follow our usual style of setting up the "expect" output inside the test block. This matches the surrounding style, but more importantly it is easier to read: we don't have to worry about embedded single-quotes, and the leading indentation is more obvious. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-02-08 13:39:57 -08:00
René Scharfe	2490574d15	use oid_to_hex_r() for converting struct object_id hashes to hex strings Patch generated by Coccinelle and contrib/coccinelle/object_id.cocci. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-01-30 14:23:40 -08:00
René Scharfe	402bf8e198	diff: use SWAP macro Use the macro SWAP to exchange the value of pairs of variables instead of swapping them manually with the help of a temporary variable. The resulting code is shorter and easier to read. The two cases were not transformed by the semantic patch swap.cocci because it's extra careful and handles only cases where the types of all variables are the same -- and here we swap two ints and use an unsigned temporary variable for that. Nevertheless the conversion is safe, as the value range is preserved with and without the patch. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-01-30 14:23:00 -08:00
René Scharfe	35d803bc9a	use SWAP macro Apply the semantic patch swap.cocci to convert hand-rolled swaps to use the macro SWAP. The resulting code is shorter and easier to read, the object code is effectively unchanged. The patch for object.c had to be hand-edited in order to preserve the comment before the change; Coccinelle tried to eat it for some reason. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-01-30 14:17:00 -08:00
Vegard Nossum	c488867793	diff: add interhunk context config option The --inter-hunk-context= option was added in commit `6d0e674a57` ("diff: add option to show context between close hunks"). This patch allows configuring a default for this option. Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-01-12 12:55:43 -08:00
Junio C Hamano	2ced5f2c2d	Merge branch 'jc/retire-compaction-heuristics' "git diff" and its family had two experimental heuristics to shift the contents of a hunk to make the patch easier to read. One of them turns out to be better than the other, so leave only the "--indent-heuristic" option and remove the other one. * jc/retire-compaction-heuristics: diff: retire "compaction" heuristics	2017-01-10 15:24:27 -08:00
Junio C Hamano	3cde4e02ee	diff: retire "compaction" heuristics When a patch inserts a block of lines, whose last lines are the same as the existing lines that appear before the inserted block, "git diff" can choose any place between these existing lines as the boundary between the pre-context and the added lines (adjusting the end of the inserted block as appropriate) to come up with variants of the same patch, and some variants are easier to read than others. We have been trying to improve the choice of this boundary, and Git 2.11 shipped with an experimental "compaction-heuristic". Since then another attempt to improve the logic further resulted in a new "indent-heuristic" logic. It is agreed that the latter gives better result overall, and the former outlived its usefulness. Retire "compaction", and keep "indent" as an experimental feature. The latter hopefully will be turned on by default in a future release, but that should be done as a separate step. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-12-23 12:32:22 -08:00
Jack Bates	43d1948b7b	diff: handle --no-abbrev in no-index case There are two different places where the --no-abbrev option is parsed, and two different places where SHA-1s are abbreviated. We normally parse --no-abbrev with setup_revisions(), but in the no-index case, "git diff" calls diff_opt_parse() directly, and diff_opt_parse() didn't handle --no-abbrev until now. (It did handle --abbrev, however.) We normally abbreviate SHA-1s with find_unique_abbrev(), but commit `4f03666` ("diff: handle sha1 abbreviations outside of repository, 2016-10-20) recently introduced a special case when you run "git diff" outside of a repository. setup_revisions() does also call diff_opt_parse(), but not for --abbrev or --no-abbrev, which it handles itself. setup_revisions() sets rev_info->abbrev, and later copies that to diff_options->abbrev. It handles --no-abbrev by setting abbrev to zero. (This change doesn't touch that.) Setting abbrev to zero was broken in the outside-of-a-repository special case, which until now resulted in a truly zero-length SHA-1, rather than taking zero to mean do not abbreviate. The only way to trigger this bug, however, was by running "git diff --raw" without either the --abbrev or --no-abbrev options, because 1) without --raw it doesn't respect abbrev (which is bizarre, but has been that way forever), 2) we silently clamp --abbrev=0 to MINIMUM_ABBREV, and 3) --no-abbrev wasn't handled until now. The outside-of-a-repository case is one of three no-index cases. The other two are when one of the files you're comparing is outside of the repository you're in, and the --no-index option. Signed-off-by: Jack Bates <jack@nottheoilrig.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-12-08 14:40:30 -08:00
Junio C Hamano	6d40812e4b	Merge branch 'tk/diffcore-delta-remove-unused' Code cleanup. * tk/diffcore-delta-remove-unused: diffcore-delta: remove unused parameter to diffcore_count_changes()	2016-11-17 13:45:22 -08:00
Tobias Klauser	974e0044d6	diffcore-delta: remove unused parameter to diffcore_count_changes() The delta_limit parameter to diffcore_count_changes() has been unused since commit `ba23bbc8e` ("diffcore-delta: make change counter to byte oriented again.", 2006-03-04). Remove the parameter and adjust all callers. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-11-14 09:24:04 -08:00
Junio C Hamano	650360210a	Merge branch 'nd/ita-empty-commit' When new paths were added by "git add -N" to the index, it was enough to circumvent the check by "git commit" to refrain from making an empty commit without "--allow-empty". The same logic prevented "git status" to show such a path as "new file" in the "Changes not staged for commit" section. * nd/ita-empty-commit: commit: don't be fooled by ita entries when creating initial commit commit: fix empty commit creation when there's no changes but ita entries diff: add --ita-[in]visible-in-index diff-lib: allow ita entries treated as "not yet exist in index"	2016-10-27 14:58:50 -07:00
Junio C Hamano	0d9c527d59	Merge branch 'jk/no-looking-at-dotgit-outside-repo' Update "git diff --no-index" codepath not to try to peek into .git/ directory that happens to be under the current directory, when we know we are operating outside any repository. * jk/no-looking-at-dotgit-outside-repo: diff: handle sha1 abbreviations outside of repository diff_aligned_abbrev: use "struct oid" diff_unique_abbrev: rename to diff_aligned_abbrev find_unique_abbrev: use 4-buffer ring test-*-cache-tree: setup git dir read info/{attributes,exclude} only when in repository	2016-10-27 14:58:48 -07:00
Junio C Hamano	580d820ece	Merge branch 'lt/abbrev-auto' Allow the default abbreviation length, which has historically been 7, to scale as the repository grows. The logic suggests to use 12 hexdigits for the Linux kernel, and 9 to 10 for Git itself. * lt/abbrev-auto: abbrev: auto size the default abbreviation abbrev: prepare for new world order abbrev: add FALLBACK_DEFAULT_ABBREV to prepare for auto sizing	2016-10-27 14:58:47 -07:00
Jeff King	4f03666ac6	diff: handle sha1 abbreviations outside of repository When generating diffs outside a repository (e.g., with "diff --no-index"), we may write abbreviated sha1s as part of "--raw" output or the "index" lines of "--patch" output. Since we have no object database, we never find any collisions, and these sha1s get whatever static abbreviation length is configured (typically 7). However, we do blindly look in ".git/objects" to see if any objects exist, even though we know we are not in a repository. This is usually harmless because such a directory is unlikely to exist, but could be wrong in rare circumstances. Let's instead notice when we are not in a repository and behave as if the object database is empty (i.e., just use the default abbrev length). It would perhaps make sense to be conservative and show full sha1s in that case, but showing the default abbreviation is what we've always done (and is certainly less ugly). Note that this does mean that: cd /not/a/repo GIT_OBJECT_DIRECTORY=/some/real/objdir git diff --no-index ... used to look for collisions in /some/real/objdir but now does not. This could be considered either a bugfix (we do not look at objects if we have no repository) or a regression, but it seems unlikely that anybody would care much either way. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-26 13:30:51 -07:00
Jeff King	d6cece51b8	diff_aligned_abbrev: use "struct oid" Since we're modifying this function anyway, it's a good time to update it to the more modern "struct oid". We can also drop some of the magic numbers in favor of GIT_SHA1_HEXSZ, along with some descriptive comments. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-26 13:30:51 -07:00
Jeff King	d5e3b01e5b	diff_unique_abbrev: rename to diff_aligned_abbrev The word "align" describes how the function actually differs from find_unique_abbrev, and will make it less confusing when we add more diff-specific abbrevation functions that do not do this alignment. Since this is a globally available function, let's also move its descriptive comment to the header file, where we typically document function interfaces. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-26 13:30:51 -07:00
Junio C Hamano	a5ed26702b	Merge branch 'va/i18n' More i18n. * va/i18n: i18n: diff: mark warnings for translation i18n: credential-cache--daemon: mark advice for translation i18n: convert mark error messages for translation i18n: apply: mark error message for translation i18n: apply: mark error messages for translation i18n: apply: mark info messages for translation i18n: apply: mark plural string for translation	2016-10-26 13:14:47 -07:00
Junio C Hamano	e5272d304a	Merge branch 'jc/ws-error-highlight' "git diff/log --ws-error-highlight=<kind>" lacked the corresponding configuration variable to set it by default. * jc/ws-error-highlight: diff: introduce diff.wsErrorHighlight option diff.c: move ws-error-highlight parsing helpers up diff.c: refactor parse_ws_error_highlight() t4015: split out the "setup" part of ws-error-highlight test	2016-10-26 13:14:43 -07:00
Junio C Hamano	c334effa23	Merge branch 'jc/diff-unique-abbrev-comments' A bit more comments in a tricky code. * jc/diff-unique-abbrev-comments: diff_unique_abbrev(): document its assumption and limitation	2016-10-26 13:14:42 -07:00
Nguyễn Thái Ngọc Duy	b42b451919	diff: add --ita-[in]visible-in-index The option --ita-invisible-in-index exposes the "ita_invisible_in_index" diff flag to outside to allow easier experimentation with this new mode. The "plan" is to make --ita-invisible-in-index default to keep consistent behavior with 'status' and 'commit', but a bunch other commands like 'apply', 'merge', 'reset'.... need to be taken into consideration as well. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-24 10:47:51 -07:00
Vasco Almeida	db424979a8	i18n: diff: mark warnings for translation Mark rename_limit_warning and degrade_cc_to_c_warning and rename_limit_warning for translation. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-17 14:51:48 -07:00
Junio C Hamano	b8688adb12	Merge branch 'rs/qsort' We call "qsort(array, nelem, sizeof(array[0]), fn)", and most of the time third parameter is redundant. A new QSORT() macro lets us omit it. * rs/qsort: show-branch: use QSORT use QSORT, part 2 coccicheck: use --all-includes by default remove unnecessary check before QSORT use QSORT add QSORT	2016-10-10 14:03:46 -07:00
Junio C Hamano	f0798e6cdb	Merge branch 'rs/cocci' Code clean-up with help from coccinelle tool continues. * rs/cocci: coccicheck: make transformation for strbuf_addf(sb, "...") more precise use strbuf_add_unique_abbrev() for adding short hashes, part 2 use strbuf_addstr() instead of strbuf_addf() with "%s", part 2 gitignore: ignore output files of coccicheck make target	2016-10-06 14:53:12 -07:00
Junio C Hamano	a17505f262	diff: introduce diff.wsErrorHighlight option With the preparatory steps, it has become trivial to teach the system a new diff.wsErrorHighlight configuration that gives the default value for --ws-error-highlight command line option. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-04 15:49:05 -07:00
Junio C Hamano	0b4b42e7fe	diff.c: move ws-error-highlight parsing helpers up These need to be usable from git_diff_ui_config() code to help parsing a configuration variable, so move them up. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-04 15:49:05 -07:00
Junio C Hamano	077965f84a	diff.c: refactor parse_ws_error_highlight() Rename the function to parse_ws_error_highlight_opt(), because it is meant to parse a command line option, and then refactor the meat of the function into a helper function that reports the parsed result which is typically a small unsigned int (these are OR'ed bitmask after all), or a negative offset that indicates where in the input string a parse error happened. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-04 15:49:05 -07:00
Junio C Hamano	7b5b7721af	abbrev: prepare for new world order The code that sets custom abbreviation length, in response to command line argument, often does something like this: if (skip_prefix(arg, "--abbrev=", &arg)) abbrev = atoi(arg); else if (!strcmp("--abbrev", &arg)) abbrev = DEFAULT_ABBREV; /* make the value sane */ if (abbrev < 0 \|\| 40 < abbrev) abbrev = ... some sane value ... However, it is pointless to sanity-check and tweak the value obtained from DEFAULT_ABBREV. We are going to allow it to be initially set to -1 to signal that the default abbreviation length must be auto sized upon the first request to abbreviate, based on the number of objects in the repository, and when that happens, rejecting or tweaking a negative value to a "saner" one will negatively interfere with the auto sizing. The codepaths for git rev-parse --short <object> git diff --raw --abbrev do exactly that; allow them to pass possibly negative abbrevs intact, that will come from DEFAULT_ABBREV in the future. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-03 12:54:22 -07:00
Junio C Hamano	d709f1fb9d	diff_unique_abbrev(): document its assumption and limitation This function is used to add "..." to displayed object names in "diff --raw --abbrev[=<n>]" output. It bases its behaviour on an untold assumption that the abbreviation length requested by the caller is "reasonble", i.e. most of the objects will abbreviate within the requested length and the resulting length would never exceed it by more than a few hexdigits (otherwise the resulting columns would not align). Explain that in a comment. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-30 18:06:50 -07:00
Junio C Hamano	300e95f7df	Merge branch 'js/regexec-buf' into maint Some codepaths in "git diff" used regexec(3) on a buffer that was mmap(2)ed, which may not have a terminating NUL, leading to a read beyond the end of the mapped region. This was fixed by introducing a regexec_buf() helper that takes a <ptr,len> pair with REG_STARTEND extension. * js/regexec-buf: regex: use regexec_buf() regex: add regexec_buf() that can work on a non NUL-terminated string regex: -G<pattern> feeds a non NUL-terminated string to regexec() and fails	2016-09-29 16:49:45 -07:00
René Scharfe	9ed0d8d6e6	use QSORT Apply the semantic patch contrib/coccinelle/qsort.cocci to the code base, replacing calls of qsort(3) with QSORT. The resulting code is shorter and supports empty arrays with NULL pointers. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-29 15:42:18 -07:00
René Scharfe	f937d78553	use strbuf_add_unique_abbrev() for adding short hashes, part 2 Call strbuf_add_unique_abbrev() to add abbreviated hashes to strbufs instead of taking detours through find_unique_abbrev() and its static buffer. This is shorter and a bit more efficient. `1eb47f167d` already converted six cases, this patch covers three more. A semantic patch for Coccinelle is included for easier checking for new cases that might be introduced in the future. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-27 14:02:40 -07:00
Junio C Hamano	6a67695268	Merge branch 'js/regexec-buf' Some codepaths in "git diff" used regexec(3) on a buffer that was mmap(2)ed, which may not have a terminating NUL, leading to a read beyond the end of the mapped region. This was fixed by introducing a regexec_buf() helper that takes a <ptr,len> pair with REG_STARTEND extension. * js/regexec-buf: regex: use regexec_buf() regex: add regexec_buf() that can work on a non NUL-terminated string regex: -G<pattern> feeds a non NUL-terminated string to regexec() and fails	2016-09-26 16:09:19 -07:00
Junio C Hamano	8969feac7e	Merge branch 'va/i18n-more' Even more i18n. * va/i18n-more: i18n: stash: mark messages for translation i18n: notes-merge: mark die messages for translation i18n: ident: mark hint for translation i18n: i18n: diff: mark die messages for translation i18n: connect: mark die messages for translation i18n: commit: mark message for translation	2016-09-26 16:09:18 -07:00
Junio C Hamano	b7af6ae5cf	Merge branch 'mh/diff-indent-heuristic' Output from "git diff" can be made easier to read by selecting which lines are common and which lines are added/deleted intelligently when the lines before and after the changed section are the same. A command line option is added to help with the experiment to find a good heuristics. * mh/diff-indent-heuristic: blame: honor the diff heuristic options and config parse-options: add parse_opt_unknown_cb() diff: improve positioning of add/delete blocks in diffs xdl_change_compact(): introduce the concept of a change group recs_match(): take two xrecord_t pointers as arguments is_blank_line(): take a single xrecord_t as argument xdl_change_compact(): only use heuristic if group can't be matched xdl_change_compact(): fix compaction heuristic to adjust ixo	2016-09-26 16:09:16 -07:00
Johannes Schindelin	b7d36ffca0	regex: use regexec_buf() The new regexec_buf() function operates on buffers with an explicitly specified length, rather than NUL-terminated strings. We need to use this function whenever the buffer we want to pass to regexec(3) may have been mmap(2)ed (and is hence not NUL-terminated). Note: the original motivation for this patch was to fix a bug where `git diff -G <regex>` would crash. This patch converts more callers, though, some of which allocated to construct NUL-terminated strings, or worse, modified buffers to temporarily insert NULs while calling regexec(3). By converting them to use regexec_buf(), the code has become much cleaner. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-21 13:56:15 -07:00
Jean-Noël AVILA	a2f05c9454	i18n: i18n: diff: mark die messages for translation While marking individual messages for translation, consolidate some messages "option 'foo' requires a value" that is used for many options into one by introducing a helper function to die with the message with the option name embedded in it, and ask the translators to localize that single message instead. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Jean-Noel Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-21 10:18:33 -07:00
Junio C Hamano	4af9a7d344	Merge branch 'bc/object-id' The "unsigned char sha1[20]" to "struct object_id" conversion continues. Notable changes in this round includes that ce->sha1, i.e. the object name recorded in the cache_entry, turns into an object_id. It had merge conflicts with a few topics in flight (Christian's "apply.c split", Dscho's "cat-file --filters" and Jeff Hostetler's "status --porcelain-v2"). Extra sets of eyes double-checking for mismerges are highly appreciated. * bc/object-id: builtin/reset: convert to use struct object_id builtin/commit-tree: convert to struct object_id builtin/am: convert to struct object_id refs: add an update_ref_oid function. sha1_name: convert get_sha1_mb to struct object_id builtin/update-index: convert file to struct object_id notes: convert init_notes to use struct object_id builtin/rm: convert to use struct object_id builtin/blame: convert file to use struct object_id Convert read_mmblob to take struct object_id. notes-merge: convert struct notes_merge_pair to struct object_id builtin/checkout: convert some static functions to struct object_id streaming: make stream_blob_to_fd take struct object_id builtin: convert textconv_object to use struct object_id builtin/cat-file: convert some static functions to struct object_id builtin/cat-file: convert struct expand_data to use struct object_id builtin/log: convert some static functions to use struct object_id builtin/blame: convert struct origin to use struct object_id builtin/apply: convert static functions to struct object_id cache: convert struct cache_entry to use struct object_id	2016-09-19 13:47:19 -07:00
Michael Haggerty	5b162879e9	blame: honor the diff heuristic options and config Teach "git blame" and "git annotate" the --compaction-heuristic and --indent-heuristic options that are now supported by "git diff". Also teach them to honor the `diff.compactionHeuristic` and `diff.indentHeuristic` configuration options. It would be conceivable to introduce separate configuration options for "blame" and "annotate"; for example `blame.compactionHeuristic` and `blame.indentHeuristic`. But it would be confusing to users if blame output is inconsistent with diff output, so it makes more sense for them to respect the same configuration. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-19 10:25:11 -07:00
Michael Haggerty	433860f3d0	diff: improve positioning of add/delete blocks in diffs Some groups of added/deleted lines in diffs can be slid up or down, because lines at the edges of the group are not unique. Picking good shifts for such groups is not a matter of correctness but definitely has a big effect on aesthetics. For example, consider the following two diffs. The first is what standard Git emits: --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl @@ -231,6 +231,9 @@ if (!defined $initial_reply_to && $prompting) { } if (!$smtp_server) { + $smtp_server = $repo->config('sendemail.smtpserver'); +} +if (!$smtp_server) { foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) { if (-x $_) { $smtp_server = $_; The following diff is equivalent, but is obviously preferable from an aesthetic point of view: --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl @@ -230,6 +230,9 @@ if (!defined $initial_reply_to && $prompting) { $initial_reply_to =~ s/(^\s+\|\s+$)//g; } +if (!$smtp_server) { + $smtp_server = $repo->config('sendemail.smtpserver'); +} if (!$smtp_server) { foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) { if (-x $_) { This patch teaches Git to pick better positions for such "diff sliders" using heuristics that take the positions of nearby blank lines and the indentation of nearby lines into account. The existing Git code basically always shifts such "sliders" as far down in the file as possible. The only exception is when the slider can be aligned with a group of changed lines in the other file, in which case Git favors depicting the change as one add+delete block rather than one add and a slightly offset delete block. This naive algorithm often yields ugly diffs. Commit `d634d61ed6` improved the situation somewhat by preferring to position add/delete groups to make their last line a blank line, when that is possible. This heuristic does more good than harm, but (1) it can only help if there are blank lines in the right places, and (2) always picks the last blank line, even if there are others that might be better. The end result is that it makes perhaps 1/3 as many errors as the default Git algorithm, but that still leaves a lot of ugly diffs. This commit implements a new and much better heuristic for picking optimal "slider" positions using the following approach: First observe that each hypothetical positioning of a diff slider introduces two splits: one between the context lines preceding the group and the first added/deleted line, and the other between the last added/deleted line and the first line of context following it. It tries to find the positioning that creates the least bad splits. Splits are evaluated based only on the presence and locations of nearby blank lines, and the indentation of lines near the split. Basically, it prefers to introduce splits adjacent to blank lines, between lines that are indented less, and between lines with the same level of indentation. In more detail: 1. It measures the following characteristics of a proposed splitting position in a `struct split_measurement`: * the number of blank lines above the proposed split * whether the line directly after the split is blank * the number of blank lines following that line * the indentation of the nearest non-blank line above the split * the indentation of the line directly below the split * the indentation of the nearest non-blank line after that line 2. It combines the measured attributes using a bunch of empirically-optimized weighting factors to derive a `struct split_score` that measures the "badness" of splitting the text at that position. 3. It combines the `split_score` for the top and the bottom of the slider at each of its possible positions, and selects the position that has the best `split_score`. I determined the initial set of weighting factors by collecting a corpus of Git histories from 29 open-source software projects in various programming languages. I generated many diffs from this corpus, and determined the best positioning "by eye" for about 6600 diff sliders. I used about half of the repositories in the corpus (corresponding to about 2/3 of the sliders) as a training set, and optimized the weights against this corpus using a crude automated search of the parameter space to get the best agreement with the manually-determined values. Then I tested the resulting heuristic against the full corpus. The results are summarized in the following table, in column `indent-1`: \| repository \| count \| Git 2.9.0 \| compaction \| compaction-fixed \| indent-1 \| indent-2 \| \| --------------------- \| ----- \| -------------- \| -------------- \| ---------------- \| -------------- \| -------------- \| \| afnetworking \| 109 \| 89 (81.7%) \| 37 (33.9%) \| 37 (33.9%) \| 2 (1.8%) \| 2 (1.8%) \| \| alamofire \| 30 \| 18 (60.0%) \| 14 (46.7%) \| 15 (50.0%) \| 0 (0.0%) \| 0 (0.0%) \| \| angular \| 184 \| 127 (69.0%) \| 39 (21.2%) \| 23 (12.5%) \| 5 (2.7%) \| 5 (2.7%) \| \| animate \| 313 \| 2 (0.6%) \| 2 (0.6%) \| 2 (0.6%) \| 2 (0.6%) \| 2 (0.6%) \| \| ant \| 380 \| 356 (93.7%) \| 152 (40.0%) \| 148 (38.9%) \| 15 (3.9%) \| 15 (3.9%) \| * \| bugzilla \| 306 \| 263 (85.9%) \| 109 (35.6%) \| 99 (32.4%) \| 14 (4.6%) \| 15 (4.9%) \| * \| corefx \| 126 \| 91 (72.2%) \| 22 (17.5%) \| 21 (16.7%) \| 6 (4.8%) \| 6 (4.8%) \| \| couchdb \| 78 \| 44 (56.4%) \| 26 (33.3%) \| 28 (35.9%) \| 6 (7.7%) \| 6 (7.7%) \| * \| cpython \| 937 \| 158 (16.9%) \| 50 (5.3%) \| 49 (5.2%) \| 5 (0.5%) \| 5 (0.5%) \| * \| discourse \| 160 \| 95 (59.4%) \| 42 (26.2%) \| 36 (22.5%) \| 18 (11.2%) \| 13 (8.1%) \| \| docker \| 307 \| 194 (63.2%) \| 198 (64.5%) \| 253 (82.4%) \| 8 (2.6%) \| 8 (2.6%) \| * \| electron \| 163 \| 132 (81.0%) \| 38 (23.3%) \| 39 (23.9%) \| 6 (3.7%) \| 6 (3.7%) \| \| git \| 536 \| 470 (87.7%) \| 73 (13.6%) \| 78 (14.6%) \| 16 (3.0%) \| 16 (3.0%) \| * \| gitflow \| 127 \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| \| ionic \| 133 \| 89 (66.9%) \| 29 (21.8%) \| 38 (28.6%) \| 1 (0.8%) \| 1 (0.8%) \| \| ipython \| 482 \| 362 (75.1%) \| 167 (34.6%) \| 169 (35.1%) \| 11 (2.3%) \| 11 (2.3%) \| * \| junit \| 161 \| 147 (91.3%) \| 67 (41.6%) \| 66 (41.0%) \| 1 (0.6%) \| 1 (0.6%) \| * \| lighttable \| 15 \| 5 (33.3%) \| 0 (0.0%) \| 2 (13.3%) \| 0 (0.0%) \| 0 (0.0%) \| \| magit \| 88 \| 75 (85.2%) \| 11 (12.5%) \| 9 (10.2%) \| 1 (1.1%) \| 0 (0.0%) \| \| neural-style \| 28 \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| \| nodejs \| 781 \| 649 (83.1%) \| 118 (15.1%) \| 111 (14.2%) \| 4 (0.5%) \| 5 (0.6%) \| * \| phpmyadmin \| 491 \| 481 (98.0%) \| 75 (15.3%) \| 48 (9.8%) \| 2 (0.4%) \| 2 (0.4%) \| * \| react-native \| 168 \| 130 (77.4%) \| 79 (47.0%) \| 81 (48.2%) \| 0 (0.0%) \| 0 (0.0%) \| \| rust \| 171 \| 128 (74.9%) \| 30 (17.5%) \| 27 (15.8%) \| 16 (9.4%) \| 14 (8.2%) \| \| spark \| 186 \| 149 (80.1%) \| 52 (28.0%) \| 52 (28.0%) \| 2 (1.1%) \| 2 (1.1%) \| \| tensorflow \| 115 \| 66 (57.4%) \| 48 (41.7%) \| 48 (41.7%) \| 5 (4.3%) \| 5 (4.3%) \| \| test-more \| 19 \| 15 (78.9%) \| 2 (10.5%) \| 2 (10.5%) \| 1 (5.3%) \| 1 (5.3%) \| * \| test-unit \| 51 \| 34 (66.7%) \| 14 (27.5%) \| 8 (15.7%) \| 2 (3.9%) \| 2 (3.9%) \| * \| xmonad \| 23 \| 22 (95.7%) \| 2 (8.7%) \| 2 (8.7%) \| 1 (4.3%) \| 1 (4.3%) \| * \| --------------------- \| ----- \| -------------- \| -------------- \| ---------------- \| -------------- \| -------------- \| \| totals \| 6668 \| 4391 (65.9%) \| 1496 (22.4%) \| 1491 (22.4%) \| 150 (2.2%) \| 144 (2.2%) \| \| totals (training set) \| 4552 \| 3195 (70.2%) \| 1053 (23.1%) \| 1061 (23.3%) \| 86 (1.9%) \| 88 (1.9%) \| \| totals (test set) \| 2116 \| 1196 (56.5%) \| 443 (20.9%) \| 430 (20.3%) \| 64 (3.0%) \| 56 (2.6%) \| In this table, the numbers are the count and percentage of human-rated sliders that the corresponding algorithm got wrong. The columns are * "repository" - the name of the repository used. I used the diffs between successive non-merge commits on the HEAD branch of the corresponding repository. * "count" - the number of sliders that were human-rated. I chose most, but not all, sliders to rate from those among which the various algorithms gave different answers. * "Git 2.9.0" - the default algorithm used by `git diff` in Git 2.9.0. * "compaction" - the heuristic used by `git diff --compaction-heuristic` in Git 2.9.0. * "compaction-fixed" - the heuristic used by `git diff --compaction-heuristic` after the fixes from earlier in this patch series. Note that the results are not dramatically different than those for "compaction". Both produce non-ideal diffs only about 1/3 as often as the default `git diff`. * "indent-1" - the new `--indent-heuristic` algorithm, using the first set of weighting factors, determined as described above. * "indent-2" - the new `--indent-heuristic` algorithm, using the final set of weighting factors, determined as described below. * `*` - indicates that repo was part of training set used to determine the first set of weighting factors. The fact that the heuristic performed nearly as well on the test set as on the training set in column "indent-1" is a good indication that the heuristic was not over-trained. Given that fact, I ran a second round of optimization, using the entire corpus as the training set. The resulting set of weights gave the results in column "indent-2". These are the weights included in this patch. The final result gives consistently and significantly better results across the whole corpus than either `git diff` or `git diff --compaction-heuristic`. It makes only about 1/30 as many errors as the former and about 1/10 as many errors as the latter. (And a good fraction of the remaining errors are for diffs that involve weirdly-formatted code, sometimes apparently machine-generated.) The tools that were used to do this optimization and analysis, along with the human-generated data values, are recorded in a separate project [1]. This patch adds a new command-line option `--indent-heuristic`, and a new configuration setting `diff.indentHeuristic`, that activate this heuristic. This interface is only meant for testing purposes, and should be finalized before including this change in any release. [1] https://github.com/mhagger/diff-slider-tools Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-19 10:25:11 -07:00
Junio C Hamano	a0d9b7f015	Merge branch 'sb/diff-cleanup' Code cleanup. * sb/diff-cleanup: diff: remove dead code diff: omit found pointer from emit_callback diff.c: use diff_options directly	2016-09-15 14:11:16 -07:00
Junio C Hamano	305d7f1339	Merge branch 'jk/diff-submodule-diff-inline' The "git diff --submodule={short,log}" mechanism has been enhanced to allow "--submodule=diff" to show the patch between the submodule commits bound to the superproject. * jk/diff-submodule-diff-inline: diff: teach diff to display submodule difference with an inline diff submodule: refactor show_submodule_summary with helper function submodule: convert show_submodule_summary to use struct object_id * allow do_submodule_path to work even if submodule isn't checked out diff: prepare for additional submodule formats graph: add support for --line-prefix on all graph-aware output diff.c: remove output_prefix_length field cache: add empty_tree_oid object and helper function	2016-09-12 15:34:31 -07:00
Stefan Beller	ca9b37e5a8	diff: remove dead code When `len < 1`, len has to be 0 or negative, emit_line will then remove the first character and by then `len` would be negative. As this doesn't happen, it is safe to assume it is dead code. This continues to simplify the code, which was started in `b8d9c1a66b` (2009-09-03, diff.c: the builtin_diff() deals with only two-file comparison). Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-08 13:54:37 -07:00
Stefan Beller	ba16233ccd	diff: omit found pointer from emit_callback We keep the actual data in the diff options, which are just as accessible. Remove the pointer stored in struct emit_callback for readability. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-08 13:54:23 -07:00
Stefan Beller	fb33b62ca6	diff.c: use diff_options directly The value of `ecbdata->opt` is accessible via the short variable `o` already, so let's use that instead. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-08 13:46:46 -07:00
brian m. carlson	99d1a9861a	cache: convert struct cache_entry to use struct object_id Convert struct cache_entry to use struct object_id by applying the following semantic patch and the object_id transforms from contrib, plus the actual change to the struct: @@ struct cache_entry E1; @@ - E1.sha1 + E1.oid.hash @@ struct cache_entry *E1; @@ - E1->sha1 + E1->oid.hash Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-07 12:59:42 -07:00
Jacob Keller	fd47ae6a5b	diff: teach diff to display submodule difference with an inline diff Teach git-diff and friends a new format for displaying the difference of a submodule. The new format is an inline diff of the contents of the submodule between the commit range of the update. This allows the user to see the actual code change caused by a submodule update. Add tests for the new format and option. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-08-31 18:07:10 -07:00
Jacob Keller	602a283afb	submodule: convert show_submodule_summary to use struct object_id * Since we're going to be changing this function in a future patch, lets go ahead and convert this to use object_id now. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-08-31 18:07:10 -07:00
Jacob Keller	61cfbc054d	diff: prepare for additional submodule formats A future patch will add a new format for displaying the difference of a submodule. Make it easier by changing how we store the current selected format. Replace the DIFF_OPT flag with an enumeration, as each format will be mutually exclusive. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-08-31 18:07:09 -07:00
Jacob Keller	660e113ce1	graph: add support for --line-prefix on all graph-aware output Add an extension to git-diff and git-log (and any other graph-aware displayable output) such that "--line-prefix=<string>" will print the additional line-prefix on every line of output. To make this work, we have to fix a few bugs in the graph API that force graph_show_commit_msg to be used only when you have a valid graph. Additionally, we extend the default_diff_output_prefix handler to work even when no graph is enabled. This is somewhat of a hack on top of the graph API, but I think it should be acceptable here. This will be used by a future extension of submodule display which displays the submodule diff as the actual diff between the pre and post commit in the submodule project. Add some tests for both git-log and git-diff to ensure that the prefix is honored correctly. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-08-31 18:07:09 -07:00
Junio C Hamano	cd48dadb8d	diff.c: remove output_prefix_length field "diff/log --stat" has a logic that determines the display columns available for the diffstat part of the output and apportions it for pathnames and diffstat graph automatically. `5e71a84a` (Add output_prefix_length to diff_options, 2012-04-16) added the output_prefix_length field to diff_options structure to allow this logic to subtract the display columns used for the history graph part from the total "terminal width"; this matters when the "git log --graph -p" option is in use. The field must be set to the number of display columns needed to show the output from the output_prefix() callback, which is error prone. As there is only one user of the field, and the user has the actual value of the prefix string, let's get rid of the field and have the user count the display width itself. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-08-31 18:07:08 -07:00
Junio C Hamano	dd610aeda6	Merge branch 'kw/patch-ids-optim' When "git rebase" tries to compare set of changes on the updated upstream and our own branch, it computes patch-id for all of these changes and attempts to find matches. This has been optimized by lazily computing the full patch-id (which is expensive) to be compared only for changes that touch the same set of paths. * kw/patch-ids-optim: rebase: avoid computing unnecessary patch IDs patch-ids: add flag to create the diff patch id using header only data patch-ids: replace the seen indicator with a commit pointer patch-ids: stop using a hand-rolled hashmap implementation	2016-08-12 09:47:39 -07:00
Junio C Hamano	cee6c5b47b	Merge branch 'jk/diff-do-not-reuse-wtf-needs-cleaning' into maint There is an optimization used in "git diff $treeA $treeB" to borrow an already checked-out copy in the working tree when it is known to be the same as the blob being compared, expecting that open/mmap of such a file is faster than reading it from the object store, which involves inflating and applying delta. This however kicked in even when the checked-out copy needs to go through the convert-to-git conversion (including the clean filter), which defeats the whole point of the optimization. The optimization has been disabled when the conversion is necessary. * jk/diff-do-not-reuse-wtf-needs-cleaning: diff: do not reuse worktree files that need "clean" conversion	2016-08-10 11:55:28 -07:00
Junio C Hamano	767da54bf8	Merge branch 'jk/diff-do-not-reuse-wtf-needs-cleaning' There is an optimization used in "git diff $treeA $treeB" to borrow an already checked-out copy in the working tree when it is known to be the same as the blob being compared, expecting that open/mmap of such a file is faster than reading it from the object store, which involves inflating and applying delta. This however kicked in even when the checked-out copy needs to go through the convert-to-git conversion (including the clean filter), which defeats the whole point of the optimization. The optimization has been disabled when the conversion is necessary. * jk/diff-do-not-reuse-wtf-needs-cleaning: diff: do not reuse worktree files that need "clean" conversion	2016-08-03 15:10:29 -07:00
Kevin Willford	3e8e32c32e	patch-ids: add flag to create the diff patch id using header only data This will allow a diff patch id to be created using only the header data so that the contents of the file will not have to be loaded. Signed-off-by: Kevin Willford <kcwillford@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-29 14:10:01 -07:00
Jeff King	06dec439a3	diff: do not reuse worktree files that need "clean" conversion When accessing a blob for a diff, we may try to reuse file contents in the working tree, under the theory that it is faster to mmap those file contents than it would be to extract the content from the object database. When we have to filter those contents, though, that assumption does not hold. Even for our internal conversions like CRLF, we have to allocate and fill a new buffer anyway. But much worse, for external clean filters we have to exec an arbitrary script, and we have no idea how expensive it may be to run. So let's skip this optimization when conversion into git's "clean" form is required. This applies whenever the "want_file" flag is false. When it's true, the caller actually wants the smudged worktree contents, which the reused file by definition already has (in fact, this is a key optimization going the other direction, since reusing the worktree file there lets us skip smudge filters). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-22 12:31:24 -07:00
Junio C Hamano	a63d31b4d3	Merge branch 'bc/cocci' Conversion from unsigned char sha1[20] to struct object_id continues. * bc/cocci: diff: convert prep_temp_blob() to struct object_id merge-recursive: convert merge_recursive_generic() to object_id merge-recursive: convert leaf functions to use struct object_id merge-recursive: convert struct merge_file_info to object_id merge-recursive: convert struct stage_data to use object_id diff: rename struct diff_filespec's sha1_valid member diff: convert struct diff_filespec to struct object_id coccinelle: apply object_id Coccinelle transformations coccinelle: convert hashcpy() with null_sha1 to hashclr() contrib/coccinelle: add basic Coccinelle transforms hex: add oid_to_hex_r()	2016-07-19 13:22:16 -07:00
brian m. carlson	09bdff29e1	diff: convert prep_temp_blob() to struct object_id All of the callers of this function use struct object_id, so convert it to use struct object_id in its arguments and internally. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-28 11:39:02 -07:00
brian m. carlson	41c9560ee5	diff: rename struct diff_filespec's sha1_valid member Now that this struct's sha1 member is called "oid", update the comment and the sha1_valid member to be called "oid_valid" instead. The following Coccinelle semantic patch was used to implement this, followed by the transformations in object_id.cocci: @@ struct diff_filespec o; @@ - o.sha1_valid + o.oid_valid @@ struct diff_filespec *p; @@ - p->sha1_valid + p->oid_valid Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-28 11:39:02 -07:00
brian m. carlson	a0d12c4433	diff: convert struct diff_filespec to struct object_id Convert struct diff_filespec's sha1 member to use a struct object_id called "oid" instead. The following Coccinelle semantic patch was used to implement this, followed by the transformations in object_id.cocci: @@ struct diff_filespec o; @@ - o.sha1 + o.oid.hash @@ struct diff_filespec *p; @@ - p->sha1 + p->oid.hash Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-28 11:39:02 -07:00
brian m. carlson	f449198e58	coccinelle: convert hashcpy() with null_sha1 to hashclr() hashcpy with null_sha1 as the source is equivalent to hashclr. In addition to being simpler, using hashclr may give the compiler a chance to optimize better. Convert instances of hashcpy with the source argument of null_sha1 to hashclr. This transformation was implemented using the following semantic patch: @@ expression E1; @@ -hashcpy(E1, null_sha1); +hashclr(E1); Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-28 11:39:02 -07:00
Johannes Schindelin	afc676f2c9	diff: do not color output when --color=auto and --output=<file> is given "git diff --output=<file> --color=auto" used to show the ANSI color sequence in the resulting file when the standard output is connected to a terminal, because --color=auto check always checks the standard output, not the actual file that receives the output. We could correct this by using freopen(3) to redirect the standard output to the specified file, which is in like with how format-patch used to match the world order, but following the same reasoning as the earlier "format-patch: explicitly switch off color when writing to files", let's be more strict by bypassing the "auto" check when the --output=<file> option is in use. Strictly speaking, this is a backwards-incompatible change, but it is highly unlikely that any user would want to see ANSI color sequences in a file. The reason this was not caught earlier is most likely that either --output=<file> is not used, or only when stdout is redirected anyway. Users can still give --color=always if they want a colored diff in the resulting file. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-28 11:26:47 -07:00
Junio C Hamano	e5f7675544	Merge branch 'jk/diff-compact-heuristic' It turns out that the earlier effort to update the heuristics may want to use a bit more time to mature. Turn it off by default. * jk/diff-compact-heuristic: diff: disable compaction heuristic for now	2016-06-10 15:26:06 -07:00
Junio C Hamano	5580b271af	diff: disable compaction heuristic for now http://lkml.kernel.org/g/20160610075043.GA13411@sigill.intra.peff.net reports that a change to add a new "function" with common ending with the existing one at the end of the file is shown like this: def foo do_foo_stuff() + common_ending() +end + +def bar + do_bar_stuff() + common_ending() end when the new heuristic is in use. In reality, the change is to add the blank line before "def bar" and everything below, which is what the code without the new heuristic shows. Disable the heuristics by default, and resurrect the documentation for the option and the configuration variables, while clearly marking the feature as still experimental. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-10 13:45:23 -07:00
Junio C Hamano	0018da1088	Merge branch 'jk/diff-compact-heuristic' Patch output from "git diff" and friends has been tweaked to be more readable by using a blank line as a strong hint that the contents before and after it belong to a logically separate unit. * jk/diff-compact-heuristic: diff: undocument the compaction heuristic knobs for experimentation xdiff: implement empty line chunk heuristic xdiff: add recs_match helper function	2016-05-06 14:45:46 -07:00
Stefan Beller	d634d61ed6	xdiff: implement empty line chunk heuristic In order to produce the smallest possible diff and combine several diff hunks together, we implement a heuristic from GNU Diff which moves diff hunks forward as far as possible when we find common context above and below a diff hunk. This sometimes produces less readable diffs when writing C, Shell, or other programming languages, ie: ... /* + * + * + / + +/ ... instead of the more readable equivalent of ... +/* + * + * + / + / ... Implement the following heuristic to (optionally) produce the desired output. If there are diff chunks which can be shifted around, shift each hunk such that the last common empty line is below the chunk with the rest of the context above. This heuristic appears to resolve the above example and several other common issues without producing significantly weird results. However, as with any heuristic it is not really known whether this will always be more optimal. Thus, it can be disabled via diff.compactionHeuristic. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-19 10:53:34 -07:00

1 2 3 4 5 ...

1110 Commits