Make the diffstat interface (namely, the diffstat_t struct and
compute_diffstat) no longer be internal to diff.c and allow it to be used
by other parts of git.
This is helpful for code that may want to easily extract information
from files using the diff machinery, while flushing it differently from
how the show_* functions used by diff_flush() do it. One example is the
builtin implementation of git-add--interactive's status.
Signed-off-by: Daniel Ferreira <bnmvco@gmail.com>
Signed-off-by: Slavica Đukić <slawica92@hotmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Preparation for SHA-256 upgrade continues.
* bc/object-id-part17: (26 commits)
midx: switch to using the_hash_algo
builtin/show-index: replace sha1_to_hex
rerere: replace sha1_to_hex
builtin/receive-pack: replace sha1_to_hex
builtin/index-pack: replace sha1_to_hex
packfile: replace sha1_to_hex
wt-status: convert struct wt_status to object_id
cache: remove null_sha1
builtin/worktree: switch null_sha1 to null_oid
builtin/repack: write object IDs of the proper length
pack-write: use hash_to_hex when writing checksums
sequencer: convert to use the_hash_algo
bisect: switch to using the_hash_algo
sha1-lookup: switch hard-coded constants to the_hash_algo
config: use the_hash_algo in abbrev comparison
combine-diff: replace GIT_SHA1_HEXSZ with the_hash_algo
bundle: switch to use the_hash_algo
connected: switch GIT_SHA1_HEXSZ to the_hash_algo
show-index: switch hard-coded constants to the_hash_algo
blame: remove needless comparison with GIT_SHA1_HEXSZ
...
Since these macros already take a `keyvar' pointer of a known type,
we can rely on OFFSETOF_VAR to get the correct offset without
relying on non-portable `__typeof__' and `offsetof'.
Argument order is also rearranged, so `keyvar' and `member' are
sequential as they are used as: `keyvar->member'
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we cannot rely on a `__typeof__' operator being portable
to use with `offsetof'; we can calculate the pointer offset
using an existing pointer and the address of a member using
pointer arithmetic for compilers without `__typeof__'.
This allows us to simplify usage of hashmap iterator macros
by not having to specify a type when a pointer of that type
is already given.
In the future, list iterator macros (e.g. list_for_each_entry)
may also be implemented using OFFSETOF_VAR to save hackers the
trouble of using container_of/list_entry macros and without
relying on non-portable `__typeof__'.
v3: use `__typeof__' to avoid clang warnings
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`hashmap_free_entries' behaves like `container_of' and passes
the offset of the hashmap_entry struct to the internal
`hashmap_free_' function, allowing the function to free any
struct pointer regardless of where the hashmap_entry field
is located.
`hashmap_free' no longer takes any arguments aside from
the hashmap itself.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Another step in eliminating the requirement of hashmap_entry
being the first member of a struct.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using `container_of' can be verbose and choosing names for
intermediate "struct hashmap_entry" pointers is a hard problem.
So introduce "*_entry" APIs inspired by similar linked-list
APIs in the Linux kernel.
Unfortunately, `__typeof__' is not portable C, so we need an
extra parameter to specify the type.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a step towards removing the requirement for
hashmap_entry being the first field of a struct.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is less error-prone than "const void *" as the compiler
now detects invalid types being passed.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is less error-prone than "void *" as the compiler now
detects invalid types being passed.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is less error-prone than "const void *" as the compiler
now detects invalid types being passed.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Otherwise, the hashmap_entry.next field appears to remain
uninitialized, which can lead to problems when
add_lines_to_move_detection calls hashmap_add.
I found this through manual inspection when converting
hashmap_add callers to take "struct hashmap_entry *".
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach the lazy clone machinery that there can be more than one
promisor remote and consult them in order when downloading missing
objects on demand.
* cc/multi-promisor:
Move core_partial_clone_filter_default to promisor-remote.c
Move repository_format_partial_clone to promisor-remote.c
Remove fetch-object.{c,h} in favor of promisor-remote.{c,h}
remote: add promisor and partial clone config to the doc
partial-clone: add multiple remotes in the doc
t0410: test fetching from many promisor remotes
builtin/fetch: remove unique promisor remote limitation
promisor-remote: parse remote.*.partialclonefilter
Use promisor_remote_get_direct() and has_promisor_remote()
promisor-remote: use repository_format_partial_clone
promisor-remote: add promisor_remote_reinit()
promisor-remote: implement promisor_remote_get_direct()
Add initial support for many promisor remotes
fetch-object: make functions return an error code
t0410: remove pipes after git commands
On-demand object fetching in lazy clone incorrectly tried to fetch
commits from submodule projects, while still working in the
superproject, which has been corrected.
* jt/diff-lazy-fetch-submodule-fix:
diff: skip GITLINK when lazy fetching missing objs
In 7fbbcb21b1 ("diff: batch fetching of missing blobs", 2019-04-08),
diff was taught to batch the fetching of missing objects when operating
on a partial clone, but was not taught to refrain from fetching
GITLINKs. Teach diff to check if an object is a GITLINK before including
it in the set to be fetched.
(As stated in the commit message of that commit, unpack-trees was also
taught a similar thing prior, but unpack-trees correctly checks for
GITLINK before including objects in the set to be fetched.)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the two separate patch-id implementations to use the_hash_algo
in their implementation.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The line count in the outer diff's hunk headers of a range diff is not
all that interesting. It merely shows how far along the inner diff
are on both sides. That number is of no use for human readers, and
range-diffs are not meant to be machine readable.
In a subsequent commit we're going to add some more contextual
information such as the filename corresponding to the diff to the hunk
headers. Remove the unnecessary information, and just keep the "@@"
to indicate that a new hunk of the outer diff is starting.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running an external diff from, say, a diff tool, it is safe to
assume that we want to write the files in question. On Windows, that
means that there cannot be any other process holding an open handle to
said files, or even just a mapped region.
So let's make sure that `git diff` itself is not holding any open handle
to the files in question.
In fact, we will just release the file pair right away, as the external
diff uses the files we just wrote, so we do not need to hold the file
contents in memory anymore.
This fixes https://github.com/git-for-windows/git/issues/1315
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using the repository_format_partial_clone global
and fetch_objects() directly, let's use has_promisor_remote()
and promisor_remote_get_direct().
This way all the configured promisor remotes will be taken
into account, not only the one specified by
extensions.partialClone.
Also when cloning or fetching using a partial clone filter,
remote.origin.promisor will be set to "true" instead of
setting extensions.partialClone to "origin". This makes it
possible to use many promisor remote just by fetching from
them.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "--base" option of "format-patch" computed the patch-ids for
prerequisite patches in an unstable way, which has been updated to
compute in a way that is compatible with "git patch-id --stable".
* sb/format-patch-base-patch-id-fix:
format-patch: make --base patch-id output stable
format-patch: inform user that patch-id generation is unstable
Fix two typos introduced by the following commits:
+ 31fba9d3b4 (diff-parseopt: convert --[src|dst]-prefix, 2019-03-24)
+ ed8b4132c8 (remote-curl: mark all error messages for translation,
2019-03-05)
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A brown-paper-bag bugfix to a change already in 'master'.
* nd/diff-parseopt:
parse-options: check empty value in OPT_INTEGER and OPT_ABBREV
diff-parseopt: restore -U (no argument) behavior
diff-parseopt: correct variable types that are used by parseopt
Before d473e2e0e8 (diff.c: convert -U|--unified, 2019-01-27), -U and
--unified are implemented with a custom parser opt_arg() in diff.c. I
didn't check this code carefully and not realize that it's the
equivalent of PARSE_OPT_NONEG | PARSE_OPT_OPTARG.
In other words, if -U is specified without any argument, the option
should be accepted, and the default value should be used. Without
PARSE_OPT_OPTARG, parse_options() will reject this case and cause a
regression.
Reported-by: Bryan Turner <bturner@atlassian.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We weren't flushing the context each time we processed a hunk in the
patch-id generation code in diff.c, but we were doing that when we
generated "stable" patch-ids with the 'patch-id' tool. Let's port that
similar logic over from patch-id.c into diff.c so we can get the same
hash when we're generating patch-ids for 'format-patch --base=' types of
command invocations.
Cc: Xiaolong Ye <xiaolong.ye@intel.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While running "git diff" in a lazy clone, we can upfront know which
missing blobs we will need, instead of waiting for the on-demand
machinery to discover them one by one. Aim to achieve better
performance by batching the request for these promised blobs.
* jt/batch-fetch-blobs-in-diff:
diff: batch fetching of missing blobs
sha1-file: support OBJECT_INFO_FOR_PREFETCH
When running a command like "git show" or "git diff" in a partial clone,
batch all missing blobs to be fetched as one request.
This is similar to c0c578b33c ("unpack-trees: batch fetching of missing
blobs", 2017-12-08), but for another command.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff_opt_parse() is a heavy hammer to just set diff filter. But it's
the only way because of the diff_status_letters[] mapping. Add a new
API to set diff filter and use it in git-am. diff_opt_parse()'s only
remaining call site in revision.c will be gone soon and having it here
just because of git-am does not make sense.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This option is added in commit b73bcbac4a (diff: allow
--no-color-moved-ws - 2018-11-23) in pw/diff-color-moved-ws-fix. To ease
merge conflict resolution, re-implement the option handling here so that
the conflict could be resolved by taking this side of change.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark one more string for translation while at there
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
OPT__ABBREV() has the same behavior as the deleted code with one
difference: it does check for valid number and error out if not. And the
'40' change is self explanatory.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While at it, mark one more string for translation
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While at it, mark one more string for translation.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark one more string for translation while at there.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The command line completion (in contrib/) has been taught to
complete more subcommand parameters.
* nd/completion-more-parameters:
completion: add more parameter value completion
Code clean-up.
* jk/unused-params:
ref-filter: drop unused "sz" parameters
ref-filter: drop unused "obj" parameters
ref-filter: drop unused buf/sz pairs
files-backend: drop refs parameter from split_symref_update()
pack-objects: drop unused parameter from oe_map_new_pack()
merge-recursive: drop several unused parameters
diff: drop complete_rewrite parameter from run_external_diff()
diff: drop unused emit data parameter from sane_truncate_line()
diff: drop unused color reset parameters
diff: drop options parameter from diffcore_fix_diff_index()
The diff machinery, one of the oldest parts of the system, which
long predates the parse-options API, uses fairly long and complex
handcrafted option parser. This is being rewritten to use the
parse-options API.
* nd/diff-parseopt:
diff.c: convert --raw
diff.c: convert -W|--[no-]function-context
diff.c: convert -U|--unified
diff.c: convert -u|-p|--patch
diff.c: prepare to use parse_options() for parsing
diff.h: avoid bit fields in struct diff_flags
diff.h: keep forward struct declarations sorted
parse-options: allow ll_callback with OPTION_CALLBACK
parse-options: avoid magic return codes
parse-options: stop abusing 'callback' for lowlevel callbacks
parse-options: add OPT_BITOP()
parse-options: disable option abbreviation with PARSE_OPT_KEEP_UNKNOWN
parse-options: add one-shot mode
parse-options.h: remove extern on function prototypes
For --rename-empty, see 90d43b0768 (teach diffcore-rename to
optionally ignore empty content - 2012-03-22) for more information.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
--no-find-copies-harder is also added on purpose (because I don't see
why we should not have the --no- version for this)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This also validates that the user specifies a single character in
--output-indicator-*, not a string.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds value completion for a couple more paramters. To make it
easier to maintain these hard coded lists, add a comment at the original
list/code to remind people to update git-completion.bash too.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our builtin_diff() wants to know whether break-detection found a
complete rewrite, because it changes how the diff is shown. However,
when calling out to an external diff, we don't pass this information
along (and doing so would require designing a new interface to the
user-provided program).
Let's drop the unused parameter to make this fact more clear.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We pass the "struct emit_callback" (which contains all of the context
for our diff) into sane_truncate_line(), but that function doesn't
actually use it. In theory we might eventually develop a diff option
that impacts this, but in the meantime let's not mislead anybody reading
the code. Since the function is static, it would be easy to pass it
again if it should ever become useful.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several of the emit_* functions take a "reset" color parameter, but
never actually look at it (instead, they call into emit_diff_symbol,
which handles the colors itself). Let's drop these unused parameters.
Note that emit_line() does still take a color/reset pair, and actually
uses it. It cannot be refactored to match these other functions because
it's the thing that emit_diff_symbol eventually calls into (i.e., it
does not by itself know which colors to use, and must be told by the
caller).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sole purpose of this function is to fix the sorting order of the
queued diff entries. It doesn't need to know about any diff options, so
we can drop the unused parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff --color-moved --cc --stat -p" did not work well due to
funny interaction between a bug in color-moved and the rest, which
has been fixed.
* jk/diff-cc-stat-fixes:
combine-diff: treat --dirstat like --stat
combine-diff: treat --summary like --stat
combine-diff: treat --shortstat like --stat
combine-diff: factor out stat-format mask
diff: clear emitted_symbols flag after use
t4006: resurrect commented-out tests
There were many places the code relied on the string returned from
getenv() to be non-volatile, which is not true, that have been
corrected.
* jk/save-getenv-result:
builtin_diff(): read $GIT_DIFF_OPTS closer to use
merge-recursive: copy $GITHEAD strings
init: make a copy of $GIT_DIR string
config: make a copy of $GIT_CONFIG string
commit: copy saved getenv() result
get_super_prefix(): copy getenv() result
The code to drive GIT_EXTERNAL_DIFF command relied on the string
returned from getenv() to be non-volatile, which is not true, that
has been corrected.
* kg/external-diff-save-env:
diff: ensure correct lifetime of external_diff_cmd
This is a preparation step to start using parse_options() to parse
diff/revision options instead of what we have now. There are a couple
of good things from using parse_options():
- better help usage
- easier to add new options
- better completion support
- help usage generation
- better integration with main command option parser. We can just
concat the main command's option array and diffopt's together and
parse all in one go.
- detect colidding options (e.g. --reverse is used by revision code,
so diff code can't use it as long name for -R)
- consistent syntax, e.g. option that takes mandatory argument will
now accept both "--option=value" and "--option value".
The plan is migrate all diff/rev options to parse_options(). Then we
could get rid of diff_opt_parse() and expose parseopts[] directly to
the caller.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's an odd bug when "log --color-moved" is used with the combination
of "--cc --stat -p": the stat for merge commits is erroneously shown
with the diff of the _next_ commit.
The included test demonstrates the issue. Our history looks something
like this:
A-B-M--D
\ /
C
When we run "git log --cc --stat -p --color-moved" starting at D, we get
this sequence of events:
1. The diff for D is using -p, so diff_flush() calls into
diff_flush_patch_all_file_pairs(). There we see that o->color_moved
is in effect, so we point o->emitted_symbols to a static local
struct, causing diff_flush_patch() to queue the symbols instead of
actually writing them out.
We then do our move detection, emit the symbols, and clear the
struct. But we leave o->emitted_symbols pointing to our struct.
2. Next we compute the diff for M. This is a merge, so we use the
combined diff code. In find_paths_generic(), we compute the
pairwise diff between each commit and its parent. Normally this is
done with DIFF_FORMAT_NO_OUTPUT, since we're just looking for
intersecting paths. But since "--stat --cc" shows the first-parent
stat, and since we're computing that diff anyway, we enable
DIFF_FORMAT_DIFFSTAT for the first parent. This outputs the stat
information immediately, saving us from running a separate
first-parent diff later.
But where does that output go? Normally it goes directly to stdout,
but because o->emitted_symbols is set, we queue it. As a result, we
don't actually print the diffstat for the merge commit (yet), which
is wrong.
3. Next we compute the diff for C. We're actually showing a patch
again, so we end up in diff_flush_patch_all_file_pairs(), but this
time we have the queued stat from step 2 waiting in our struct.
We add new elements to it for C's diff, and then flush the whole
thing. And we see the diffstat from M as part of C's diff, which is
wrong.
So triggering the bug really does require the combination of all of
those options.
To fix it, we can simply restore o->emitted_symbols to NULL after
flushing it, so that it does not affect anything outside of
diff_flush_patch_all_file_pairs(). This intuitively makes sense, since
nobody outside of that function is going to bother flushing it, so we
would not want them to write to it either.
In fact, we could take this a step further and turn the local "esm"
struct into a non-static variable that goes away after the function
ends. However, since it contains a dynamically sized array, we benefit
from amortizing the cost of allocations over many calls. So we'll leave
it as static to retain that benefit.
But let's push the zero-ing of esm.nr into the conditional for "if
(o->emitted_symbols)" to make it clear that we do not expect esm to hold
any values if we did not just try to use it. With the code as it is
written now, if we did encounter such a case (which I think would be a
bug), we'd silently leak those values without even bothering to display
them. With this change, we'd at least eventually show them, and somebody
would notice.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The value returned by getenv() is not guaranteed to remain valid across
other environment function calls. But in between our call and using the
value, we run fill_textconv(), which may do quite a bit of work,
including spawning sub-processes.
We can make this safer by calling getenv() right before we actually look
at its value.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to getenv(3)'s notes:
The implementation of getenv() is not required to be reentrant. The
string pointed to by the return value of getenv() may be statically
allocated, and can be modified by a subsequent call to getenv(),
putenv(3), setenv(3), or unsetenv(3).
Since strings returned by getenv() are allowed to change on subsequent
calls to getenv(), make sure to duplicate when caching external_diff_cmd
from environment.
This problem becomes apparent on Git for Windows since fe21c6b285
(mingw: reencode environment variables on the fly (UTF-16 <-> UTF-8)),
when the getenv() implementation provided in compat/mingw.c was changed
to keep a certain amount of alloc'ed strings and freeing them on
subsequent calls.
This fixes https://github.com/git-for-windows/git/issues/2007:
$ yes n | git -c difftool.prompt=yes difftool fe21c6b285 fe21c6b285df~100
Viewing (1/404): '.gitignore'
Launch 'bc3' [Y/n]?
Viewing (2/404): 'Documentation/.gitignore'
Launch 'bc3' [Y/n]?
Viewing (3/404): 'Documentation/Makefile'
Launch 'bc3' [Y/n]?
Viewing (4/404): 'Documentation/RelNotes/2.14.5.txt'
Launch 'bc3' [Y/n]?
Viewing (5/404): 'Documentation/RelNotes/2.15.3.txt'
Launch 'bc3' [Y/n]?
Viewing (6/404): 'Documentation/RelNotes/2.16.5.txt'
Launch 'bc3' [Y/n]?
Viewing (7/404): 'Documentation/RelNotes/2.17.2.txt'
Launch 'bc3' [Y/n]?
Viewing (8/404): 'Documentation/RelNotes/2.18.1.txt'
Launch 'bc3' [Y/n]?
Viewing (9/404): 'Documentation/RelNotes/2.19.0.txt'
Launch 'bc3' [Y/n]? error: cannot spawn ¦?: No such file or directory
fatal: external diff died, stopping at Documentation/RelNotes/2.19.1.txt
Signed-off-by: Kim Gybels <kgybels@infogroep.be>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using --color-moved-ws=allow-indentation-change allow lines with
the same indentation change to be grouped across blank lines. For now
this only works if the blank lines have been moved as well, not for
blocks that have just had their indentation changed.
This completes the changes to the implementation of
--color-moved=allow-indentation-change. Running
git diff --color-moved=allow-indentation-change v2.18.0 v2.19.0
now takes 5.0s. This is a saving of 41% from 8.5s for the optimized
version of the previous implementation and 66% from the original which
took 14.6s.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently diff --color-moved-ws=allow-indentation-change does not
support indentation that contains a mix of tabs and spaces. For
example in commit 546f70f377 ("convert.h: drop 'extern' from function
declaration", 2018-06-30) the function parameters in the following
lines are not colored as moved [1].
-extern int stream_filter(struct stream_filter *,
- const char *input, size_t *isize_p,
- char *output, size_t *osize_p);
+int stream_filter(struct stream_filter *,
+ const char *input, size_t *isize_p,
+ char *output, size_t *osize_p);
This commit changes the way the indentation is handled to track the
visual size of the indentation rather than the characters in the
indentation. This has the benefit that any whitespace errors do not
interfer with the move detection (the whitespace errors will still be
highlighted according to --ws-error-highlight). During the discussion
of this feature there were concerns about the correct detection of
indentation for python. However those concerns apply whether or not
we're detecting moved lines so no attempt is made to determine if the
indentation is 'pythonic'.
[1] Note that before the commit to fix the erroneous coloring of moved
lines each line was colored as a different block, since that commit
they are uncolored.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running
git diff --color-moved-ws=allow-indentation-change v2.18.0 v2.19.0
cmp_in_block_with_wsd() is called 694908327 times. Of those 42.7%
return after comparing a and b. By comparing the lengths first we can
return early in all but 0.03% of those cases without dereferencing the
string pointers. The comparison between a and c fails in 6.8% of
calls, by comparing the lengths first we reject all the failing calls
without dereferencing the string pointers.
This reduces the time to run the command above by by 42% from 14.6s to
8.5s. This is still much slower than the normal --color-moved which
takes ~0.6-0.7s to run but is a significant improvement.
The next commits will replace the current implementation with one that
works with mixed tabs and spaces in the indentation. I think it is
worth optimizing the current implementation first to enable a fair
comparison between the two implementations.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently when using --color-moved=zebra the color of moved blocks
depends on the number of lines separating them. This means that adding
an odd number of unmoved lines between blocks that are already separated
by one or more unmoved lines will change the color of subsequent moved
blocks. This does not make much sense as the blocks were already
separated by unmoved lines and causes problems when adding lines to test
cases.
Fix this by only using the alternate colors for adjacent moved blocks.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'diff --color-moved-ws=allow-indentation-change' can color lines as
moved when they are in fact different. For example in commit
1a07e59c3e ("Update messages in preparation for i18n", 2018-07-21) the
lines
- die (_("must end with a color"));
+ die(_("must end with a color"));
are colored as moved even though they are different.
This is because if there is a fuzzy match for the first line of
a potential moved block the line is marked as moved before the
potential match is checked to see if it actually matches. The fix is
to delay marking the line as moved until after we have checked that
there really is at least one matching potential moved block.
Note that the test modified in the last commit still fails because
adding an unmoved line between two moved blocks that are already
separated by unmoved lines changes the color of the block following the
addition. This should not be the case and will be fixed in the next
commit.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow --no-color-moved-ws and --color-moved-ws=no to cancel any previous
--color-moved-ws option.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the messages and documentation use 'whitespace' rather than
'white space' or 'white spaces' convert to latter two to the former for
consistency.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Under certain circumstances, "git diff D:/a/b/c D:/a/b/d" on
Windows would strip initial parts from the paths because they
were not recognized as absolute, which has been corrected.
* js/diff-notice-has-drive-prefix:
diff: don't attempt to strip prefix from absolute Windows paths
Code preparation to replace ulong vars with size_t vars where
appropriate.
* tb/print-size-t-with-uintmax-format:
Upcast size_t variables to uintmax_t when printing
This changes the error handling for the options --color-moved-ws
and --color-moved-ws to be like the rest of the options.
Move the die() call out of parse_color_moved_ws into the parsing
of command line options. As the function returns a bit field, change
its signature to return an unsigned instead of an int; add a new bit
to signal errors. Once the error is signaled, we discard the other
bits, such that it doesn't matter if the error bit overlaps with any
other bit.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The interface into "xdiff" library used to discover the offset and
size of a generated patch hunk by first formatting it into the
textual hunk header "@@ -n,m +k,l @@" and then parsing the numbers
out. A new interface has been introduced to allow callers a more
direct access to them.
* jk/xdiff-interface:
xdiff-interface: drop parse_hunk_header()
range-diff: use a hunk callback
diff: convert --check to use a hunk callback
combine-diff: use an xdiff hunk callback
diff: use hunk callback for word-diff
diff: discard hunk headers for patch-ids earlier
diff: avoid generating unused hunk header lines
xdiff-interface: provide a separate consume callback for hunks
xdiff: provide a separate emit callback for hunks
When printing variables which contain a size, today "unsigned long"
is used at many places.
In order to be able to change the type from "unsigned long" into size_t
some day in the future, we need to have a way to print 64 bit variables
on a system that has "unsigned long" defined to be 32 bit, like Win64.
Upcast all those variables into uintmax_t before they are printed.
This is to prepare for a bigger change, when "unsigned long"
will be converted into size_t for variables which may be > 4Gib.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "diff --check" code needs to know the line number on which each hunk
starts in order to generate its output. We get that now by parsing the
hunk header line generated by xdiff, but it's much simpler to just pass
it directly using a hunk callback.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our word-diff does not look at the -/+ lines generated by xdiff at all
(because they are not real lines to show the user, but just the
tokenized words split into lines). Instead we use the line numbers from
the hunk headers to index our own data structure.
As a result, our xdi_diff_outf() callback throws away all lines except
hunk headers. We can instead use a hunk callback, which has two
benefits:
1. We don't have to re-parse the generated hunk header line, but can
use the passed parameters directly.
2. By setting our line callback to NULL, we can tell xdiff-interface
that it does not even need to bother generating the other lines,
saving a small amount of work.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not include hunk header lines when computing patch-ids, since
the line numbers would create false negatives. Rather than detect and
skip them in our line callback, we can simply tell xdiff to avoid
generating them.
This is similar to the previous commit, but split out because it
actually requires modifying the matching line callback.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some callers of xdi_diff_outf() do not look at the generated hunk header
lines at all. By plugging in a no-op hunk callback, this tells xdiff not
to even bother formatting them.
This patch introduces a stock no-op callback and uses it with a few
callers whose line callbacks explicitly ignore hunk headers (because
they look only for +/- lines).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit taught xdiff to optionally provide the hunk header
data to a specialized callback. But most users of xdiff actually use our
more convenient xdi_diff_outf() helper, which ensures that our callbacks
are always fed whole lines.
Let's plumb the special hunk-callback through this interface, too. It
will follow the same rule as xdiff when the hunk callback is NULL (i.e.,
continue to pass a stringified hunk header to the line callback). Since
we add NULL to each caller, there should be no behavior change yet.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Under certain circumstances, "git diff D:/a/b/c D:/a/b/d" on
Windows would strip initial parts from the paths because they
were not recognized as absolute, which has been corrected.
* js/diff-notice-has-drive-prefix:
diff: don't attempt to strip prefix from absolute Windows paths
git diff can be invoked with absolute paths. Typically, this triggers
the --no-index case. Then the absolute paths remain in the file names
that are printed in the output.
There is one peculiarity, though: When the command is invoked from a
a sub-directory in a repository, then it is attempted to strip the
sub-directory from the beginning of relative paths. Yet, to detect a
relative path the code just checks for an initial forward slash.
This mistakes a Windows style path like "D:/base" as a relative path
and the output looks like this, for example:
D:\dir\test\one>git -P diff --numstat D:\dir\base D:\dir\diff
1 1 ir/{base => diff}/1.txt
where the correct output should be
D:\dir\test\one>git -P diff --numstat D:\dir\base D:\dir\diff
1 1 D:/dir/{base => diff}/1.txt
If the sub-directory where 'git diff' is invoked is sufficiently deep
that the prefix becomes longer than the path to be printed, then the
subsequent code accesses the path out of bounds.
Use is_absolute_path() to detect Windows style absolute paths.
One might wonder whether the check for a directory separator that
is visible in the patch context should be changed from == '/' to
is_dir_sep() or not. It turns out not to be necessary. That code
only ever investigates paths that have undergone pathspec
normalization, after which there are only forward slashes even on
Windows.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Various codepaths in the core-ish part learn to work on an
arbitrary in-core index structure, not necessarily the default
instance "the_index".
* nd/the-index: (23 commits)
revision.c: reduce implicit dependency the_repository
revision.c: remove implicit dependency on the_index
ws.c: remove implicit dependency on the_index
tree-diff.c: remove implicit dependency on the_index
submodule.c: remove implicit dependency on the_index
line-range.c: remove implicit dependency on the_index
userdiff.c: remove implicit dependency on the_index
rerere.c: remove implicit dependency on the_index
sha1-file.c: remove implicit dependency on the_index
patch-ids.c: remove implicit dependency on the_index
merge.c: remove implicit dependency on the_index
merge-blobs.c: remove implicit dependency on the_index
ll-merge.c: remove implicit dependency on the_index
diff-lib.c: remove implicit dependency on the_index
read-cache.c: remove implicit dependency on the_index
diff.c: remove implicit dependency on the_index
grep.c: remove implicit dependency on the_index
diff.c: remove the_index dependency in textconv() functions
blame.c: rename "repo" argument to "r"
combine-diff.c: remove implicit dependency on the_index
...
Instead of passing the sign directly to emit_line_ws_markup, pass only the
index to lookup the sign in diff_options->output_indicators.
Signed-off-by: Stefan Beller <sbeller@google.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Free the hashmap items as well as the hashmap itself. This was found
with asan.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is obvious in retrospect, it was found with asan.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Don't duplicate the indentation string if we're not going to use it.
This was found with asan.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When adjusting the start of the string to take account of the change
in indentation the code was not checking that the string being
adjusted was in fact longer than the indentation change. This was
detected by asan.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running
git diff --color-moved-ws=allow-indentation-change v2.18.0 v2.19.0
results in a crash due to a double free. This happens when two
potential moved blocks start with consecutive lines. As
pmb_advance_or_null_multi_match() advances it copies the ws_delta from
the last matching line to the next. When the first of our consecutive
lines is advanced its ws_delta well be copied to the second,
overwriting the ws_delta of the block containing the second line. Then
when the second line is advanced it will copy the new ws_delta to the
line below it and so on. Eventually one of these blocks will stop
matching and the ws_delta will be freed. From then on the other block
is in a use-after-free state and when it stops matching it will try to
free the ws_delta that has already been freed by the other block.
The solution is to store the ws_delta in the array of potential moved
blocks rather than with the lines. This means that it no longer needs
to be copied around and one block cannot overwrite the ws_delta of
another. Additionally it saves some malloc/free calls as we don't keep
allocating and freeing ws_deltas.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
[jc: squashed in missing forward decl in userdiff.h found by Ramsay]
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new variant repo_diff_setup() is added that takes 'struct repository *'
and diff_setup() becomes a thin macro around it that is protected by
NO_THE_REPOSITORY_COMPATIBILITY_MACROS, similar to NO_THE_INDEX_....
The plan is these macros will always be defined for all library files
and the macros are only accessible in builtin/
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff and textconv code has so widespread use that it's hard to simply
update their api and all call sites at once because it would result in
a big patch. For now reduce the_index references to two places:
diff_setup() and fill_textconv().
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
spatch transformation to replace boolean uses of !hashcmp() to
newly introduced oideq() is added, and applied, to regain
performance lost due to support of multiple hash algorithms.
* jk/cocci:
show_dirstat: simplify same-content check
read-cache: use oideq() in ce_compare functions
convert hashmap comparison functions to oideq()
convert "hashcmp() != 0" to "!hasheq()"
convert "oidcmp() != 0" to "!oideq()"
convert "hashcmp() == 0" to hasheq()
convert "oidcmp() == 0" to oideq()
introduce hasheq() and oideq()
coccinelle: use <...> for function exclusion
The color output support for recently introduced "range-diff"
command got tweaked a bit.
* sb/range-diff-colors:
range-diff: indent special lines as context
range-diff: make use of different output indicators
diff.c: add --output-indicator-{new, old, context}
diff.c: rewrite emit_line_0 more understandably
diff.c: omit check for line prefix in emit_line_0
diff: use emit_line_0 once per line
diff.c: add set_sign to emit_line_0
diff.c: reorder arguments for emit_line_ws_markup
diff.c: simplify caller of emit_line_0
t3206: add color test for range-diff --dual-color
test_decode_color: understand FAINT and ITALIC
If there is more than one potential moved block and the longest block
is not the first element of the array of potential blocks then the
block is cut short. With --color-moved=blocks this can leave moved
lines unpainted if the shortened block does not meet the block length
requirement. With --color-moved=zebra then in addition to the
unpainted lines the moved color can change in the middle of a single
block.
Fix this by freeing the whitespace delta of the match we're discarding
rather than the one we're keeping.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Acked-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use two nested conditionals to store a content_changed
variable, but only bother to look at the result once,
directly after we set it. We can drop the variable entirely
and just use a single "if".
This needless complexity is the result of 2ff3a80334 (Teach
--dirstat not to completely ignore rearranged lines within a
file, 2011-04-11). Before that, we held onto the
content_changed variable much longer.
While we're touching the condition, we can swap out oidcmp()
for !oideq(). Our coccinelle patches didn't previously find
this case because of the intermediate variable, but now it's
a simple boolean in a conditional.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the flip side of the previous two patches: checking
for a non-zero oidcmp() can be more strictly expressed as
inequality. Like those patches, we write "!= 0" in the
coccinelle transformation, which covers by isomorphism the
more common:
if (oidcmp(E1, E2))
As with the previous two patches, this patch can be achieved
almost entirely by running "make coccicheck"; the only
differences are manual line-wrap fixes to match the original
code.
There is one thing to note for anybody replicating this,
though: coccinelle 1.0.4 seems to miss the case in
builtin/tag.c, even though it's basically the same as all
the others. Running with 1.0.7 does catch this, so
presumably it's just a coccinelle bug that was fixed in the
interim.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using the more restrictive oideq() should, in the long run,
give the compiler more opportunities to optimize these
callsites. For now, this conversion should be a complete
noop with respect to the generated code.
The result is also perhaps a little more readable, as it
avoids the "zero is equal" idiom. Since it's so prevalent in
C, I think seasoned programmers tend not to even notice it
anymore, but it can sometimes make for awkward double
negations (e.g., we can drop a few !!oidcmp() instances
here).
This patch was generated almost entirely by the included
coccinelle patch. This mechanical conversion should be
completely safe, because we check explicitly for cases where
oidcmp() is compared to 0, which is what oideq() is doing
under the hood. Note that we don't have to catch "!oidcmp()"
separately; coccinelle's standard isomorphisms make sure the
two are treated equivalently.
I say "almost" because I did hand-edit the coccinelle output
to fix up a few style violations (it mostly keeps the
original formatting, but sometimes unwraps long lines).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This will prove useful in range-diff in a later patch as we will be able to
differentiate between adding a new file (that line is starting with +++
and then the file name) and regular new lines.
It could also be useful for experimentation in new patch formats, i.e.
we could teach git to emit moved lines with lines other than +/-.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git tbdiff" that lets us compare individual patches in two
iterations of a topic has been rewritten and made into a built-in
command.
* js/range-diff: (21 commits)
range-diff: use dim/bold cues to improve dual color mode
range-diff: make --dual-color the default mode
range-diff: left-pad patch numbers
completion: support `git range-diff`
range-diff: populate the man page
range-diff --dual-color: skip white-space warnings
range-diff: offer to dual-color the diffs
diff: add an internal option to dual-color diffs of diffs
color: add the meta color GIT_COLOR_REVERSE
range-diff: use color for the commit pairs
range-diff: add tests
range-diff: do not show "function names" in hunk headers
range-diff: adjust the output of the commit pairs
range-diff: suppress the diff headers
range-diff: indent the diffs just like tbdiff
range-diff: right-trim commit messages
range-diff: also show the diff between patches
range-diff: improve the order of the shown commits
range-diff: first rudimentary implementation
Introduce `range-diff` to compare iterations of a topic branch
...
The more library-ish parts of the codebase learned to work on the
in-core index-state instance that is passed in by their callers,
instead of always working on the singleton "the_index" instance.
* nd/no-the-index: (24 commits)
blame.c: remove implicit dependency on the_index
apply.c: remove implicit dependency on the_index
apply.c: make init_apply_state() take a struct repository
apply.c: pass struct apply_state to more functions
resolve-undo.c: use the right index instead of the_index
archive-*.c: use the right repository
archive.c: avoid access to the_index
grep: use the right index instead of the_index
attr: remove index from git_attr_set_direction()
entry.c: use the right index instead of the_index
submodule.c: use the right index instead of the_index
pathspec.c: use the right index instead of the_index
unpack-trees: avoid the_index in verify_absent()
unpack-trees: convert clear_ce_flags* to avoid the_index
unpack-trees: don't shadow global var the_index
unpack-trees: add a note about path invalidation
unpack-trees: remove 'extern' on function declaration
ls-files: correct index argument to get_convert_attr_ascii()
preload-index.c: use the right index instead of the_index
dir.c: remove an implicit dependency on the_index in pathspec code
...
Many more strings are prepared for l10n.
* nd/i18n: (23 commits)
transport-helper.c: mark more strings for translation
transport.c: mark more strings for translation
sha1-file.c: mark more strings for translation
sequencer.c: mark more strings for translation
replace-object.c: mark more strings for translation
refspec.c: mark more strings for translation
refs.c: mark more strings for translation
pkt-line.c: mark more strings for translation
object.c: mark more strings for translation
exec-cmd.c: mark more strings for translation
environment.c: mark more strings for translation
dir.c: mark more strings for translation
convert.c: mark more strings for translation
connect.c: mark more strings for translation
config.c: mark more strings for translation
commit-graph.c: mark more strings for translation
builtin/replace.c: mark more strings for translation
builtin/pack-objects.c: mark more strings for translation
builtin/grep.c: mark strings for translation
builtin/config.c: mark more strings for translation
...
One of the "diff --color-moved" mode "dimmed_zebra" that was named
in an unusual way has been deprecated and replaced by
"dimmed-zebra".
* es/diff-color-moved-fix:
diff: --color-moved: rename "dimmed_zebra" to "dimmed-zebra"
Rewrite emit_line_0 to have fewer (nested) conditions.
The change in 'emit_line' makes sure that 'first' is never user data,
but always under our control, a sign or special character in the
beginning of the line (or 0, in which case we ignore it).
So from now on, let's pass only a diff marker or 0 as the 'first'
character of the line.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As the previous patch made sure we only call emit_line_0 once per line,
we do not need the work around introduced in f7c3b4e2d8 (diff: add an
internal option to dual-color diffs of diffs, 2018-08-13) that would ensure
we'd emit 'diff_line_prefix(o)' just once per line.
By having just one call of emit_line_0 per line, the checks are dead code.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All lines that use emit_line_0 multiple times per line, are combined
into a single call to emit_line_0, making use of the 'set' argument.
We gain a little efficiency here, as we can omit emission of color and
accompanying reset if 'len == 0'.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split the meaning of the `set` parameter that is passed to
emit_line_0()` to separate between the color of the "sign" (i.e.
the diff marker '+', '-' or ' ' that is passed in as the `first`
parameter) and the color of the rest of the line.
This changes the meaning of the `set` parameter to no longer refer
to the color of the diff marker, but instead to refer to the color
of the rest of the line. A value of `NULL` indicates that the rest
of the line wants to be colored the same as the diff marker.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The order shall be all colors first, then the content, flags at the end.
The colors are in the order of occurrence, i.e. first the color for the
sign and then the color for the rest of the line.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Due to the previous condition we know "set_sign != NULL" at that point.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the convert API take an index_state instead of assuming the_index
in convert.c. All external call sites are converted blindly to keep
the patch simple and retain current behavior. Individual call sites
may receive further updates to use the right index instead of
the_index.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This code is only needed for diff-tree (since f0c6b2a2fd ([PATCH]
Optimize diff-tree -[CM] --stdin - 2005-05-27)). Let the caller do the
preparation instead and avoid read_index() in diff.c code.
read_index() should be avoided (in addition to the_index) because it
uses get_index_file() underneath to get the path $GIT_DIR/index. This
effectively pulls the_repository in and may become the only reason to
pull a 'struct repository *' in diff.c. Let's keep the dependencies as
few as possible and kick it back to diff-tree.c
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It *is* a confusing thing to look at a diff of diffs. All too easy is it
to mix up whether the -/+ markers refer to the "inner" or the "outer"
diff, i.e. whether a `+` indicates that a line was added by either the
old or the new diff (or both), or whether the new diff does something
different than the old diff.
To make things easier to process for normal developers, we introduced
the dual color mode which colors the lines according to the commit diff,
i.e. lines that are added by a commit (whether old, new, or both) are
colored in green. In non-dual color mode, the lines would be colored
according to the outer diff: if the old commit added a line, it would be
colored red (because that line addition is only present in the first
commit range that was specified on the command-line, i.e. the "old"
commit, but not in the second commit range, i.e. the "new" commit).
However, this dual color mode is still not making things clear enough,
as we are looking at two levels of diffs, and we still only pick a color
according to *one* of them (the outer diff marker is colored
differently, of course, but in particular with deep indentation, it is
easy to lose track of that outer diff marker's background color).
Therefore, let's add another dimension to the mix. Still use
green/red/normal according to the commit diffs, but now also dim the
lines that were only in the old commit, and use bold face for the lines
that are only in the new commit.
That way, it is much easier not to lose track of, say, when we are
looking at a line that was added in the previous iteration of a patch
series but the new iteration adds a slightly different version: the
obsolete change will be dimmed, the current version of the patch will be
bold.
At least this developer has a much easier time reading the range-diffs
that way.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When displaying a diff of diffs, it is possible that there is an outer
`+` before a context line. That happens when the context changed between
old and new commit. When that context line starts with a tab (after the
space that marks it as context line), our diff machinery spits out a
white-space error (space before tab), but in this case, that is
incorrect.
Rather than adding a specific whitespace flag that specifically ignores
the first space in the output (and might miss other problems with the
white-space warnings), let's just skip handling white-space errors in
dual color mode to begin with.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When diffing diffs, it can be quite daunting to figure out what the heck
is going on, as there are nested +/- signs.
Let's make this easier by adding a flag in diff_options that allows
color-coding the outer diff sign with inverted colors, so that the
preimage and postimage is colored like the diff it is.
Of course, this really only makes sense when the preimage and postimage
*are* diffs. So let's not expose this flag via a command-line option for
now.
This is a feature that was invented by git-tbdiff, and it will be used
by `git range-diff` in the next commit, by offering it via a new option:
`--dual-color`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When showing the diff between corresponding patches of the two branch
versions, we have to make up a fake filename to run the diff machinery.
That filename does not carry any meaningful information, hence tbdiff
suppresses it. So we should, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff --color-moved" feature has further been tweaked.
* sb/diff-color-move-more:
diff.c: offer config option to control ws handling in move detection
diff.c: add white space mode to move detection that allows indent changes
diff.c: factor advance_or_nullify out of mark_color_as_moved
diff.c: decouple white space treatment from move detection algorithm
diff.c: add a blocks mode for moved code detection
diff.c: adjust hash function signature to match hashmap expectation
diff.c: do not pass diff options as keydata to hashmap
t4015: avoid git as a pipe input
xdiff/xdiffi.c: remove unneeded function declarations
xdiff/xdiff.h: remove unused flags
Conversion from uchar[40] to struct object_id continues.
* bc/object-id:
pretty: switch hard-coded constants to the_hash_algo
sha1-file: convert constants to uses of the_hash_algo
log-tree: switch GIT_SHA1_HEXSZ to the_hash_algo->hexsz
diff: switch GIT_SHA1_HEXSZ to use the_hash_algo
builtin/merge-recursive: make hash independent
builtin/merge: switch to use the_hash_algo
builtin/fmt-merge-msg: make hash independent
builtin/update-index: simplify parsing of cacheinfo
builtin/update-index: convert to using the_hash_algo
refs/files-backend: use the_hash_algo for writing refs
sha1-name: use the_hash_algo when parsing object names
strbuf: allocate space with GIT_MAX_HEXSZ
commit: express tree entry constants in terms of the_hash_algo
hex: switch to using the_hash_algo
tree-walk: replace hard-coded constants with the_hash_algo
cache: update object ID functions for the_hash_algo
The --color-moved "dimmed_zebra" mode (with an underscore) is an
anachronism. Most options and modes are hyphenated. It is more difficult
to type and somewhat more difficult to read than those which are
hyphenated. Therefore, rename it to "dimmed-zebra", and nominally
deprecate "dimmed_zebra".
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many messages will be marked for translation in the following
commits. This commit updates some of them to be more consistent and
reduce diff noise in those commits. Changes are
- keep the first letter of die(), error() and warning() in lowercase
- no full stop in die(), error() or warning() if it's single sentence
messages
- indentation
- some messages are turned to BUG(), or prefixed with "BUG:" and will
not be marked for i18n
- some messages are improved to give more information
- some messages are broken down by sentence to be i18n friendly
(on the same token, combine multiple warning() into one big string)
- the trailing \n is converted to printf_ln if possible, or deleted
if not redundant
- errno_errno() is used instead of explicit strerror()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The option of --color-moved has proven to be useful as observed on the
mailing list. However when refactoring sometimes the indentation changes,
for example when partitioning a functions into smaller helper functions
the code usually mostly moved around except for a decrease in indentation.
To just review the moved code ignoring the change in indentation, a mode
to ignore spaces in the move detection as implemented in a previous patch
would be enough. However the whole move coloring as motivated in commit
2e2d5ac (diff.c: color moved lines differently, 2017-06-30), brought
up the notion of the reviewer being able to trust the move of a "block".
As there are languages such as python, which depend on proper relative
indentation for the control flow of the program, ignoring any white space
change in a block would not uphold the promises of 2e2d5ac that allows
reviewers to pay less attention to the inside of a block, as inside
the reviewer wants to assume the same program flow.
This new mode of white space ignorance will take this into account and will
only allow the same white space changes per line in each block. This patch
even allows only for the same change at the beginning of the lines.
As this is a white space mode, it is made exclusive to other white space
modes in the move detection.
This patch brings some challenges, related to the detection of blocks.
We need a wide net to catch the possible moved lines, but then need to
narrow down to check if the blocks are still intact. Consider this
example (ignoring block sizes):
- A
- B
- C
+ A
+ B
+ C
At the beginning of a block when checking if there is a counterpart
for A, we have to ignore all space changes. However at the following
lines we have to check if the indent change stayed the same.
Checking if the indentation change did stay the same, is done by computing
the indentation change by the difference in line length, and then assume
the change is only in the beginning of the longer line, the common tail
is the same. That is why the test contains lines like:
- <TAB> A
...
+ A <TAB>
...
As the first line starting a block is caught using a compare function that
ignores white spaces unlike the rest of the block, where the white space
delta is taken into account for the comparison, we also have to think about
the following situation:
- A
- B
- A
- B
+ A
+ B
+ A
+ B
When checking if the first A (both in the + and - lines) is a start of
a block, we have to check all 'A' and record all the white space deltas
such that we can find the example above to be just one block that is
indented.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The conversion to pass "the_repository" and then "a_repository"
throughout the object access API continues.
* sb/object-store-grafts:
commit: allow lookup_commit_graft to handle arbitrary repositories
commit: allow prepare_commit_graft to handle arbitrary repositories
shallow: migrate shallow information into the object parser
path.c: migrate global git_path_* to take a repository argument
cache: convert get_graft_file to handle arbitrary repositories
commit: convert read_graft_file to handle arbitrary repositories
commit: convert register_commit_graft to handle arbitrary repositories
commit: convert commit_graft_pos() to handle arbitrary repositories
shallow: add repository argument to is_repository_shallow
shallow: add repository argument to check_shallow_file_for_update
shallow: add repository argument to register_shallow
shallow: add repository argument to set_alternate_shallow_file
commit: add repository argument to lookup_commit_graft
commit: add repository argument to prepare_commit_graft
commit: add repository argument to read_graft_file
commit: add repository argument to register_commit_graft
commit: add repository argument to commit_graft_pos
object: move grafts to object parser
object-store: move object access functions to object-store.h
This moves the part of code that checks if we're still in a block
into its own function. We'll need a different approach on advancing
the blocks in a later patch, so having it as a separate function will
prove useful.
While at it rename the variable `p` to `prev` to indicate that it refers
to the previous line. This is as pmb[i] was assigned in the last iteration
of the outmost for loop.
Further rename `pnext` to `cur` to indicate that this should match up with
the current line of the outmost for loop.
Also replace the advancement of pmb[i] to reuse `cur` instead of
using `p->next` (which is how the name for pnext could be explained.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the original implementation of the move detection logic the choice for
ignoring white space changes is the same for the move detection as it is
for the regular diff. Some cases came up where different treatment would
have been nice.
Allow the user to specify that white space should be ignored differently
during detection of moved lines than during generation of added and removed
lines. This is done by providing analogs to the --ignore-space-at-eol,
-b, and -w options by introducing the option --color-moved-ws=<modes>
with the modes named "ignore-space-at-eol", "ignore-space-change" and
"ignore-all-space", which is used only during the move detection phase.
As we change the default, we'll adjust the tests.
For now we do not infer any options to treat white spaces in the move
detection from the generic white space options given to diff.
This can be tuned later to reasonable default.
As we plan on adding more white space related options in a later patch,
that interferes with the current white space options, use a flag field
and clamp it down to XDF_WHITESPACE_FLAGS, as that (a) allows to easily
check at parse time if we give invalid combinations and (b) can reuse
parts of this patch.
By having the white space treatment in its own option, we'll also
make it easier for a later patch to have an config option for
spaces in the move detection.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new "blocks" mode provides a middle ground between plain and zebra.
It is as intuitive (few colors) as plain, but still has the requirement
for a minimum of lines/characters to count a block as moved.
Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
(https://public-inbox.org/git/87o9j0uljo.fsf@evledraar.gmail.com/)
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we initialize the hashmap, we give it a pointer to the
diff_options, which it then passes along to each call of the
hashmap_cmp_fn function. There's no need to pass it a second time as
the "keydata" parameter, and our comparison functions never look at
keydata.
This was a mistake left over from an earlier round of 2e2d5ac184
(diff.c: color moved lines differently, 2017-06-30), before hashmap
learned to pass the data pointer for us.
Explanation-by: Jeff King <peff@peff.net>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continuing with the idea to programatically enumerate various
pieces of data required for command line completion, teach the
codebase to report the list of configuration variables
subcommands care about to help complete them.
* nd/complete-config-vars:
completion: complete general config vars in two steps
log-tree: allow to customize 'grafted' color
completion: support case-insensitive config vars
completion: keep other config var completion in camelCase
completion: drop the hard coded list of config vars
am: move advice.amWorkDir parsing back to advice.c
advice: keep config name in camelCase in advice_config[]
fsck: produce camelCase config key names
help: add --config to list all available config
fsck: factor out msg_id_info[] lazy initialization code
grep: keep all colors in an array
Add and use generic name->id mapping code for color slot parsing
Conversion from uchar[20] to struct object_id continues.
* bc/object-id: (42 commits)
merge-one-file: compute empty blob object ID
add--interactive: compute the empty tree value
Update shell scripts to compute empty tree object ID
sha1_file: only expose empty object constants through git_hash_algo
dir: use the_hash_algo for empty blob object ID
sequencer: use the_hash_algo for empty tree object ID
cache-tree: use is_empty_tree_oid
sha1_file: convert cached object code to struct object_id
builtin/reset: convert use of EMPTY_TREE_SHA1_BIN
builtin/receive-pack: convert one use of EMPTY_TREE_SHA1_HEX
wt-status: convert two uses of EMPTY_TREE_SHA1_HEX
submodule: convert several uses of EMPTY_TREE_SHA1_HEX
sequencer: convert one use of EMPTY_TREE_SHA1_HEX
merge: convert empty tree constant to the_hash_algo
builtin/merge: switch tree functions to use object_id
builtin/am: convert uses of EMPTY_TREE_SHA1_BIN to the_hash_algo
sha1-file: add functions for hex empty tree and blob OIDs
builtin/receive-pack: avoid hard-coded constants for push certs
diff: specify abbreviation size in terms of the_hash_algo
upload-pack: replace use of several hard-coded constants
...
Developer support update, by using BUG() macro instead of die() to
mark codepaths that should not happen more clearly.
* js/use-bug-macro:
BUG_exit_code: fix sparse "symbol not declared" warning
Convert remaining die*(BUG) messages
Replace all die("BUG: ...") calls by BUG() ones
run-command: use BUG() to report bugs, not die()
test-tool: help verifying BUG() code paths
With merge.renames configuration set to false, the recursive merge
strategy can be told not to spend cycles trying to find renamed
paths and merge them accordingly.
* bp/merge-rename-config:
merge: pass aggressive when rename detection is turned off
merge: add merge.renames config setting
merge: update documentation for {merge,diff}.renameLimit
Sometimes it helps to list all available config vars so the user can
search for something they want. The config man page can also be used
but it's harder to search if you want to focus on the variable name,
for example.
This is not the best way to collect the available config since it's
not precise. Ideally we should have a centralized list of config in C
code (pretty much like 'struct option'), but that's a lot more work.
This will do for now.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of hard coding the name-to-id mapping in C code, keep it in an
array and use a common function to do the parsing. This reduces code
and also allows us to list all possible color slots later.
This starts using C99 designated initializers more for convenience
(the first designated initializers have been introduced in builtin/clean.c
for some time without complaints)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should make these functions easier to find and cache.h less
overwhelming to read.
In particular, this moves:
- read_object_file
- oid_object_info
- write_object_file
As a result, most of the codebase needs to #include object-store.h.
In this patch the #include is only added to files that would fail to
compile otherwise. It would be better to #include wherever
identifiers from the header are used. That can happen later
when we have better tooling for it.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the ability to control rename detection for merge via a config setting.
This setting behaves the same and defaults to the value of diff.renames but only
applies to merge.
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In d8193743e0 (usage.c: add BUG() function, 2017-05-12), a new macro
was introduced to use for reporting bugs instead of die(). It was then
subsequently used to convert one single caller in 588a538ae5
(setup_git_env: convert die("BUG") to BUG(), 2017-05-12).
The cover letter of the patch series containing this patch
(cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not
terribly clear why only one call site was converted, or what the plan
is for other, similar calls to die() to report bugs.
Let's just convert all remaining ones in one fell swoop.
This trick was performed by this invocation:
sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>