Commit Graph

18 Commits

Author SHA1 Message Date
Jeff King
12a58f9014 xdiff: drop unused mmfile parameters from xdl_do_patience_diff()
The entry point to the patience-diff algorithm takes two mmfile_t
structs with the original file contents, but it doesn't actually do
anything useful with them. This is similar to the case recently cleaned
up in the histogram code via f1d019071e (xdiff: drop unused mmfile
parameters from xdl_do_histogram_diff(), 2022-08-19), but there's a bit
more subtlety going on.

We pass them into the recursive patience_diff(), which in turn passes
them into fill_hashmap(), which stuffs the pointers into a struct. But
the only thing which reads the struct fields is our recursion into
patience_diff()!

So it's unlikely that something like -Wunused-parameter could find this
case: it would have to detect the circular dependency caused by the
recursion (not to mention tracing across struct field assignments).

But once found, it's easy to have the compiler confirm what's going on:

  1. Drop the "file1" and "file2" fields from the hashmap struct
     definition. Remove the assignments in fill_hashmap(), and
     temporarily substitute NULL in the recursive call to
     patience_diff(). Compiling shows that no other code touched those
     fields.

  2. Now fill_hashmap() will trigger -Wunused-parameter. Drop "file1"
     and "file2" from its definition and callsite.

  3. Now patience_diff() will trigger -Wunused-parameter. Drop them
     there, too. One of the callsites is the recursion with our
     NULL values, so those temporary values go away.

  4. Now xdl_do_patience_diff() will trigger -Wunused-parameter. Drop
     them there. And we're done.

Suggested-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-20 14:14:55 -07:00
Phillip Wood
848fd5ae5b xdiff: introduce XDL_CALLOC_ARRAY()
Add a helper for allocating an array and initialize the elements to
zero. This is analogous to CALLOC_ARRAY() in the rest of the codebase
but it returns NULL on allocation failures rather than dying to
accommodate other users of libxdiff such as libgit2.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-08 09:34:30 -07:00
Phillip Wood
18aae7e21e xdiff: introduce xdl_calloc
In preparation for introducing XDL_CALLOC_ARRAY() use calloc() to
obtain zeroed out memory rather than malloc() followed by memset(). To
try and keep the lines a reasonable length this commit also stops
casting the pointer returned by calloc() as this is unnecessary.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-08 09:34:30 -07:00
Phillip Wood
abf04bdaa8 xdiff: introduce XDL_ALLOC_ARRAY()
Add a helper to allocate an array that automatically calculates the
allocation size. This is analogous to ALLOC_ARRAY() in the rest of the
codebase but returns NULL if the allocation fails to accommodate other
users of libxdiff such as libgit2. The helper will also return NULL if
the multiplication in the allocation calculation overflows.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-08 09:34:30 -07:00
Phillip Wood
61f883965f xdiff: handle allocation failure in patience diff
Other users of libxdiff such as libgit2 need to be able to handle
allocation failures. As NULL is a valid return value the function
signature is changed to be able report allocation failures.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16 10:58:13 -08:00
Phillip Wood
9df0fc3d57 xdiff: fix a memory leak
Although the patience and histogram algorithms initialize the
environment they do not free it if there is an error. In contrast for
the Myers algorithm the environment is initalized in xdl_do_diff() and
it is freed if there is an error. Fix this by always initializing the
environment in xdl_do_diff() and freeing it there if there is an
error. Remove the comment in do_patience_diff() about the environment
being freed by xdl_diff() as it is not accurate because (a) xdl_diff()
does not do that if there is an error and (b) xdl_diff() is not the
only caller.

Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16 10:58:05 -08:00
Junio C Hamano
daffa8961b Merge branch 'pw/patience-diff-clean-up'
Code clean-up.

* pw/patience-diff-clean-up:
  patience diff: remove unused variable
  patience diff: remove unnecessary string comparisons
2021-05-14 08:26:08 +09:00
Phillip Wood
f91371b948 patience diff: remove unused variable
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 18:56:48 +09:00
Phillip Wood
204aa2d24d patience diff: remove unnecessary string comparisons
xdl_prepare_env() calls xdl_classify_record() which arranges for the
hashes of non-matching lines to be different so lines can be tested
for equality by comparing just their hashes.

This reduces the time taken to calculate the diff of v2.28.0 to
v2.29.0 by ~3-4%.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 18:56:48 +09:00
Michał Kępień
ec7967cfaf merge-base, xdiff: zero out xpparam_t structures
xpparam_t structures are usually zero-initialized before their specific
fields are assigned to, but there are three locations in the tree where
that does not happen.  Add the missing memset() calls in order to make
initialization of xpparam_t structures consistent tree-wide and to
prevent stack garbage from being used as field values.

Signed-off-by: Michał Kępień <michal@isc.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-20 12:53:26 -07:00
Carlo Marcelo Arenas Belón
81ed2b405c xdiff: remove duplicate headers from xpatience.c
92b7de93fb (Implement the patience diff algorithm, 2009-01-07) added them
but were already part of xinclude.h

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-28 21:51:22 -07:00
Jonathan Tan
2477ab2ea8 diff: support anchoring line(s)
Teach diff a new algorithm, one that attempts to prevent user-specified
lines from appearing as a deletion or addition in the end result. The
end user can use this by specifying "--anchored=<text>" one or more
times when using Git commands like "diff" and "show".

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-28 10:40:04 +09:00
Todd Zullinger
484257925f Replace Free Software Foundation address in license notices
The mailing address for the FSF has changed over the years.  Rather than
updating the address across all files, refer readers to gnu.org, as the
GNU GPL documentation now suggests for license notices.  The mailing
address is retained in the full license files (COPYING and LGPL-2.1).

The old address is still present in t/diff-lib/COPYING.  This is
intentional, as the file is used in tests and the contents are not
expected to change.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-09 13:21:21 +09:00
Derrick Stolee
19716b21a4 cleanup: fix possible overflow errors in binary search
A common mistake when writing binary search is to allow possible
integer overflow by using the simple average:

	mid = (min + max) / 2;

Instead, use the overflow-safe version:

	mid = min + (max - min) / 2;

This translation is safe since the operation occurs inside a loop
conditioned on "min < max". The included changes were found using
the following git grep:

	git grep '/ *2;' '*.c'

Making this cleanup will prevent future review friction when a new
binary search is contructed based on existing code.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-10 08:57:24 +09:00
Johannes Schindelin
044fb190f7 diff: fix a double off-by-one with --ignore-space-at-eol
When comparing two lines, ignoring any whitespace at the end, we first
try to match as many bytes as possible and break out of the loop only
upon mismatch, to let the remainder be handled by the code shared with
the other whitespace-ignoring code paths.

When comparing the bytes, however, we incremented the counters always,
even if the bytes did not match. And because we fall through to  the
space-at-eol handling at that point, it is as if that mismatch never
happened.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-11 11:55:53 -07:00
Junio C Hamano
307ab20b33 xdiff: PATIENCE/HISTOGRAM are not independent option bits
Because the default Myers, patience and histogram algorithms cannot be in
effect at the same time, XDL_PATIENCE_DIFF and XDL_HISTOGRAM_DIFF are not
independent bits.  Instead of wasting one bit per algorithm, define a few
macros to access the few bits they occupy and update the code that access
them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-19 15:36:55 -08:00
Tay Ray Chuan
1d26b252f1 xdiff/xpatience: factor out fall-back-diff function
This is in preparation for the histogram diff algorithm, which will also
re-use much of the code to call the default Meyers diff algorithm.

Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-07 09:41:24 -07:00
Johannes Schindelin
92b7de93fb Implement the patience diff algorithm
The patience diff algorithm produces slightly more intuitive output
than the classic Myers algorithm, as it does not try to minimize the
number of +/- lines first, but tries to preserve the lines that are
unique.

To this end, it first determines lines that are unique in both files,
then the maximal sequence which preserves the order (relative to both
files) is extracted.

Starting from this initial set of common lines, the rest of the lines
is handled recursively, with Myers' algorithm as a fallback when
the patience algorithm fails (due to no common unique lines).

This patch includes memory leak fixes by Pierre Habouzit.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-07 13:35:44 -08:00