Many test scripts use hash-object to create malformed objects to see how
we handle the results in various commands. In some cases we already have
to use "hash-object --literally", because it does some rudimentary
quality checks. But let's use "--literally" more consistently to
future-proof these tests against hash-object learning to be more
careful.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 25d5ea410f ("[PATCH] Redo rename/copy detection logic.",
2005-05-24) added a duplicate entry check on rename_src in order to
avoid segfaults; the code at the time was prone to double free()s and an
easy way to avoid it was just to turn off rename detection for any
duplicate entries. Note that the form of the check was modified two
commits ago in this series.
Similarly, commit 4d6be03b95 ("diffcore-rename: avoid processing
duplicate destinations", 2015-02-26) added a duplicate entry check
on rename_dst for the exact same reason -- the code was prone to double
free()s, and an easy way to avoid it was just to turn off rename
detection entirely. Note that the form of the check was modified in the
commit just before this one.
In the original code in both places, the code was dealing with
individual diff_filespecs and trying to match things up, instead of just
keeping the original diff_filepairs around as we do now. The
intervening change in structure has fixed the accounting problems and
the associated double free()s that used to occur, and thus we already
have a better fix. As such, we can remove the band-aid checks for
duplicate entries.
Due to the last two patches, the diffcore_rename() setup is no longer a
sizeable chunk of overall runtime. Thus, in a large rebase of many
commits with lots of renames and several optimizations to inexact rename
detection, this patch only speeds up the overall code by about half a
percent or so and is pretty close to the run-to-run variability making
it hard to get an exact measurement. However, with some trace2 regions
around the setup code in diffcore_rename() so that I can focus on just
it, I measure that this patch consistently saves almost a third of the
remaining time spent in diffcore_rename() setup.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While creating the last commit, I found a number of other cases where
git would segfault when faced with trees that have duplicate entries.
None of these segfaults are in the diffcore-rename code (they all occur
in cache-tree and unpack-trees). Further, to my knowledge, no one has
ever been adversely affected by these bugs, and given that it has been
15 years and folks have fixed a few other issues with historical
duplicate entries (as noted in the last commit), I am not sure we will
ever run into anyone having problems with these. So I am not sure these
are worth fixing, but it doesn't hurt to at least document these
failures in the same test file that is concerned with duplicate tree
entries.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 4d6be03b95 ("diffcore-rename: avoid processing duplicate
destinations", 2015-02-26) added t4058 to demonstrate that a workaround
it added to avoid double frees (namely to just turn off rename detection
when trees had duplicate entries) would indeed avoid segfaults. The
tests, though, give the impression that the expected diffs are "correct"
when in reality they are just "don't segfault, and do something
semi-reasonable under the circumstances". Add some notes to make this
clearer.
Also, commit 25d5ea410f ("[PATCH] Redo rename/copy detection logic.",
2005-05-24) added a similar workaround to avoid segfaults, but for
rename_src rather than rename_dst. I do not see any tests in the
testsuite to cover the collision detection of entries limited to the
source side, so add a couple.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch all uses of $_z40 to $ZERO_OID so that they work correctly with
larger hashes. This commit was created by using the following sed
command to modify all files in the t directory except t/test-lib.sh:
sed -i 's/\$_z40/$ZERO_OID/g'
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The rename code cannot handle an input where we have
duplicate destinations (i.e., more than one diff_filepair in
the queue with the same string in its pair->two->path). We
end up allocating only one slot in the rename_dst mapping.
If we fill in the diff_filepair for that slot, when we
re-queue the results, we may queue that filepair multiple
times. When the diff is finally flushed, the filepair is
processed and free()d multiple times, leading to heap
corruption.
This situation should only happen when a tree diff sees
duplicates in one of the trees (see the added test for a
detailed example). Rather than handle it, the sanest thing
is just to turn off rename detection altogether for the
diff.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>