Fix diff -B/--dirstat miscounting of newly added contents

What used to happen is that diffcore_count_changes() simply ignored any
hashes in the destination that didn't match hashes in the source. EXCEPT
if the source hash didn't exist at all, in which case it would count _one_
destination hash that happened to have the "next" hash value.  As a
consequence, newly added material was often undercounted, making output
from --dirstat and the "complete rewrite" detection used by -B unreliable.

This changes it so that:

 - whenever it bypasses a destination hash (because it doesn't match a
   source), it counts the bytes associated with that as "literal added"

 - at the end (once we have used up all the source hashes), we do the same
   thing with the remaining destination hashes.

 - when hashes do match, and we use the difference in counts as a value,
   we also use up that destination hash entry (the 'd++').
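
The three rules above can be sketched as a small standalone C fragment.
This is an illustration under simplifying assumptions, not the code from
the tree: struct span, count_old(), count_new() and the example arrays are
made-up names, the outer loop over source entries is assumed, and each
array is assumed to be sorted by hash value and terminated by an entry
with cnt == 0.

```c
#include <assert.h>

/* Hypothetical stand-in for one (hash, byte count) entry.
 * An entry with cnt == 0 terminates the array. */
struct span {
	unsigned hashval;
	unsigned cnt;
};

/* Counting as it worked before this change (outer loop is assumed). */
static void count_old(const struct span *s, const struct span *d,
		      unsigned long *copied, unsigned long *added)
{
	unsigned long sc = 0, la = 0;

	assert(copied && added);
	while (s->cnt) {
		unsigned src_cnt, dst_cnt;

		while (d->cnt) {
			if (d->hashval >= s->hashval)
				break;
			d++;		/* bypassed bytes are never counted */
		}
		src_cnt = s->cnt;
		dst_cnt = d->hashval == s->hashval ? d->cnt : 0;
		if (src_cnt < dst_cnt) {
			la += dst_cnt - src_cnt;
			sc += src_cnt;
		} else
			sc += dst_cnt;
		s++;
	}
	*copied = sc;
	*added = la;
}

/* Counting with the three rules from the list above applied. */
static void count_new(const struct span *s, const struct span *d,
		      unsigned long *copied, unsigned long *added)
{
	unsigned long sc = 0, la = 0;

	assert(copied && added);
	while (s->cnt) {
		unsigned src_cnt, dst_cnt;

		while (d->cnt) {
			if (d->hashval >= s->hashval)
				break;
			la += d->cnt;	/* rule 1: bypassed entry is "literal added" */
			d++;
		}
		src_cnt = s->cnt;
		dst_cnt = 0;
		if (d->cnt && d->hashval == s->hashval) {
			dst_cnt = d->cnt;
			d++;		/* rule 3: use up the matched entry */
		}
		if (src_cnt < dst_cnt) {
			la += dst_cnt - src_cnt;
			sc += src_cnt;
		} else
			sc += dst_cnt;
		s++;
	}
	while (d->cnt) {		/* rule 2: leftovers are all added */
		la += d->cnt;
		d++;
	}
	*copied = sc;
	*added = la;
}

/* Example input: only hash 10 is shared between source and destination. */
static const struct span example_src[] = { {10, 5}, {30, 4}, {0, 0} };
static const struct span example_dst[] = {
	{5, 7}, {10, 8}, {20, 3}, {40, 6}, {0, 0}
};
```

With the example arrays, count_old() reports 5 copied and 3 added bytes,
while count_new() reports 5 copied and 19 added, which accounts for all
24 bytes in the destination.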

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Linus Torvalds 2009-12-04 12:07:47 -08:00 committed by Junio C Hamano
parent 952dfc6944
commit 77cd6ab621

@@ -201,10 +201,15 @@ int diffcore_count_changes(struct diff_filespec *src,
 		while (d->cnt) {
 			if (d->hashval >= s->hashval)
 				break;
+			la += d->cnt;
 			d++;
 		}
 		src_cnt = s->cnt;
-		dst_cnt = d->hashval == s->hashval ? d->cnt : 0;
+		dst_cnt = 0;
+		if (d->cnt && d->hashval == s->hashval) {
+			dst_cnt = d->cnt;
+			d++;
+		}
 		if (src_cnt < dst_cnt) {
 			la += dst_cnt - src_cnt;
 			sc += src_cnt;
@@ -213,6 +218,10 @@ int diffcore_count_changes(struct diff_filespec *src,
 			sc += dst_cnt;
 		s++;
 	}
+	while (d->cnt) {
+		la += d->cnt;
+		d++;
+	}
 
 	if (!src_count_p)
 		free(src_count);