rm: only refresh entries that we may touch

This gets rid of the whole-tree cache refresh. Instead, only the paths
that we touch get refreshed. We may still lstat() more than needed,
but it is better to play safe.

This potentially saves a large number of lstat() calls on big trees.
Take the gentoo-x86 tree, for example, which has roughly 80k files:

Unmodified Git:

$ time git rm --cached skel.ebuild
rm 'skel.ebuild'

real    0m1.441s
user    0m0.821s
sys     0m0.531s

Modified Git:

$ time ~/w/git/git rm --cached skel.ebuild
rm 'skel.ebuild'

real    0m0.941s
user    0m0.828s
sys     0m0.091s

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Nguyễn Thái Ngọc Duy 2010-01-17 15:43:13 +07:00 committed by Junio C Hamano
parent 688cd6d2b9
commit 4e1a7baa2e


@@ -169,9 +169,10 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 	if (read_cache() < 0)
 		die("index file corrupt");
-	refresh_cache(REFRESH_QUIET);
 	pathspec = get_pathspec(prefix, argv);
+	refresh_index(&the_index, REFRESH_QUIET, pathspec, NULL, NULL);
 	seen = NULL;
 	for (i = 0; pathspec[i] ; i++)
 		/* nothing */;
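
To illustrate why limiting the refresh to the pathspec saves work, here is a minimal toy sketch. It is not Git's implementation: `toy_entry`, `toy_match` and `toy_refresh` are hypothetical names invented for this example, the matching is a plain path-prefix test rather than Git's real pathspec logic, and setting a flag stands in for the lstat() call that a real refresh would make.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an index entry; not Git's struct cache_entry. */
struct toy_entry {
	const char *path;
	int refreshed;
};

/*
 * Toy matcher (assumption): a spec matches a path when the path equals
 * the spec or lies under the spec taken as a directory.
 */
static int toy_match(const char *path, const char *spec)
{
	size_t n = strlen(spec);

	if (strncmp(path, spec, n))
		return 0;
	return path[n] == '\0' || path[n] == '/' || (n && spec[n - 1] == '/');
}

/*
 * Refresh only entries matching one of the NULL-terminated pathspec
 * items. A NULL pathspec means "refresh everything", which models the
 * old whole-tree behaviour. Returns the number of entries visited,
 * i.e. how many lstat() calls a real refresh would have issued.
 */
static size_t toy_refresh(struct toy_entry *entries, size_t nr,
			  const char **pathspec)
{
	size_t i, j, stats = 0;

	for (i = 0; i < nr; i++) {
		int match = !pathspec;

		for (j = 0; !match && pathspec[j]; j++)
			match = toy_match(entries[i].path, pathspec[j]);
		if (!match)
			continue;
		entries[i].refreshed = 1; /* real code would lstat() here */
		stats++;
	}
	return stats;
}
```

Under these assumptions, calling `toy_refresh(entries, nr, NULL)` visits every entry, while passing a pathspec such as `{ "skel.ebuild", NULL }` visits only the matching one. That is the same shape as the saving in the timings above, where most of the difference is in sys time.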