I ran git-prune on a repository and got this:
$ git-prune
error: Object 228f8065b930120e35fc0c154c237487ab02d64a is a blob, not a commit
Segmentation fault (core dumped)
This repository was a strange one in that it was being used to provide
its own submodule. That is, the repository was cloned into a
subdirectory, an independent branch checked out in that subdirectory,
and then it was marked as a submodule. git-prune then failed in the
above manner.
The problem was that git-prune was not submodule aware in two areas.
Linus said:
> So what happens is that something traverses a tree object, looks at each
> entry, sees that it's not a tree, and tries to look it up as a blob. But
> subprojects are commits, not blobs, and then when you look at the object
> more closely, you get the above kind of object type confusion.
and included a patch to add an S_ISGITLINK() test to reachable.c's
process_tree() function. That fixed the first git-prune error, and
stopped it from trying to process the gitlink entries in trees as if
they were pointers to other trees (and of course failing, because
gitlinks _aren't_ trees). That part of this patch is his.
The second area is add_cache_refs(). This is called before starting the
reachability analysis, and was calling lookup_blob() on every object
hash found in the index. However, it is no longer true that every hash
in the index is a pointer to a blob, some of them are gitlinks, and are
not backed by any object at all, they are commits in another repository.
Normally this bug was not causing any problems, but in the case of the
self-referencing repository described above, it meant that the gitlink
hash was being marked as being of type OBJ_BLOB by add_cache_refs() call
to lookup_blob(). Then later, because that hash was also pointed to by
a ref, add_one_ref() would treat it as a commit; lookup_commit() would
return a NULL because that object was already noted as being an
OBJ_BLOB, not an OBJ_COMMIT; and parse_commit_buffer() would SEGFAULT on
that NULL pointer.
The fix made by this patch is to not blindly call lookup_blob() in
reachable.c's add_cache_refs(), and instead skip any index entries that
are S_ISGITLINK().
Signed-off-by: Andy Parkins <andyparkins@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This removes slightly more lines than it adds, but the real reason for
doing this is that future optimizations will require more setup of the
tree descriptor, and so we want to do it in one place.
Also renamed the "desc.buf" field to "desc.buffer" just to trigger
compiler errors for old-style manual initializations, making sure I
didn't miss anything.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Currently, the search for all reflogs depends on the existence of
corresponding refs under the .git/refs/ directory. Let's scan the
.git/logs/ directory directly instead.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
It used to ignore the return value of the helper function; now, it
expects it to return 0, and stops iteration upon non-zero return
values; this value is then passed on as the return value of
for_each_reflog_ent().
Further, it makes no sense to force the parsing upon the helper
functions; for_each_reflog_ent() now calls the helper function with
old and new sha1, the email, the timestamp & timezone, and the message.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This moves major part of builtin-prune into a separate file,
reachable.c. It is used to mark the objects that are reachable
from refs, and optionally from reflogs.
The patch looks very large, but if you look at it with diff -C,
which this message is formatted in, most of them are copied
lines and there are very little additions.
Signed-off-by: Junio C Hamano <junkio@cox.net>