From dc1daacdcc2268d9bd93e54672b73e7ea8557ba6 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Fri, 23 Jul 2021 05:29:34 -0400 Subject: [PATCH] pack-bitmap: check pack validity when opening bitmap When pack-objects adds an entry to its list of objects to pack, it may mark the packfile and offset that contains the file, which we can later use to output the object verbatim. If the packfile is deleted while we are running (e.g., by another process running "git repack"), we may die in use_pack() if the pack file cannot be opened. We worked around this in 4c08018204 (pack-objects: protect against disappearing packs, 2011-10-14) by making sure we can open the pack before recording it as a source. This detects a pack which has already disappeared while generating the packing list, and because we keep the pack's file descriptor (or an mmap window) open, it means we can access it later (unless you exceed core.packedgitlimit). The bitmap code that was added later does not do this; it adds entries to the packlist without checking that the packfile is still valid, and is vulnerable to this race. It needs the same treatment as 4c08018204. However, rather than add it in just that one spot, it makes more sense to simply open and check the packfile when we open the bitmap. Technically you can use the .bitmap without even looking in the .pack file (e.g., if you are just printing a list of objects without accessing them), but it's much simpler to do it early. That covers all later direct uses of the pack (due to the cached descriptor) without having to check each one directly. For example, in pack-objects we need to protect the packlist entries, but we also access the pack directly as part of the reuse_partial_pack_from_bitmap() feature. This patch covers both cases. There's no test here, because the problem is inherently racy. I reproduced and verified the fix with this script: rm -rf parent.git push.git fetch.git push() { ( cd push.git && echo content >>file && git add file && git commit -qm "change $1" && git push -q origin HEAD && echo "push $1..." ) && ( cd parent.git && git repack -ad -q && echo "repack $1..." ) } fetch() { rm -rf fetch.git && git clone -q file://$PWD/parent.git fetch.git && echo "fetch $1..." } git init --bare parent.git && git --git-dir=parent.git config transfer.unpacklimit 1 && git clone parent.git push.git && (for i in `seq 1 1000`; do push $i || break; done) & pusher=$! (for i in `seq 1 1000`; do fetch $i || break; done) & fetcher=$! wait $fetcher kill $pusher That simulates a race between a client cloning and a push triggering a repack on the server. Without this patch, it generally fails within a couple hundred iterations with: remote: fatal: packfile ./objects/pack/.tmp-1377349-pack-498afdec371232bdb99d1757872f5569331da61e.pack cannot be accessed error: git upload-pack: git-pack-objects died with error. fatal: git upload-pack: aborting due to possible repository corruption on the remote side. remote: aborting due to possible repository corruption on the remote side. fatal: early EOF fatal: fetch-pack: invalid index-pack output With this patch, it reliably runs through all thousand attempts. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- pack-bitmap.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/pack-bitmap.c b/pack-bitmap.c index d90e1d9d8c..f9848e1b2a 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -298,6 +298,11 @@ static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git return -1; } + if (!is_pack_valid(packfile)) { + close(fd); + return -1; + } + bitmap_git->pack = packfile; bitmap_git->map_size = xsize_t(st.st_size); bitmap_git->map = xmmap(NULL, bitmap_git->map_size, PROT_READ, MAP_PRIVATE, fd, 0);