git-commit-vandalism/t/t6500-gc.sh

156 lines
4.4 KiB
Bash
Raw Normal View History

#!/bin/sh
test_description='basic git gc tests
'
. ./test-lib.sh
test_expect_success 'setup' '
# do not let the amount of physical memory affects gc
# behavior, make sure we always pack everything to one pack by
# default
git config gc.bigPackThreshold 2g
'
test_expect_success 'gc empty repository' '
git gc
'
test_expect_success 'gc does not leave behind pid file' '
git gc &&
test_path_is_missing .git/gc.pid
'
test_expect_success 'gc --gobbledegook' '
test_expect_code 129 git gc --nonsense 2>err &&
test_i18ngrep "[Uu]sage: git gc" err
'
test_expect_success 'gc -h with invalid configuration' '
mkdir broken &&
(
cd broken &&
git init &&
echo "[gc] pruneexpire = CORRUPT" >>.git/config &&
test_expect_code 129 git gc -h >usage 2>&1
) &&
test_i18ngrep "[Uu]sage" broken/usage
'
test_expect_success 'gc is not aborted due to a stale symref' '
git init remote &&
(
cd remote &&
test_commit initial &&
git clone . ../client &&
git branch -m develop &&
cd ../client &&
git fetch --prune &&
git gc
)
'
test_expect_success 'gc --keep-largest-pack' '
test_create_repo keep-pack &&
(
cd keep-pack &&
test_commit one &&
test_commit two &&
test_commit three &&
git gc &&
( cd .git/objects/pack && ls *.pack ) >pack-list &&
test_line_count = 1 pack-list &&
BASE_PACK=.git/objects/pack/pack-*.pack &&
test_commit four &&
git repack -d &&
test_commit five &&
git repack -d &&
( cd .git/objects/pack && ls *.pack ) >pack-list &&
test_line_count = 3 pack-list &&
git gc --keep-largest-pack &&
( cd .git/objects/pack && ls *.pack ) >pack-list &&
test_line_count = 2 pack-list &&
test_path_is_file $BASE_PACK &&
git fsck
)
'
test_expect_success 'auto gc with too many loose objects does not attempt to create bitmaps' '
test_config gc.auto 3 &&
test_config gc.autodetach false &&
test_config pack.writebitmaps true &&
# We need to create two object whose sha1s start with 17
# since this is what git gc counts. As it happens, these
# two blobs will do so.
test_commit 263 &&
test_commit 410 &&
# Our first gc will create a pack; our second will create a second pack
git gc --auto &&
ls .git/objects/pack | sort >existing_packs &&
test_commit 523 &&
test_commit 790 &&
git gc --auto 2>err &&
test_i18ngrep ! "^warning:" err &&
ls .git/objects/pack/ | sort >post_packs &&
comm -1 -3 existing_packs post_packs >new &&
comm -2 -3 existing_packs post_packs >del &&
test_line_count = 0 del && # No packs are deleted
test_line_count = 2 new # There is one new pack and its .idx
'
t6500: wait for detached auto gc at the end of the test script The last test in 't6500-gc', 'background auto gc does not run if gc.log is present and recent but does if it is old', added in a831c06a2 (gc: ignore old gc.log files, 2017-02-10), may sporadically trigger an error message from the test harness: rm: cannot remove 'trash directory.t6500-gc/.git/objects': Directory not empty The test in question ends with executing an auto gc in the backround, which occasionally takes so long that it's still running when 'test_done' is about to remove the trash directory. This 'rm -rf $trash' in the foreground might race with the detached auto gc to create and delete files and directories, and gc might (re-)create a path that 'rm' already visited and removed, triggering the above error message when 'rm' attempts to remove its parent directory. Commit bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01) fixed the same problem in a different test script by simply disallowing background gc. Unfortunately, what worked there is not applicable here, because the purpose of this test is to check the behavior of a detached auto gc. Make sure that the test doesn't continue before the gc is finished in the background with a clever bit of shell trickery: - Open fd 9 in the shell, to be inherited by the background gc process, because our daemonize() only closes the standard fds 0, 1 and 2. - Duplicate this fd 9 to stdout. - Read 'git gc's stdout, and thus fd 9, through a command substitution. We don't actually care about gc's output, but this construct has two useful properties: - This read blocks until stdout or fd 9 are open. While stdout is closed after the main gc process creates the background process and exits, fd 9 remains open until the backround process exits. - The variable assignment from the command substitution gets its exit status from the command executed within the command substitution, i.e. a failing main gc process will cause the test to fail. Note, that this fd trickery doesn't work on Windows, because due to MSYS limitations the git process only inherits the standard fds 0, 1 and 2 from the shell. Luckily, it doesn't matter in this case, because on Windows daemonize() is basically a noop, thus 'git gc --auto' always runs in the foreground. And since we can now continue the test reliably after the detached gc finished, check that there is only a single packfile left at the end, i.e. that the detached gc actually did what it was supposed to do. Also add a comment at the end of the test script to warn developers of future tests about this issue of long running detached gc processes. Helped-by: Jeff King <peff@peff.net> Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-13 12:31:38 +02:00
run_and_wait_for_auto_gc () {
# We read stdout from gc for the side effect of waiting until the
# background gc process exits, closing its fd 9. Furthermore, the
# variable assignment from a command substitution preserves the
# exit status of the main gc process.
# Note: this fd trickery doesn't work on Windows, but there is no
# need to, because on Win the auto gc always runs in the foreground.
doesnt_matter=$(git gc --auto 9>&1)
}
test_expect_success 'background auto gc does not run if gc.log is present and recent but does if it is old' '
test_commit foo &&
test_commit bar &&
git repack &&
test_config gc.autopacklimit 1 &&
test_config gc.autodetach true &&
echo fleem >.git/gc.log &&
test_must_fail git gc --auto 2>err &&
test_i18ngrep "^error:" err &&
test_config gc.logexpiry 5.days &&
test-tool chmtime =-345600 .git/gc.log &&
test_must_fail git gc --auto &&
test_config gc.logexpiry 2.days &&
t6500: wait for detached auto gc at the end of the test script The last test in 't6500-gc', 'background auto gc does not run if gc.log is present and recent but does if it is old', added in a831c06a2 (gc: ignore old gc.log files, 2017-02-10), may sporadically trigger an error message from the test harness: rm: cannot remove 'trash directory.t6500-gc/.git/objects': Directory not empty The test in question ends with executing an auto gc in the backround, which occasionally takes so long that it's still running when 'test_done' is about to remove the trash directory. This 'rm -rf $trash' in the foreground might race with the detached auto gc to create and delete files and directories, and gc might (re-)create a path that 'rm' already visited and removed, triggering the above error message when 'rm' attempts to remove its parent directory. Commit bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01) fixed the same problem in a different test script by simply disallowing background gc. Unfortunately, what worked there is not applicable here, because the purpose of this test is to check the behavior of a detached auto gc. Make sure that the test doesn't continue before the gc is finished in the background with a clever bit of shell trickery: - Open fd 9 in the shell, to be inherited by the background gc process, because our daemonize() only closes the standard fds 0, 1 and 2. - Duplicate this fd 9 to stdout. - Read 'git gc's stdout, and thus fd 9, through a command substitution. We don't actually care about gc's output, but this construct has two useful properties: - This read blocks until stdout or fd 9 are open. While stdout is closed after the main gc process creates the background process and exits, fd 9 remains open until the backround process exits. - The variable assignment from the command substitution gets its exit status from the command executed within the command substitution, i.e. a failing main gc process will cause the test to fail. Note, that this fd trickery doesn't work on Windows, because due to MSYS limitations the git process only inherits the standard fds 0, 1 and 2 from the shell. Luckily, it doesn't matter in this case, because on Windows daemonize() is basically a noop, thus 'git gc --auto' always runs in the foreground. And since we can now continue the test reliably after the detached gc finished, check that there is only a single packfile left at the end, i.e. that the detached gc actually did what it was supposed to do. Also add a comment at the end of the test script to warn developers of future tests about this issue of long running detached gc processes. Helped-by: Jeff King <peff@peff.net> Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-13 12:31:38 +02:00
run_and_wait_for_auto_gc &&
ls .git/objects/pack/pack-*.pack >packs &&
test_line_count = 1 packs
'
gc: run pre-detach operations under lock We normally try to avoid having two auto-gc operations run at the same time, because it wastes resources. This was done long ago in 64a99eb47 (gc: reject if another gc is running, unless --force is given, 2013-08-08). When we do a detached auto-gc, we run the ref-related commands _before_ detaching, to avoid confusing lock contention. This was done by 62aad1849 (gc --auto: do not lock refs in the background, 2014-05-25). These two features do not interact well. The pre-detach operations are run before we check the gc.pid lock, meaning that on a busy repository we may run many of them concurrently. Ideally we'd take the lock before spawning any operations, and hold it for the duration of the program. This is tricky, though, with the way the pid-file interacts with the daemonize() process. Other processes will check that the pid recorded in the pid-file still exists. But detaching causes us to fork and continue running under a new pid. So if we take the lock before detaching, the pid-file will have a bogus pid in it. We'd have to go back and update it with the new pid after detaching. We'd also have to play some tricks with the tempfile subsystem to tweak the "owner" field, so that the parent process does not clean it up on exit, but the child process does. Instead, we can do something a bit simpler: take the lock only for the duration of the pre-detach work, then detach, then take it again for the post-detach work. Technically, this means that the post-detach lock could lose to another process doing pre-detach work. But in the long run this works out. That second process would then follow-up by doing post-detach work. Unless it was in turn blocked by a third process doing pre-detach work, and so on. This could in theory go on indefinitely, as the pre-detach work does not repack, and so need_to_gc() will continue to trigger. But in each round we are racing between the pre- and post-detach locks. Eventually, one of the post-detach locks will win the race and complete the full gc. So in the worst case, we may racily repeat the pre-detach work, but we would never do so simultaneously (it would happen via a sequence of serialized race-wins). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-11 11:06:35 +02:00
test_expect_success 'background auto gc respects lock for all operations' '
# make sure we run a background auto-gc
test_commit make-pack &&
git repack &&
test_config gc.autopacklimit 1 &&
test_config gc.autodetach true &&
# create a ref whose loose presence we can use to detect a pack-refs run
git update-ref refs/heads/should-be-loose HEAD &&
test_path_is_file .git/refs/heads/should-be-loose &&
# now fake a concurrent gc that holds the lock; we can use our
# shell pid so that it looks valid.
hostname=$(hostname || echo unknown) &&
printf "$$ %s" "$hostname" >.git/gc.pid &&
# our gc should exit zero without doing anything
run_and_wait_for_auto_gc &&
test_path_is_file .git/refs/heads/should-be-loose
'
t6500: wait for detached auto gc at the end of the test script The last test in 't6500-gc', 'background auto gc does not run if gc.log is present and recent but does if it is old', added in a831c06a2 (gc: ignore old gc.log files, 2017-02-10), may sporadically trigger an error message from the test harness: rm: cannot remove 'trash directory.t6500-gc/.git/objects': Directory not empty The test in question ends with executing an auto gc in the backround, which occasionally takes so long that it's still running when 'test_done' is about to remove the trash directory. This 'rm -rf $trash' in the foreground might race with the detached auto gc to create and delete files and directories, and gc might (re-)create a path that 'rm' already visited and removed, triggering the above error message when 'rm' attempts to remove its parent directory. Commit bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01) fixed the same problem in a different test script by simply disallowing background gc. Unfortunately, what worked there is not applicable here, because the purpose of this test is to check the behavior of a detached auto gc. Make sure that the test doesn't continue before the gc is finished in the background with a clever bit of shell trickery: - Open fd 9 in the shell, to be inherited by the background gc process, because our daemonize() only closes the standard fds 0, 1 and 2. - Duplicate this fd 9 to stdout. - Read 'git gc's stdout, and thus fd 9, through a command substitution. We don't actually care about gc's output, but this construct has two useful properties: - This read blocks until stdout or fd 9 are open. While stdout is closed after the main gc process creates the background process and exits, fd 9 remains open until the backround process exits. - The variable assignment from the command substitution gets its exit status from the command executed within the command substitution, i.e. a failing main gc process will cause the test to fail. Note, that this fd trickery doesn't work on Windows, because due to MSYS limitations the git process only inherits the standard fds 0, 1 and 2 from the shell. Luckily, it doesn't matter in this case, because on Windows daemonize() is basically a noop, thus 'git gc --auto' always runs in the foreground. And since we can now continue the test reliably after the detached gc finished, check that there is only a single packfile left at the end, i.e. that the detached gc actually did what it was supposed to do. Also add a comment at the end of the test script to warn developers of future tests about this issue of long running detached gc processes. Helped-by: Jeff King <peff@peff.net> Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-13 12:31:38 +02:00
# DO NOT leave a detached auto gc process running near the end of the
# test script: it can run long enough in the background to racily
# interfere with the cleanup in 'test_done'.
test_done