6f054f9fb3

When cloning a repository with `--local`, Git relies on either making a
hardlink to, or a copy of, every file in the "objects" directory of the source
repository. This is done through the callpath `cmd_clone()` ->
`clone_local()` -> `copy_or_link_directory()`.

The way this optimization works is by enumerating every file and
directory recursively in the source repository's `$GIT_DIR/objects`
directory, and then either making a copy or hardlink of each file. The
only exception to this rule is when copying the "alternates" file, in
which case paths are rewritten to be absolute before writing a new
"alternates" file in the destination repo.

One quirk of this implementation is that it dereferences symlinks when
cloning. This behavior was most recently modified in 36596fd2df (clone:
better handle symlinked files at .git/objects/, 2019-07-10), which
attempted to support `--local` clones of repositories with symlinks in
their objects directory in a platform-independent way.

Unfortunately, this behavior of dereferencing symlinks (that is,
creating a hardlink or copy of the source's link target in the
destination repository) can be used as a component in attacking a
victim by inadvertently exposing the contents of files stored outside of
the repository.
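
The exposure can be reproduced outside of Git with ordinary tools. The
following is a minimal sketch, assuming a POSIX shell with `find`, `ln`, and
`cp`. It imitates "hardlink or copy every file while following symlinks"; it
is not Git's implementation, and `copy_or_link_tree`, `leak`, and the
directory layout are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch only: imitates "hardlink or copy every file, following symlinks".
# Not Git's implementation; every name below is hypothetical.
set -e

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT

mkdir -p "$demo/outside" "$demo/src/objects/12"
echo "private key material" >"$demo/outside/secret"
echo "loose object" >"$demo/src/objects/12/abcdef"
# A symlink planted in the objects directory that points outside of it.
ln -s "$demo/outside/secret" "$demo/src/objects/12/leak"

copy_or_link_tree () {
	src=$1 dst=$2
	# `find -L` follows symlinks, so "leak" is reported as a regular file.
	(cd "$src" && find -L . -type d) |
	while IFS= read -r dir
	do
		mkdir -p "$dst/$dir"
	done
	(cd "$src" && find -L . -type f) |
	while IFS= read -r file
	do
		# Hardlink the link target if possible, else copy it.
		# -L makes both tools dereference the symlink.
		ln -L "$src/$file" "$dst/$file" 2>/dev/null ||
		cp -L "$src/$file" "$dst/$file"
	done
}

copy_or_link_tree "$demo/src/objects" "$demo/dst/objects"

# The destination now holds a regular file with the outside contents.
leaked=$(cat "$demo/dst/objects/12/leak")
echo "destination copy contains: $leaked"
```

After the copy, `leak` in the destination is no longer a symlink, so anything
that later reads the destination tree sees the outside file's contents.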

Take, for example, a repository that stores a Dockerfile and is used to
build Docker images. When building an image, Docker copies the directory
contents into the VM, and then instructs the VM to execute the
Dockerfile at the root of the copied directory. This protects against
directory traversal attacks by copying symbolic links as-is without
dereferencing them.

That is, if a user has a symlink pointing at their private key material
(where the symlink is present in the same directory as the Dockerfile,
but the key itself is present outside of that directory), the key is
unreadable to a Docker image, since the link will appear broken from the
container's point of view.
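
The "broken link" effect described above can be sketched with `cp -RP`,
which copies symlinks as symlinks. This is only an analogy under stated
assumptions: `cp` stands in for however the build context is transferred,
and the `home/`, `vm/`, and `key` names are hypothetical.

```shell
#!/bin/sh
# Sketch only: `cp -RP` stands in for copying a build context while keeping
# symlinks as-is. All paths and names are hypothetical.
set -e

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT

mkdir -p "$demo/home/project" "$demo/home/keys" "$demo/vm/build"
echo "private key material" >"$demo/home/keys/id_key"
echo "FROM scratch" >"$demo/home/project/Dockerfile"
# The link lives next to the Dockerfile; its target lives outside the project.
ln -s ../keys/id_key "$demo/home/project/key"

# Copy the directory contents without dereferencing symlinks.
cp -RP "$demo/home/project/." "$demo/vm/build/"

# The link itself survives the copy...
if test -L "$demo/vm/build/key"
then
	link_kept=yes
else
	link_kept=no
fi

# ...but its target does not exist on the other side, so it cannot be read.
if cat "$demo/vm/build/key" >/dev/null 2>&1
then
	readable=yes
else
	readable=no
fi

echo "link kept: $link_kept, readable: $readable"
```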

Git's dereferencing of symlinks, however, enables an attack whereby a
victim is convinced to clone a
repository containing an embedded submodule (with a URL like
"file:///proc/self/cwd/path/to/submodule") which has a symlink pointing
at a path containing sensitive information on the victim's machine. If a
user is tricked into doing this, the contents at the destination of
those symbolic links are exposed to the Docker image at runtime.

One approach to preventing this behavior is to recreate symlinks in the
destination repository. But this is problematic, since symlinking the
objects directory is not well-supported. (One potential problem is that
when sharing, e.g. a "pack" directory via symlinks, different writers
performing garbage collection may consider different sets of objects to
be reachable, enabling a situation whereby garbage collecting one
repository may remove reachable objects in another repository.)

Instead, prohibit the local clone optimization when any symlinks are
present in the `$GIT_DIR/objects` directory of the source repository.
Users may clone the repository again by prepending the "file://" scheme
to their clone URL, or by adding the `--no-local` option to their `git
clone` invocation.

The directory iterator used by `copy_or_link_directory()` must no longer
dereference symlinks (i.e., it *must* call `lstat()` instead of `stat()`
in order to discover whether or not there are symlinks present). This has
no bearing on the overall behavior, since we will immediately `die()` on
encountering a symlink.
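
The difference between the two calls can be illustrated from the shell,
where `test -e` follows a link (like `stat()`) and `test -L` or
`find -type l` inspect the link itself (like `lstat()`). The sketch below is
not the C code this commit touches; `has_symlinks`,
`check_local_clone_source`, and the error text are hypothetical stand-ins.

```shell
#!/bin/sh
# Sketch only: refuse a directory tree as soon as it contains a symlink.
# Helper names and the error message are hypothetical.
set -e

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT

mkdir -p "$demo/objects/pack" "$demo/clean/pack"
echo data >"$demo/objects/an-object"
echo data >"$demo/clean/an-object"
ln -s an-object "$demo/objects/alias"
ln -s /nonexistent/target "$demo/objects/dangling"

# Succeeds if any entry under $1 is itself a symlink. Without -L, find does
# not follow links, so `-type l` sees them the way lstat() would.
has_symlinks () {
	test -n "$(find "$1" -type l 2>/dev/null | head -n 1)"
}

check_local_clone_source () {
	if has_symlinks "$1"
	then
		echo "fatal: symlink exists under $1, refusing local clone" >&2
		return 1
	fi
}

if check_local_clone_source "$demo/objects" 2>/dev/null
then
	with_links=allowed
else
	with_links=refused
fi

if check_local_clone_source "$demo/clean"
then
	clean=allowed
else
	clean=refused
fi

# A dangling link is invisible to a stat()-style check, but not to lstat().
if test -e "$demo/objects/dangling"; then dangling_stat=present; else dangling_stat=missing; fi
if test -L "$demo/objects/dangling"; then dangling_lstat=link; else dangling_lstat=none; fi

echo "with symlinks: $with_links, without: $clean"
echo "stat-style: $dangling_stat, lstat-style: $dangling_lstat"
```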

Note that t5604.33 suggests that we do support local clones with
symbolic links in the source repository's objects directory, but this
was likely unintentional, or at least did not take into consideration
the problem with sharing parts of the objects directory with symbolic
links at the time. Update this test to reflect which options are and
aren't supported.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
345 lines
8.8 KiB
Bash
Executable File
#!/bin/sh
#
# Copyright (C) 2006 Martin Waitz <tali@admingilde.org>
#

test_description='test clone --reference'
. ./test-lib.sh

base_dir=$(pwd)

U=$base_dir/UPLOAD_LOG

# create a commit in repo $1 with name $2
commit_in () {
	(
		cd "$1" &&
		echo "$2" >"$2" &&
		git add "$2" &&
		git commit -m "$2"
	)
}

# check that there are $2 loose objects in repo $1
test_objcount () {
	echo "$2" >expect &&
	git -C "$1" count-objects >actual.raw &&
	cut -d' ' -f1 <actual.raw >actual &&
	test_cmp expect actual
}

test_expect_success 'preparing first repository' '
	test_create_repo A &&
	commit_in A file1
'

test_expect_success 'preparing second repository' '
	git clone A B &&
	commit_in B file2 &&
	git -C B repack -ad &&
	git -C B prune
'

test_expect_success 'cloning with reference (-l -s)' '
	git clone -l -s --reference B A C
'

test_expect_success 'existence of info/alternates' '
	test_line_count = 2 C/.git/objects/info/alternates
'

test_expect_success 'pulling from reference' '
	git -C C pull ../B master
'

test_expect_success 'that reference gets used' '
	test_objcount C 0
'

test_expect_success 'cloning with reference (no -l -s)' '
	GIT_TRACE_PACKET=$U.D git clone --reference B "file://$(pwd)/A" D
'

test_expect_success 'fetched no objects' '
	test -s "$U.D" &&
	! grep " want" "$U.D"
'

test_expect_success 'existence of info/alternates' '
	test_line_count = 1 D/.git/objects/info/alternates
'

test_expect_success 'pulling from reference' '
	git -C D pull ../B master
'

test_expect_success 'that reference gets used' '
	test_objcount D 0
'

test_expect_success 'updating origin' '
	commit_in A file3 &&
	git -C A repack -ad &&
	git -C A prune
'

test_expect_success 'pulling changes from origin' '
	git -C C pull origin
'

# the 2 local objects are commit and tree from the merge
test_expect_success 'that alternate to origin gets used' '
	test_objcount C 2
'

test_expect_success 'pulling changes from origin' '
	git -C D pull origin
'

# the 5 local objects are expected; file3 blob, commit in A to add it
# and its tree, and 2 are our tree and the merge commit.
test_expect_success 'check objects expected to exist locally' '
	test_objcount D 5
'

test_expect_success 'preparing alternate repository #1' '
	test_create_repo F &&
	commit_in F file1
'

test_expect_success 'cloning alternate repo #2 and adding changes to repo #1' '
	git clone F G &&
	commit_in F file2
'

test_expect_success 'cloning alternate repo #1, using #2 as reference' '
	git clone --reference G F H
'

test_expect_success 'cloning with reference being subset of source (-l -s)' '
	git clone -l -s --reference A B E
'

test_expect_success 'cloning with multiple references drops duplicates' '
	git clone -s --reference B --reference A --reference B A dups &&
	test_line_count = 2 dups/.git/objects/info/alternates
'

test_expect_success 'clone with reference from a tagged repository' '
	(
		cd A && git tag -a -m tagged HEAD
	) &&
	git clone --reference=A A I
'

test_expect_success 'prepare branched repository' '
	git clone A J &&
	(
		cd J &&
		git checkout -b other master^ &&
		echo other >otherfile &&
		git add otherfile &&
		git commit -m other &&
		git checkout master
	)
'

test_expect_success 'fetch with incomplete alternates' '
	git init K &&
	echo "$base_dir/A/.git/objects" >K/.git/objects/info/alternates &&
	(
		cd K &&
		git remote add J "file://$base_dir/J" &&
		GIT_TRACE_PACKET=$U.K git fetch J
	) &&
	master_object=$(cd A && git for-each-ref --format="%(objectname)" refs/heads/master) &&
	test -s "$U.K" &&
	! grep " want $master_object" "$U.K" &&
	tag_object=$(cd A && git for-each-ref --format="%(objectname)" refs/tags/HEAD) &&
	! grep " want $tag_object" "$U.K"
'

test_expect_success 'clone using repo with gitfile as a reference' '
	git clone --separate-git-dir=L A M &&
	git clone --reference=M A N &&
	echo "$base_dir/L/objects" >expected &&
	test_cmp expected "$base_dir/N/.git/objects/info/alternates"
'

test_expect_success 'clone using repo pointed at by gitfile as reference' '
	git clone --reference=M/.git A O &&
	echo "$base_dir/L/objects" >expected &&
	test_cmp expected "$base_dir/O/.git/objects/info/alternates"
'

test_expect_success 'clone and dissociate from reference' '
	git init P &&
	(
		cd P && test_commit one
	) &&
	git clone P Q &&
	(
		cd Q && test_commit two
	) &&
	git clone --no-local --reference=P Q R &&
	git clone --no-local --reference=P --dissociate Q S &&
	# removing the reference P would corrupt R but not S
	rm -fr P &&
	test_must_fail git -C R fsck &&
	git -C S fsck
'
test_expect_success 'clone, dissociate from partial reference and repack' '
	rm -fr P Q R &&
	git init P &&
	(
		cd P &&
		test_commit one &&
		git repack &&
		test_commit two &&
		git repack
	) &&
	git clone --bare P Q &&
	(
		cd P &&
		git checkout -b second &&
		test_commit three &&
		git repack
	) &&
	git clone --bare --dissociate --reference=P Q R &&
	ls R/objects/pack/*.pack >packs.txt &&
	test_line_count = 1 packs.txt
'

test_expect_success 'clone, dissociate from alternates' '
	rm -fr A B C &&
	test_create_repo A &&
	commit_in A file1 &&
	git clone --reference=A A B &&
	test_line_count = 1 B/.git/objects/info/alternates &&
	git clone --local --dissociate B C &&
	! test -f C/.git/objects/info/alternates &&
	( cd C && git fsck )
'

test_expect_success 'setup repo with garbage in objects/*' '
	git init S &&
	(
		cd S &&
		test_commit A &&

		cd .git/objects &&
		>.some-hidden-file &&
		>some-file &&
		mkdir .some-hidden-dir &&
		>.some-hidden-dir/some-file &&
		>.some-hidden-dir/.some-dot-file &&
		mkdir some-dir &&
		>some-dir/some-file &&
		>some-dir/.some-dot-file
	)
'

test_expect_success 'clone a repo with garbage in objects/*' '
	for option in --local --no-hardlinks --shared --dissociate
	do
		git clone $option S S$option || return 1 &&
		git -C S$option fsck || return 1
	done &&
	find S-* -name "*some*" | sort >actual &&
	cat >expected <<-EOF &&
	S--dissociate/.git/objects/.some-hidden-dir
	S--dissociate/.git/objects/.some-hidden-dir/.some-dot-file
	S--dissociate/.git/objects/.some-hidden-dir/some-file
	S--dissociate/.git/objects/.some-hidden-file
	S--dissociate/.git/objects/some-dir
	S--dissociate/.git/objects/some-dir/.some-dot-file
	S--dissociate/.git/objects/some-dir/some-file
	S--dissociate/.git/objects/some-file
	S--local/.git/objects/.some-hidden-dir
	S--local/.git/objects/.some-hidden-dir/.some-dot-file
	S--local/.git/objects/.some-hidden-dir/some-file
	S--local/.git/objects/.some-hidden-file
	S--local/.git/objects/some-dir
	S--local/.git/objects/some-dir/.some-dot-file
	S--local/.git/objects/some-dir/some-file
	S--local/.git/objects/some-file
	S--no-hardlinks/.git/objects/.some-hidden-dir
	S--no-hardlinks/.git/objects/.some-hidden-dir/.some-dot-file
	S--no-hardlinks/.git/objects/.some-hidden-dir/some-file
	S--no-hardlinks/.git/objects/.some-hidden-file
	S--no-hardlinks/.git/objects/some-dir
	S--no-hardlinks/.git/objects/some-dir/.some-dot-file
	S--no-hardlinks/.git/objects/some-dir/some-file
	S--no-hardlinks/.git/objects/some-file
	EOF
	test_cmp expected actual
'

test_expect_success SYMLINKS 'setup repo with manually symlinked or unknown files at objects/' '
	git init T &&
	(
		cd T &&
		git config gc.auto 0 &&
		test_commit A &&
		git gc &&
		test_commit B &&

		cd .git/objects &&
		mv pack packs &&
		ln -s packs pack &&
		find ?? -type d >loose-dirs &&
		last_loose=$(tail -n 1 loose-dirs) &&
		mv $last_loose a-loose-dir &&
		ln -s a-loose-dir $last_loose &&
		first_loose=$(head -n 1 loose-dirs) &&
		rm -f loose-dirs &&

		cd $first_loose &&
		obj=$(ls *) &&
		mv $obj ../an-object &&
		ln -s ../an-object $obj &&

		cd ../ &&
		echo unknown_content >unknown_file
	) &&
	git -C T fsck &&
	git -C T rev-list --all --objects >T.objects
'


test_expect_success SYMLINKS 'clone repo with symlinked or unknown files at objects/' '
	# None of these options work when cloning locally, since T has
	# symlinks in its `$GIT_DIR/objects` directory
	for option in --local --no-hardlinks --dissociate
	do
		test_must_fail git clone $option T T$option 2>err || return 1 &&
		test_i18ngrep "symlink.*exists" err || return 1
	done &&

	# But `--shared` clones should still work, even when specifying
	# a local path *and* that repository has symlinks present in its
	# `$GIT_DIR/objects` directory.
	git clone --shared T T--shared &&
	git -C T--shared fsck &&
	git -C T--shared rev-list --all --objects >T--shared.objects &&
	test_cmp T.objects T--shared.objects &&
	(
		cd T--shared/.git/objects &&
		find . -type f | sort >../../../T--shared.objects-files.raw &&
		find . -type l | sort >../../../T--shared.objects-symlinks.raw
	) &&

	for raw in $(ls T*.raw)
	do
		sed -e "s!/../!/Y/!; s![0-9a-f]\{38,\}!Z!" -e "/commit-graph/d" \
		    -e "/multi-pack-index/d" <$raw >$raw.de-sha-1 &&
		sort $raw.de-sha-1 >$raw.de-sha || return 1
	done &&

	echo ./info/alternates >expected-files &&
	test_cmp expected-files T--shared.objects-files.raw &&
	test_must_be_empty T--shared.objects-symlinks.raw
'

test_done