"git clone --dissociate" runs a big "git repack" process at the
end, and it helps to close file descriptors that are open on the
packs and their idx files before doing so on filesystems that
cannot remove a file that is still open.
* js/clone-dissociate:
clone --dissociate: avoid locking pack files
sha1_file.c: add a function to release all packs
sha1_file: consolidate code to close a pack's file descriptor
t5700: demonstrate a Windows file locking issue with `git clone --dissociate`
When `git clone` is asked to dissociate the repository from the
reference repository whose objects were used, it is quite possible that
the pack files need to be repacked. In that case, the pack files need to
be deleted that were originally hard-links to the reference repository's
pack files.
On platforms where a file cannot be deleted if another process still
holds a handle on it, we therefore need to take pains to release all
pack files and indexes before dissociating.
This fixes https://github.com/git-for-windows/git/issues/446
The test case to demonstrate the breakage technically does not need to
be run on Linux or MacOSX. It won't hurt, either, though.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Windows, dissociating from a reference can fail very easily due to
pack files that are still in use when they want to be removed.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The add_to_alternates_file function blindly uses
hold_lock_file_for_append to copy the existing contents, and
then adds the new line to it. This has two minor problems:
1. We might add duplicate entries, which are ugly and
inefficient.
2. We do not check that the file ends with a newline, in
which case we would bogusly append to the final line.
This is quite unlikely in practice, though, as we call
this function only from git-clone, so presumably we are
the only writers of the file (and we always add a
newline).
Instead of using hold_lock_file_for_append, let's copy the
file line by line, which ensures all records are properly
terminated. If we see an extra line, we can simply abort the
update (there is no point in even copying the rest, as we
know that it would be identical to the original).
As a bonus, we also get rid of some calls to the
static-buffer mkpath and git_path functions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The early part of this test is rather old, and does not
follow our usual style guidelines. In particular:
- the tests liberally chdir, and expect out-of-test "cd"
commands to return them to a sane state
- test commands aren't indented at all
- there are a lot of minor formatting nits, like the
opening quote of the test block on the wrong line,
spaces after ">", etc
This patch fixes the style issues, and uses a few helper
functions, along with subshells and "git -C", to avoid
changing the cwd of the main script.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While use of the --reference option to borrow objects from an
existing local repository of the same project is an effective way to
reduce traffic when cloning a project over the network, it makes the
resulting "borrowing" repository dependent on the "borrowed"
repository. After running
git clone --reference=P $URL Q
the resulting repository Q will be broken if the borrowed repository
P disappears.
The way to allow the borrowed repository to be removed is to repack
the borrowing repository (i.e. run "git repack -a -d" in Q); while
power users may know it very well, it is not easily discoverable.
Teach a new "--dissociate" option to "git clone" to run this
repacking for the user.
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Try reading gitfile files when processing --reference options to clone.
This will allow, among other things, using a submodule checked out with
a recent version of git as a reference repository without requiring the
user to have internal knowledge of submodule layout.
Signed-off-by: Aaron Schrab <aaron@schrab.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some test scripts use the GIT_TRACE mechanism to dump
debugging information to descriptor 3 (and point it to a
file using the shell). On Windows, however, bash is unable
to set up descriptor 3. We do not write our trace to the
file, and worse, we may interfere with other operations
happening on descriptor 3, causing tests to fail or even
behave inconsistently.
Prior to commit 97a83fa (upload-pack: remove packet debugging
harness), these tests used GIT_DEBUG_SEND_PACK, which only
supported output to a descriptor. The tests in t5503 were
always broken on Windows, and were marked to be skipped via
the NOT_MINGW prerequisite. In t5700, the tests used to pass
prior to 97a83fa, but only because they were not careful
enough; because we only grepped the trace file, an empty
file looked successful to us. But post-97a83fa, the writing
to descriptor 3 causes "git fetch" to hang (presumably
because we are throwing random bytes into the middle of the
protocol).
Now that we are using the GIT_TRACE mechanism, we can
improve both scripts by asking git to write directly to a
file rather than a descriptor. That fixes the hang in t5700,
and should allow t5503 to successfully run on Windows.
In both cases we now also use "test -s" to double-check that
our trace file actually contains output, which should reduce
the possibility of an erroneously passing test.
Signed-off-by: Jeff King <peff@peff.net>
Tested-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you set the GIT_DEBUG_SEND_PACK environment variable,
upload-pack will dump lines it receives in the receive_needs
phase to a descriptor. This debugging harness is a strict
subset of what GIT_TRACE_PACKET can do. Let's just drop it
in favor of that.
A few tests used GIT_DEBUG_SEND_PACK to confirm which
objects get sent; we have to adapt them to the new output
format.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prefer:
test_line_count <OP> COUNT FILE
over:
test $(wc -l <FILE) <OP> COUNT
(or similar usages) in several tests.
Signed-off-by: Stefano Lattarini <stefano.lattarini@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Objects in an alternate object database are already available to the
local repository and therefore don't need to be fetched. So mark them
as complete in everything_local().
This fixes a test in t5700.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If an alternate supplies some, but not all, of the objects needed for
a fetch, fetch-pack nevertheless generates "want" lines for the
alternate objects that are present. Demonstrate this problem via a
failing test.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The add_extra_ref() interface is used to add an extra-ref that is _not_
our ref for the purpose of helping auto-following of tags and reducing
object transfer from remote repository, and they are typically formatted
as a tagname followed by ^{} to make sure no valid refs match that
pattern. In other words, these entries are deliberately formatted not to
pass check-refname-format test.
A recent series however added a test unconditionally to the add_ref()
function that is called from add_extra_ref(). The check may be sensible
for other two callsites of the add_ref() interface, but definitely is
a wrong thing to do in add_extra_ref(). Disable it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
In tests, call test_cmp rather than raw diff where possible (i.e. if
the output does not go to a pipe), to allow the use of, say, 'cmp'
when the default 'diff -u' is not compatible with a vendor diff.
When that is not possible, use $DIFF, as set in GIT-BUILD-OPTIONS.
Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* db/clone-in-c:
Add test for cloning with "--reference" repo being a subset of source repo
Add a test for another combination of --reference
Test that --reference actually suppresses fetching referenced objects
clone: fall back to copying if hardlinking fails
builtin-clone.c: Need to closedir() in copy_or_link_directory()
builtin-clone: fix initial checkout
Build in clone
Provide API access to init_db()
Add a function to set a non-default work tree
Allow for having for_each_ref() list extra refs
Have a constant extern refspec for "--tags"
Add a library function to add an alternate to the alternates file
Add a lockfile function to append to a file
Mark the list of refs to fetch as const
Conflicts:
cache.h
t/t5700-clone-reference.sh
The first test in this series tests "git clone -l -s --reference B A C",
where repo B is a superset of repo A (A has one commit, B has the same
commit plus another). In this case, all objects to be cloned are already
present in B.
However, we should also test the case where the "--reference" repo is a
_subset_ of the source repo (e.g. "git clone -l -s --reference A B C"),
i.e. some objects are not available in the "--reference" repo, and will
have to be found in the source repo.
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this case, the reference repository has some useful loose objects,
but not all useful objects, and we make sure that we can find the
objects we fetch from the repository we're cloning in the new
repository, instead of potentially being distracted by the reference
repository.
Doing the wrong thing in a builtin-clone implementation would lead to
this looking for an object in the wrong place, not finding it (because
it's only in the right place), and crashing.
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes the remainder of the issues where the test script itself is at
fault for failing when the git checkout path contains whitespace or other
shell metacharacters.
The majority of git svn tests used the idiom
test_expect_success "title" "test script using $svnrepo"
These were changed to have the test script in single-quotes:
test_expect_success "title" 'test script using "$svnrepo"'
which unfortunately makes the patch appear larger than it really is.
One consequence of this change is that in the verbose test output the
value of $svnrepo (and in some cases other variables, too) is no
longer expanded, i.e. previously we saw
* expecting success:
test script using /path/to/git/t/trash/svnrepo
but now it is:
* expecting success:
test script using "$svnrepo"
Signed-off-by: Bryan Donlan <bdonlan@fushizen.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When doing "git fetch <remote>" on a remote that does not have the
branch referenced in branch.<current-branch>.merge, git fetch failed.
It failed because it tried to add the "merge" ref to the refs to be
fetched.
Fix that. And add a test case.
Incidentally, this unconvered a bug in our own test suite, where
"git pull <some-path>" was expected to merge the ref given in the
defaults, even if not pulling from the default remote.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This changes the behaviour of cloning from a repository on the
local machine, by defaulting to "-l" (use hardlinks to share
files under .git/objects) and making "-l" a no-op. A new
option, --no-hardlinks, is also added to cause file-level copy
of files under .git/objects while still avoiding the normal
"pack to pipe, then receive and index pack" network transfer
overhead. The old behaviour of local cloning without -l nor -s
is availble by specifying the source repository with the newly
introduced file:///path/to/repo.git/ syntax (i.e. "same as
network" cloning).
* With --no-hardlinks (i.e. have all .git/objects/ copied via
cpio) would not catch the source repository corruption, and
also risks corrupted recipient repository if an
alpha-particle hits memory cell while indexing and resolving
deltas. As long as the recipient is created uncorrupted, you
have a good back-up.
* same-as-network is expensive, but it would catch the breakage
of the source repository. It still risks corrupted recipient
repository due to hardware failure. As long as the recipient
is created uncorrupted, you have a good back-up.
* The new default on the same filesystem, as long as the source
repository is healthy, it is very likely that the recipient
would be, too. Also it is very cheap. You do not get any
back-up benefit, though.
None of the method is resilient against the source repository
corruption, so let's discount that from the comparison. Then
the difference with and without --no-hardlinks matters primarily
if you value the back-up benefit or not. If you want to use the
cloned repository as a back-up, then it is cheaper to do a clone
with --no-hardlinks and two git-fsck (source before clone,
recipient after clone) than same-as-network clone, especially as
you are likely to do a git-fsck on the recipient if you are so
paranoid anyway.
Which leads me to believe that being able to use file:/// is
probably a good idea, if only for testability, but probably of
little practical value. We default to hardlinked clone for
everyday use, and paranoids can use --no-hardlinks as a way to
make a back-up.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using --reference to borrow objects from a neighbouring
repository while cloning, we copy the entire set of refs under
temporary "refs/reference-tmp/refs" space and set up the object
alternates. However, a textual symref copied this way would not
point at the right place, and causes later steps to emit error
messages (which is harmless but still alarming). This is most
visible when using a clone created with the separate-remote
layout as a reference, because such a repository would have
refs/remotes/origin/HEAD with 'ref: refs/remotes/origin/master'
as its contents.
Although we do not create symbolic-link based refs anymore, they
have the same problem because they are always supposed to be
relative to refs/ hierarchy (we dereference by hand, so it only
is good for HEAD and nothing else).
In either case, the solution is simply to remove them after
copying under refs/reference-tmp; if a symref points at a true
ref, that true ref itself is enough to ensure that objects
reachable from it do not needlessly get fetched.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This reverts commit 9b088c4e39.
Protecting 'mature' objects does not make it any safer. We should
admit that git-prune is inherently unsafe when run in parallel with
other operations without involving unwarranted locking overhead,
and with the latest git, even rebase and reset would not immediately
create crufts anyway.
This option gives grace period to objects that are unreachable
from the refs from getting pruned.
The default value is 24 hours and may be changed using
gc.prunegrace.
Signed-off-by: Matthias Lederhofer <matled@gmx.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The only visible change is that git-blame doesn't understand
"--compability" anymore, but it does accept "--compatibility" instead,
which is already documented.
Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Both -l -s and --reference update objects/info/alternates and used
to write over each other.
Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>