Since b19138b (git-svn: Make it incrementally faster by minimizing temp
files, v1.6.0), git-svn has been using the Git.pm temp_acquire and
temp_release mechanism to avoid unnecessary temp file churn and provide
a speed boost.
However, that change introduced a call to temp_acquire inside the
Git::SVN::Fetcher::close_file function for an 'svn_hash' temp file.
Because an SVN::Pool is active at the time this function is called, if
the Git::temp_acquire function ends up actually creating a new
FileHandle for the temp file (which it will the first time it's called
with the name 'svn_hash') that FileHandle will end up in the SVN::Pool
and should that pool have SVN::Pool::clear called on it that FileHandle
will be closed out from under Git::temp_acquire.
Since the only call site to Git::temp_acquire with the name 'svn_hash'
is inside the close_file function, if an 'svn_hash' temp file is ever
created its FileHandle is guaranteed to be created in the active
SVN::Pool.
This has not been a problem in the past because the SVN::Pool was not
being cleared. However, since dfa72fdb (git-svn: reload RA every
log-window-size, v2.2.0) the pool has been getting cleared periodically
at which point the FileHandle for the 'svn_hash' temp file gets closed.
Any subsequent calls to Git::temp_acquire for 'svn_hash', however,
succeed without creating/opening a new temporary file since it still has
the now invalid FileHandle in its cache. Callers that then attempt to
use that FileHandle fail with an error.
We avoid this problem by making sure the 'svn_hash' temp file is created
in the same place the 'svn_delta_...' and 'git_blob_...' temp files are
(and then temp_release'd) so that it can be safely used inside the
close_file function without having its FileHandle end up in an SVN::Pool
that gets cleared.
Additionally the Git.pm cat_blob function creates a bidirectional pipe
FileHandle using the IPC::Open2::open2 function. If that handle is
created too late, it also gets caught up in the SVN::Pool and incorrectly
closed by the SVN::Pool::clear call. But this only seems to happen with
more recent versions of Perl and svn.
To avoid this problem we add an explicit call to _open_cat_blob_if_needed
before the first call to SVN::Pool->new_default to make sure the open2
handle does not end up in the SVN::Pool.
Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This avoids the following failure with normal "get_dir" on newer
versions of SVN (tested with SVN 1.8.8-1ubuntu3.1):
Incorrect parameters given: Could not convert '%ld' into a number
get_dir2 also has the potential to be more efficient by requesting
less data.
ref: <1414636504.45506.YahooMailBasic@web172304.mail.ir2.yahoo.com>
ref: <1414722617.89476.YahooMailBasic@web172305.mail.ir2.yahoo.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
Memoizing these initialization functions saves some memory for
long fetches which require scanning many unwanted revisions
before any wanted revisions happen.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
There is no reason to keep entries in the %revs hash after we're
done processing a revision, so allow entries become freed as
processing continues.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
This override was probably never necessary, but most likely a no-op
as it does not appear to do anything in SVN::Ra itself.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Despite attempting to use local memory pools everywhere we can,
(including our call to SVN::Ra::do_update and all subsequent reporter
calls), there does not appear to be a way to force the Git::SVN::Fetcher
callbacks to use a pool other than the per-SVN::Ra pool.
Git::SVN::Fetcher ends up using the main RA pool which grows
monotonically in size for the lifetime of the RA object.
Thus the only way to free that memory appears to be to destroy and
recreate the RA connection for at every --log-window-size interval.
This reduces memory usage over the course of fetching 10K revisions
using a test repository created with the script at the end of this
commit message.
As reported by time(1) on my x86-64 system:
before: 54024k
after: 28680k
Unfortunately, there remains some yet-to-be-tracked-down slow memory
growth which would be evident as the `nr' parameter increases in
the repository generation script:
-----------------------------8<------------------------------
set -e
tmp=$(mktemp -d svntestrepo-XXXXXXXX)
svnadmin create "$tmp"
repo=file://"$(cd $tmp && pwd)"
svn co "$repo" "$tmp/wd"
cd "$tmp/wd"
if ! test -f a
then
> a
svn add a
svn commit -m 'A'
fi
nr=10000
while test $nr -gt 0
do
echo $nr > a
svn commit -q -m A
nr=$((nr - 1))
done
echo "repository created in $repo"
-----------------------------8<------------------------------
Signed-off-by: Eric Wong <normalperson@yhbt.net>
git-svn used in combination with serf to talk to svn repository
served over HTTPS dumps core on termination.
This is caused by a bug in serf, and the most recent serf release
1.3.1 still exhibits the problem; a fix for the bug exists (see
https://code.google.com/p/serf/source/detail?r=2146).
Until the bug is fixed, work around the issue within the git perl
module Ra.pm by freeing the private copy of the remote access object
on termination, which seems to be sufficient to prevent the error
from happening.
Note: Since subversion-1.8.0 and later do require serf-1.2.1 or
later, this issue typically shows up when upgrading to a recent
version of subversion.
Credits go to Jonathan Lambrechts for proposing a fix to Ra.pm,
Evgeny Kotkov and Ivan Zhakov for fixing the issue in serf and
pointing me to that fix.
Signed-off-by: Uli Heller <uli.heller@daemons-point.com>
Tested-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
lexgrog(1) relies on the NAME section to find a manpage's subject's
name and description for easy access later using "man -k". Add the
section it expects.
Noticed using lintian.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
This originates from an msysgit pull request, see:
https://github.com/msysgit/git/pull/58
Signed-off-by: Eric Wieser <wieser.eric@gmail.com>
Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
The accessors should improve maintainability and enforce
consistent access to Git::SVN objects.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Go through all the spots that use the new add_path_to_url() to
make a new URL and canonicalize them.
* copyfrom_path has to be canonicalized else find_parent_branch
will get confused
* due to the `canonicalize_url($full_url) ne $full_url)` line of
logic in gs_do_switch(), $full_url is left alone until after.
At this point SVN 1.7 passes except for 3 tests in
t9100-git-svn-basic.sh that look like an SVN bug to do with
symlinks.
[ew: commit title]
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Remove the ad-hoc versions.
This is mostly to normalize the process and ensure the URLs produced
don't have double slashes or anything.
Also provides a place to fix the corner case where a file path
contains a percent sign.
[ew: commit title]
Signed-off-by: Eric Wong <normalperson@yhbt.net>
The old hand-rolled URL escape functions were inferior to
canonicalization functions.
Continuing to move towards getting everything canonicalizing the same way.
* Git::SVN->init_remote_config and Git::SVN::Ra->minimize_url both
have to canonicalize the same way else init_remote_config
will incorrectly think they're different URLs causing
t9107-git-svn-migrate.sh to fail.
[ew: commit title]
Signed-off-by: Eric Wong <normalperson@yhbt.net>
This canonicalizes paths and urls as early as possible so we don't
have to remember to do it at the point of use. It will fix a swath
of SVN 1.7 problems in one go.
Its ok to double canonicalize things.
SVN 1.7 still fails, still not worrying about that.
[ew: commit title]
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Later it can canonicalize automatically.
A later change will make other things use the accessor.
No functional change.
[ew: commit title, reformatted accessor to match existing style]
Signed-off-by: Eric Wong <normalperson@yhbt.net>
This slices off another 600 or so lines from the frighteningly long
git-svn.perl script.
The Git::SVN::Ra interface is similar enough to SVN::Ra that it is
probably safe to ignore most of its implementation on first reading.
(Documenting or moving functions that do not fit that pattern is left
as an exercise to the interested reader.)
[ew: rebased and fixed conflict against
commit c26ddce86d
(git-svn: platform auth providers are working only on 1.6.15 or newer)]
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>