Commit Graph

15 Commits

Author SHA1 Message Date
Peter Baumann
61b472ed8b git svn: reset invalidates the memoized mergeinfo caches
Since v1.7.0-rc2~11 (git-svn: persistent memoization, 2010-01-30),
git-svn has maintained some private per-repository caches in
.git/svn/.caches to avoid refetching and recalculating some
mergeinfo-related information with every 'git svn fetch'.

This memoization can cause problems, e.g consider the following case:

SVN repo:

  ... - a - b - c - m  <- trunk
          \        /
            d  -  e    <- branch1

The Git import of the above repo is at commit 'a' and doesn't know about
the branch1. In case of an 'git svn rebase', only the trunk of the
SVN repo is imported. During the creation of the git commit 'm', git svn
uses the svn:mergeinfo property and tries to find the corresponding git
commit 'e' to create 'm' with 'c' and 'e' as parents. But git svn rebase
only imports the current branch so commit 'e' is not imported.
Therefore git svn fails to create commit 'm' as a merge commit, because one
of its parents is not known to git. The imported history looks like this:

  ... - a - b - c - m  <- trunk

A later 'git svn fetch' to import all branches can't rewrite the commit 'm'
to add 'e' as a parent and to make it a real git merge commit, because it
was already imported.

That's why the imported history misses the merge and looks like this:

  ... - a - b - c - m  <- trunk
          \
            d  -  e    <- branch1

Right now the only known workaround for importing 'm' as a merge is to
force reimporting 'm' again from SVN, e.g. via

  $ git svn reset --revision $(git find-rev $c)
  $ git svn fetch

Sadly, this is where the behavior has regressed: git svn reset doesn't
invalidate the old mergeinfo cache, which is no longer valid for the
reimport, which leads to 'm' beeing imprted with only 'c' as parent.

As solution to this problem, this commit invalidates the mergeinfo cache
to force correct recalculation of the parents.

During development of this patch, several ways for invalidating the cache
where considered. One of them is to use Memoize::flush_cache, which will
call the CLEAR method on the underlying Memoize persistency implementation.
Sadly, neither Memoize::Storable nor the newer Memoize::YAML module
introduced in 68f532f4ba could optionally be used implement the
CLEAR method, so this is not an option.

Reseting the internal hash used to store the memoized values has the same
problem, because it calls the non-existing CLEAR method of the
underlying persistency layer, too.

Considering this and taking into account the different implementations
of the memoization modules, where Memoize::Storable is not in our control,
implementing the missing CLEAR method is not an option, at least not if
Memoize::Storable is still used.

Therefore the easiest solution to clear the cache is to delete the files
on disk in 'git svn reset'. Normally, deleting the files behind the back
of the memoization module would be problematic, because the in-memory
representation would still exist and contain wrong data. Fortunately, the
memoization is active in memory only for a small portion of the code.
Invalidating the cache by deleting the files on disk if it isn't active
should be safe.

Signed-off-by: Peter Baumann <waste.manager@gmx.de>
Signed-off-by: Steven Walter <stevenrwalter@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-08-10 19:53:18 +00:00
Michael G. Schwern
3d9be15fc2 Extract Git::SVN::GlobSpec from git-svn.
Straight cut & paste.  That's the last class.

* Make Git::SVN load it on its own, its the only thing that needs it.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-07-27 22:36:19 +00:00
Michael G. Schwern
10c2aa5928 Move Git::IndexInfo into its own file.
Straight cut & paste.  Didn't require any fixing.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-07-27 22:36:17 +00:00
Michael G. Schwern
b772cb9994 Extract Git::SVN::Migration from git-svn.
Straight cut & paste.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-07-27 22:36:14 +00:00
Michael G. Schwern
b74fda1c9b Extract Git::SVN::Log from git-svn.
Straight cut & paste.

Also noticed Git::SVN::Ra wasn't in the compile test.  It is now.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-07-27 22:36:06 +00:00
Michael G. Schwern
5c71028fce Move initialization of Git::SVN variables into Git::SVN.
Also it can compile on its own now, yay!

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-07-27 22:14:54 +00:00
Michael G. Schwern
29499c0b27 Extract Git::SVN from git-svn into its own .pm file.
Except for adding the 1; at the end, this is a straight copy & paste.

Tests still pass, but its doubtful Git::SVN will compile on its own
without git-svn being loaded.  Next commit will fix that.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-07-27 22:14:53 +00:00
Michael G. Schwern
c2768fa152 Extract some utilities from git-svn to allow extracting Git::SVN.
Put them in a new module called Git::SVN::Utils.  Yeah, not terribly
original and it will be a dumping ground.  But its better than having
them in the main git-svn program.  At least they can be documented
and tested.

* fatal() is used by many classes.
* Change the $can_compress lexical into a function.

This should be enough to extract Git::SVN.

Signed-off-by: Michael G. Schwern <schwern@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-07-27 22:14:50 +00:00
Jonathan Nieder
68f532f4ba git-svn: use YAML format for mergeinfo cache when possible
Since v1.7.0-rc2~11 (git-svn: persistent memoization, 2010-01-30),
git-svn has maintained some private per-repository caches in
.git/svn/.caches to avoid refetching and recalculating some
mergeinfo-related information with every "git svn fetch".

These caches use the 'nstore' format from the perl core module
Storable, which can be read and written quickly and was designed for
transfer over the wire (the 'n' stands for 'network').  This format is
endianness-independent and independent of floating-point
representation.

Unfortunately the format is *not* independent of the perl version ---
new perl versions will write files that very old perl cannot read.
Worse, the format is not independent of the size of a perl integer.
So if you toggle perl's use64bitint compile-time option, then using
'git svn fetch' on your old repositories produces errors like this:

	Byte order is not compatible at ../../lib/Storable.pm (autosplit
	into ../../lib/auto/Storable/_retrieve.al) line 380, at
	/usr/share/perl/5.12/Memoize/Storable.pm line 21

That is, upgrading perl to a version that uses use64bitint for the
first time makes git-svn suddenly refuse to fetch in existing
repositories.  Removing .git/svn/.caches lets git-svn recover.

It's time to switch to a platform independent serializer backend with
better compatibility guarantees.  This patch uses YAML::Any.

Other choices were considered:

 - thawing data from Data::Dumper involves "eval".  Doing that without
   creating a security risk is fussy.

 - the JSON API works on scalars in memory and doesn't provide a
   standard way to serialize straight to disk.

YAML::Any is reasonably fast and has a pleasant API.  In most
backends, LoadFile() reads the entire file into a scalar anyway and
converts it as a second step, but having an interface that allows the
deserialization to happen on the fly without a temporary is still a
comfort.

YAML::Any is not a core perl module, so we take care to use it when
and only when it is available.  Installations without that module
should fall back to using Storable with all its quirks, keeping their
cache files in

	.git/svn/.caches/*.db

Installations with YAML peacefully coexist by keeping a separate set
of cache files in

	.git/svn/.caches/*.yaml.

In most cases, switching between is a one-time thing, so it doesn't
seem worth the complication to migrate existing caches.

The upshot: after this patch, as long as YAML::Any is installed you
can move your git repository between machines with different perl
installations and "git svn fetch" will work fine.  If you do not have
YAML::Any, the behavior is unchanged (and in particular does not get
any worse).

Reported-by: Sandro Weiser <sandro.weiser@informatik.tu-chemnitz.de>
Reported-by: Bdale Garbee <bdale@gag.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-06-10 08:47:53 +00:00
Jonathan Nieder
9f7ad1479d git-svn: make Git::SVN::RA a separate file
This slices off another 600 or so lines from the frighteningly long
git-svn.perl script.

The Git::SVN::Ra interface is similar enough to SVN::Ra that it is
probably safe to ignore most of its implementation on first reading.
(Documenting or moving functions that do not fit that pattern is left
as an exercise to the interested reader.)

[ew: rebased and fixed conflict against
 commit c26ddce86d
 (git-svn: platform auth providers are working only on 1.6.15 or newer)]

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-06-10 08:47:50 +00:00
Jonathan Nieder
8f9facfe94 git-svn: make Git::SVN::Editor a separate file
This makes the git-svn script shorter and less scary for beginners to
read through for the first time.  Take the opportunity to explain the
purpose and basic interface of the Git::SVN::Editor class while at it.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-06-10 08:45:56 +00:00
Jonathan Nieder
a6180325e8 git-svn: make Git::SVN::Fetcher a separate file
This patch removes a chunk of code (the Git::SVN::Fetcher consumer of
libsvn's tree delta protocol) from git-svn.perl and documents its
interface so the hurried reader does not have to read that code right
away.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-05-29 00:17:59 +00:00
Jonathan Nieder
c102f4cf72 git-svn: move Git::SVN::Prompt into its own file
git-svn.perl is very long (around 6500 lines) and although it is
nicely split into modules, some new readers do not even notice --- it
is too distracting to see all this functionality collected in a single
file.

Splitting it into multiple files would make it easier for people
to read individual modules straight through and to experiment with
components separately.

Let's start with Git::SVN::Prompt.  For simplicity, we install this as
a module in the standard search path, just like the existing Git and
Git::I18N modules.  In the process, add a manpage explaining its
interface and that it is not likely to be useful for other projects to
avoid confusion.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-05-29 00:17:59 +00:00
Ævar Arnfjörð Bjarmason
3e9c6a08c8 Git::I18N: compatibility with perl <5.8.3
Change the Exporter invocation in Git::I18N to be compatible with
5.8.0 to 5.8.2 inclusive. Before Exporter 5.57 (released with 5.8.3)
Exporter didn't export the 'import' subroutine.

Reported-by: Tom G. Christensen <tgc@statsbiblioteket.dk>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-10 13:25:41 -08:00
Ævar Arnfjörð Bjarmason
5e9637c629 i18n: add infrastructure for translating Git with gettext
Change the skeleton implementation of i18n in Git to one that can show
localized strings to users for our C, Shell and Perl programs using
either GNU libintl or the Solaris gettext implementation.

This new internationalization support is enabled by default. If
gettext isn't available, or if Git is compiled with
NO_GETTEXT=YesPlease, Git falls back on its current behavior of
showing interface messages in English. When using the autoconf script
we'll auto-detect if the gettext libraries are installed and act
appropriately.

This change is somewhat large because as well as adding a C, Shell and
Perl i18n interface we're adding a lot of tests for them, and for
those tests to work we need a skeleton PO file to actually test
translations. A minimal Icelandic translation is included for this
purpose. Icelandic includes multi-byte characters which makes it easy
to test various edge cases, and it's a language I happen to
understand.

The rest of the commit message goes into detail about various
sub-parts of this commit.

= Installation

Gettext .mo files will be installed and looked for in the standard
$(prefix)/share/locale path. GIT_TEXTDOMAINDIR can also be set to
override that, but that's only intended to be used to test Git itself.

= Perl

Perl code that's to be localized should use the new Git::I18n
module. It imports a __ function into the caller's package by default.

Instead of using the high level Locale::TextDomain interface I've
opted to use the low-level (equivalent to the C interface)
Locale::Messages module, which Locale::TextDomain itself uses.

Locale::TextDomain does a lot of redundant work we don't need, and
some of it would potentially introduce bugs. It tries to set the
$TEXTDOMAIN based on package of the caller, and has its own
hardcoded paths where it'll search for messages.

I found it easier just to completely avoid it rather than try to
circumvent its behavior. In any case, this is an issue wholly
internal Git::I18N. Its guts can be changed later if that's deemed
necessary.

See <AANLkTilYD_NyIZMyj9dHtVk-ylVBfvyxpCC7982LWnVd@mail.gmail.com> for
a further elaboration on this topic.

= Shell

Shell code that's to be localized should use the git-sh-i18n
library. It's basically just a wrapper for the system's gettext.sh.

If gettext.sh isn't available we'll fall back on gettext(1) if it's
available. The latter is available without the former on Solaris,
which has its own non-GNU gettext implementation. We also need to
emulate eval_gettext() there.

If neither are present we'll use a dumb printf(1) fall-through
wrapper.

= About libcharset.h and langinfo.h

We use libcharset to query the character set of the current locale if
it's available. I.e. we'll use it instead of nl_langinfo if
HAVE_LIBCHARSET_H is set.

The GNU gettext manual recommends using langinfo.h's
nl_langinfo(CODESET) to acquire the current character set, but on
systems that have libcharset.h's locale_charset() using the latter is
either saner, or the only option on those systems.

GNU and Solaris have a nl_langinfo(CODESET), FreeBSD can use either,
but MinGW and some others need to use libcharset.h's locale_charset()
instead.

=Credits

This patch is based on work by Jeff Epler <jepler@unpythonic.net> who
did the initial Makefile / C work, and a lot of comments from the Git
mailing list, including Jonathan Nieder, Jakub Narebski, Johannes
Sixt, Erik Faye-Lund, Peter Krefting, Junio C Hamano, Thomas Rast and
others.

[jc: squashed a small Makefile fix from Ramsay]

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-05 20:46:55 -08:00