93 Commits

Author SHA1 Message Date
Martin Langhoff
6211988f77 cvsimport: skip commits that are too recent
With this patch, cvsimport will skip commits made
in the last 10 minutes. The recency threshold is
5 minutes plus the cvsps fuzz window (5 minutes by default).

When working with a CVS repository that is in use,
importing commits that are too recent can lead to
partially incorrect trees. This is mainly due to

 - Commits that are within the cvsps fuzz window may later
   be found to have affected more files.

 - When performing incremental imports, clock drift between
   the systems may lead to skipped commits.

This commit helps keep incremental imports of in-use
CVS repositories sane.
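
The cutoff described above can be sketched as follows. This is a minimal
Python illustration, not the Perl code from the patch; the function and
constant names are hypothetical, and the two 300-second values only restate
the "5 minutes + cvsps fuzz window (5 minutes default)" rule.

```python
import time

SAFETY_MARGIN_SECONDS = 300  # the fixed 5 minutes from the message
DEFAULT_FUZZ_SECONDS = 300   # cvsps fuzz window, 5 minutes by default


def is_too_recent(commit_time, now=None, fuzz=DEFAULT_FUZZ_SECONDS):
    """Return True if a commit falls inside the 'too recent' window.

    commit_time and now are Unix timestamps in seconds.
    """
    if now is None:
        now = time.time()
    return now - commit_time < SAFETY_MARGIN_SECONDS + fuzz
```

With the defaults, a commit made 9 minutes ago would be skipped and picked up
by a later incremental import, while one made 10 minutes ago or earlier would
be imported.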

Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-07 18:06:49 -08:00
Junio C Hamano
86d11cf264 cvsimport: style fixup.
This should not change any functionality, but just makes it readable by
having a space between syntactic construct keyword and open parenthesis
(e.g. "if (expr", not "if(expr") and between close parenthesis and open
brace (e.g. "if (expr) {" not "if (expr){").

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-27 14:21:30 -08:00
Iñaki Arenaza
73bcf53342 git-cvsimport: add support for CVS pserver method HTTP/1.x proxying
This patch adds support for 'proxy' and 'proxyport' connection options
when using the pserver method for the CVS Root.

It has been tested with a Squid 2.5.x proxy server.

Quoting from the CVS info manual:

     The `gserver' and `pserver' connection methods all accept optional
  method options, specified as part of the METHOD string, like so:

       :METHOD[;OPTION=ARG...]:

     Currently, the only two valid connection options are `proxy', which
  takes a hostname as an argument, and `proxyport', which takes a port
  number as an argument.  These options can be used to connect via an HTTP
  tunnel style web proxy.  For example, to connect pserver via a web proxy
  at www.myproxy.net and port 8000, you would use a method of:

       :pserver;proxy=www.myproxy.net;proxyport=8000:

     *NOTE: The rest of the connection string is required to connect to
  the server as noted in the upcoming sections on password authentication,
  gserver and kserver.  The example above would only modify the METHOD
  portion of the repository name.*

     PROXY must be supplied to connect to a CVS server via a proxy
  server, but PROXYPORT will default to port 8080 if not supplied.
  PROXYPORT may also be set via the CVS_PROXY_PORT environment variable.
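
As an illustration of the METHOD syntax quoted above, here is a minimal
Python sketch that splits the method string into its name and options. It is
not the parsing code from the patch; the function names are hypothetical, and
the CVS_PROXY_PORT environment variable is deliberately left out.

```python
def parse_cvs_method(method):
    """Split a ':METHOD[;OPTION=ARG...]:' string into (name, options)."""
    body = method.strip(":")
    name, *raw_options = body.split(";")
    options = {}
    for raw in raw_options:
        key, _, value = raw.partition("=")
        options[key] = value
    return name, options


def proxy_endpoint(options, default_port=8080):
    """Return (host, port) for the proxy, or None if no proxy was given.

    The port falls back to 8080, as the manual text above describes.
    CVS_PROXY_PORT is not consulted in this sketch.
    """
    if "proxy" not in options:
        return None
    return options["proxy"], int(options.get("proxyport", default_port))
```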

Signed-off-by: Iñaki Arenaza <iarenuno@eteo.mondragon.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-24 02:21:29 -08:00
Jim Meyering
efe4abd14c Run "git repack -a -d" once more at end, if there's 1MB or more of not-packed data.
Although I converted upstream coreutils to git last month, I just
reconverted coreutils once again, as a test, and ended up with a
git repository of about 130MB (contrast with my packed git repo of
size 52MB).  That was because there were a lot of commits (but < 1024)
after the final automatic "git-repack -a -d".

Running a final
  git-repack -a -d && git-prune-packed
cut the final repository size down to the expected size.

So this looks like an easy way to improve git-cvsimport.
Just run "git repack ..." at the end if there's more than
some reasonable amount of not-packed data.

My choice of 1MB is a little arbitrary.  I wouldn't mind missing
the minimal repo size by 1MB.  At the other end of the spectrum,
it's probably not worthwhile to pack everything when the total
repository size is less than 1MB.
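
The "repack at the end if enough data is unpacked" idea can be sketched as
below. This is a Python illustration under assumptions, not the patch itself:
it assumes the amount of unpacked data is read from the "N objects, M
kilobytes" line printed by "git count-objects", and the helper names are
hypothetical.

```python
import re
import subprocess

THRESHOLD_KB = 1024  # 1MB, the threshold chosen in the message


def unpacked_kilobytes(count_objects_output):
    """Extract M from the 'N objects, M kilobytes' summary line."""
    match = re.search(r"(\d+) objects, (\d+) kilobytes", count_objects_output)
    if match is None:
        raise ValueError("unexpected count-objects output")
    return int(match.group(2))


def needs_final_repack(count_objects_output, threshold_kb=THRESHOLD_KB):
    """True when at least threshold_kb of loose-object data remains."""
    return unpacked_kilobytes(count_objects_output) >= threshold_kb


def final_repack_if_needed():
    """Run one more 'git repack -a -d' if enough data is still unpacked."""
    result = subprocess.run(
        ["git", "count-objects"], check=True, capture_output=True, text=True
    )
    if needs_final_repack(result.stdout):
        subprocess.run(["git", "repack", "-a", "-d"], check=True)
```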

Signed-off-by: Jim Meyering <jim@meyering.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-15 12:50:29 -08:00
Andy Whitcroft
1f24c58724 cvsimport: move over to using git-for-each-ref to read refs.
cvsimport opens all of the files in $GIT_DIR/refs/heads and reads
out the sha1's in order to work out what time the last commit on
that branch was made (in CVS) thus allowing incremental updates.
However, this takes no account of hierarchical ref naming, producing
the following error for each directory in $GIT_DIR/refs:

  Use of uninitialized value in chomp at /usr/bin/git-cvsimport line 503.
  Use of uninitialized value in concatenation (.) or string at
					/usr/bin/git-cvsimport line 505.
  usage: git-cat-file [-t|-s|-e|-p|<type>] <sha1>

Take advantage of the new packed refs work to use the new
for-each-ref iterator to get this information.
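
A rough Python sketch of reading branch heads through git-for-each-ref rather
than opening files under $GIT_DIR/refs/heads. The format string and helper
names are assumptions made for this illustration; the patch may use a
different format.

```python
import subprocess

# One "<sha1> <refname>" pair per output line.
REF_FORMAT = "%(objectname) %(refname)"


def parse_ref_lines(output, prefix="refs/heads/"):
    """Map branch name -> sha1, keeping hierarchical names such as a/b."""
    heads = {}
    for line in output.splitlines():
        sha1, _, refname = line.partition(" ")
        if refname.startswith(prefix):
            heads[refname[len(prefix):]] = sha1
    return heads


def read_branch_heads():
    """Ask git for all branch heads, whether loose or packed."""
    result = subprocess.run(
        ["git", "for-each-ref", "--format=" + REF_FORMAT, "refs/heads"],
        check=True, capture_output=True, text=True,
    )
    return parse_ref_lines(result.stdout)
```

Because git itself walks the ref namespace, nested names and packed refs are
both covered without the caller touching any directory.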

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-20 10:21:46 -07:00
Martin Langhoff
c5f448b0f2 cvsimport - cleanup of the multi-indexes handling
Indexes are only needed when we are about to commit. Prime them
inside commit() when we have all the info we need, and remove all the
redundant index setups.

While we are at it, make sure that index handling is correct when opening
new branches, and on initial import.

Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-28 03:53:37 -07:00
Johannes Schindelin
061303f0b5 cvsimport: always set $ENV{GIT_INDEX_FILE} to $index{$branch}
Also, make sure that the initial git-read-tree is performed.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2006-06-24 20:08:25 -07:00
Martin Langhoff
7ccd9009ac cvsimport: setup indexes correctly for ancestors and incremental imports
Two bugs had slipped in the "keep one index per branch during import"
patch. Both incremental imports and new branches would see an
empty tree for their initial commit. Now we cover all the relevant
cases, checking whether we actually need to setup the index before
preparing the actual commit, and doing it.

Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-24 05:30:06 -07:00
Martin Langhoff
8f732649bc cvsimport: keep one index per branch during import
With this patch we have a speedup and much lower IO when
importing trees with many branches. Instead of forcing
index re-population for each branch switch, we keep
many index files around, one per branch.
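
The one-index-per-branch idea can be sketched like this. It is a Python
illustration only: GIT_INDEX_FILE is git's real environment variable for
selecting an index file, but the helper names, the cache dictionary and the
file naming scheme are hypothetical.

```python
import os


def index_file_for(branch, index_dir, cache):
    """Return the index path for a branch, allocating one on first use."""
    if branch not in cache:
        # A counter avoids collisions between names such as a/b and a-b.
        cache[branch] = os.path.join(index_dir, "index-%d" % len(cache))
    return cache[branch]


def git_env_for(branch, index_dir, cache):
    """Build an environment that points git at the branch's own index."""
    env = dict(os.environ)
    env["GIT_INDEX_FILE"] = index_file_for(branch, index_dir, cache)
    return env
```

Switching branches then only changes which index file git is told to use;
an existing index does not have to be repopulated.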

Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-16 22:45:11 -07:00
Martin Langhoff
2f57c69792 cvsimport: complete the cvsps run before starting the import
We now capture the output of cvsps to a tempfile, and then read it in.
cvsps 2.1 works largely "in memory", and only prints its patchset
info once it has finished talking with cvs, apparently while still
holding on to all that memory. With this patch, cvsps has finished and
been reaped before cvsimport starts working (and growing). So the
footprint of the whole process is much lower.

Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-16 22:45:11 -07:00
Martin Langhoff
71b0814836 cvsimport: ignore CVSPS_NO_BRANCH and impossible branches
cvsps output often contains references to CVSPS_NO_BRANCH, commits
that it could not trace to a branch. Ignore that branch.

Additionally, cvsps will sometimes draw circular relationships
between branches -- where two branches are recorded as opening
from the other.  In those cases, and where the ancestor branch
hasn't been seen, ignore it.

Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-16 22:45:11 -07:00
Jeff King
e49289dfb7 cvsimport: avoid "use" with :tag
Avoid "use POSIX qw(strftime dup2 :errno_h)"; it was reported
that a Perl installation on Mandrake 9.1 did not like it, even
though it understood "use POSIX qw(:errno_h)".  Funny.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-25 00:18:42 -07:00
Jeff King
62bf0d9629 cvsimport: set up commit environment in perl instead of using env
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-23 16:43:12 -07:00
Junio C Hamano
61efa5e300 cvsimport: do not barf on creation of an empty file.
When the server says "created this file whose length is empty",
we mistakenly said "oops, the server did not say a sensible
thing".  Fix it.

Spotted and fixed by Linus, acked by Martin.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-23 16:30:39 -07:00
Martin Langhoff
55cad84299 cvsimport: introduce _fetchfile() method and use a 1M buffer to read()
File retrieval from the socket is now moved to _fetchfile() and we now
cap reads at 1MB. This should limit the memory growth of the cvsimport
process.
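
A minimal Python sketch of the capped-read technique: copy a known number of
bytes from a stream without ever holding more than one buffer's worth in
memory. This is an illustration, not the Perl _fetchfile() itself; the
function name and error handling are assumptions.

```python
MAX_CHUNK = 1024 * 1024  # cap each read at 1MB


def copy_exact(source, sink, size, chunk_size=MAX_CHUNK):
    """Copy exactly size bytes from source to sink in bounded chunks."""
    remaining = size
    while remaining > 0:
        chunk = source.read(min(remaining, chunk_size))
        if not chunk:
            # The stream ended before the announced size was delivered.
            raise EOFError("stream ended with %d bytes missing" % remaining)
        sink.write(chunk)
        remaining -= len(chunk)
    return size
```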

Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-23 01:16:08 -07:00
Jeff King
e73aefe4fd cvsimport: cleanup commit function
This change attempts to clean up the commit function to make it a bit
easier to read (or at least the first half of it). It also improves
robustness and performance. Specifically:
  - report get_headref errors on opening ref unless the error is ENOENT
  - use regex to check for sha1 instead of length
  - use lexically scoped filehandles which get cleaned up automagically
  - check for error on both 'print' and 'close' (since output is buffered)
  - avoid "fork, do some perl, then exec" in commit(). It's not necessary,
    and we probably end up COW'ing parts of the perl process. Plus the code
    is much smaller because we can use open2()
  - avoid calling strftime over and over (mainly a readability cleanup)
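
As an example of the "regex instead of length" point in the list above, a
small Python sketch (the helper name is hypothetical, and the exact pattern
used in the patch may differ):

```python
import re

SHA1_RE = re.compile(r"[0-9a-f]{40}")


def looks_like_sha1(text):
    """True only for exactly 40 lowercase hex digits."""
    return SHA1_RE.fullmatch(text) is not None
```

A length check alone would accept any 40-character string; the pattern also
rejects non-hex content and a trailing newline that was never stripped.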

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-23 00:50:50 -07:00
Jeff King
6a1871e174 cvsimport: use git-update-index --index-info
This should reduce the number of git-update-index forks required per
commit. We now do adds/removes in one call, and we are no longer forced to
deal with argv limitations.
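
The input that "git update-index --index-info" reads on stdin can be built
as in the Python sketch below. It assumes the "mode SP sha1 TAB path" line
form and the convention that mode 0 with an all-zero sha1 removes a path;
the function name and argument shapes are hypothetical.

```python
NULL_SHA1 = "0" * 40


def index_info_lines(added, removed):
    """Build stdin for 'git update-index --index-info'.

    added maps path -> (mode, sha1), for example ("100644", "<40 hex>").
    removed is an iterable of paths to drop from the index.
    """
    lines = []
    for path, (mode, sha1) in sorted(added.items()):
        lines.append("%s %s\t%s\n" % (mode, sha1, path))
    for path in sorted(removed):
        # Mode 0 asks update-index to remove the entry.
        lines.append("0 %s\t%s\n" % (NULL_SHA1, path))
    return "".join(lines)
```

Since everything travels over stdin, a commit touching thousands of files
needs one process and never runs into command-line length limits.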

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-23 00:41:39 -07:00
Linus Torvalds
4adcea995e cvsimport: repack every kilo-commits.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Acked-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-23 00:31:36 -07:00
Martin Langhoff
06918348de cvsimport: introduce -L<imit> option to workaround memory leaks
Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-23 00:29:35 -07:00
Martin Langhoff
c4b16f8d77 cvsimport: replace anonymous sub ref with a normal sub
commit() does not need to be an anonymous subreference. Keep it simple.

Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-22 18:16:44 -07:00
Martin Langhoff
f396f01f11 cvsimport: minor fixups
Cleanup @skipped after it's used. Close a fhandle.
Removing suspects one at a time.

Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-22 18:12:27 -07:00
Elrond
be0c7e0697 git-cvsimport: Handle "Removed" from pserver
Sometimes the pserver says "Removed" instead of "Remove-entry".

Signed-off-by: Elrond <elrond+kernel.org@samba-tng.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-17 22:32:16 -07:00
Johannes Schindelin
42277bc81c cvsimport: use git-update-ref when updating
This simplifies code, and also fixes a subtle bug: when importing in a
shared repository, where another user last imported from CVS, cvsimport
used to complain that it could not open <branch> for update.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-04 17:13:25 -07:00
Junio C Hamano
cb9594e28c cvsimport: fix reading from rev-parse
The updated code reads the tip of the current branch before and
after the import runs, but forgot to chomp what we read from the
command.  The read-tree command did not like them with the trailing
LF.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-03-18 02:05:02 -08:00
Junio C Hamano
8a5f2eac52 cvsimport: honor -i and non -i upon subsequent imports
Documentation says -i is "import only", so without it,
subsequent import should update the current branch and working
tree files in a sensible way.

"A sensible way" defined by this commit is "act as if it is a
git pull from foreign repository which happens to be CVS not
git".  So:

 - If importing into the current branch (note that cvsimport
   requires that the tracking branch be pristine -- you checked out
   the tracking branch but it is your responsibility not to make
   your own commits there), fast forward the branch head and
   match the index and working tree using two-way merge, just
   like "git pull" does.

 - If importing into a separate tracking branch, update that
   branch head, and merge it into your current branch, again,
   just like "git pull" does.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-03-17 14:10:16 -08:00
Matthias Urlichs
a541211ef4 cvsimport: Remove master-updating code
The code which tried to update the master branch was somewhat broken.
=> People should do that manually, with "git merge".

Signed-off-by: Matthias Urlichs <smurf@smurf.noris.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-03-07 17:00:45 -08:00
Junio C Hamano
dd27478f09 cvsimport: avoid open "-|" list form for Perl 5.6
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-20 14:24:06 -08:00
Martin Mares
39ba7d5464 Fix retries in git-cvsimport
Fixed a couple of bugs in recovering from broken connections:

The _line() method now returns undef correctly when the connection
is broken instead of falling off the function and returning garbage.

Retries are now reported to stderr and any partially
downloaded file is discarded instead of being appended to.

The "Server gone away" test has been removed, because it was
reachable only if the garbage return bug bit.

Signed-off-by: Martin Mares <mj@ucw.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-18 16:19:00 -08:00
Martin Langhoff
5179c8a54f cvsimport: Add -S <skipfileregex> support and -v announces files retrieved
A couple of things that seem to help importing broken CVS repos...

 -S '<slash-delimited-regex>' skips files with a matching path
 -v prints file name and version before fetching from cvs

Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-01-29 23:25:38 -08:00
Junio C Hamano
8cd1621149 cvsimport: ease migration from CVSROOT/users format
This fixes a minor bug, which caused the author email to be
doubly enclosed in a <> pair (the code gave enclosing <> to
GIT_AUTHOR_EMAIL and GIT_COMMITTER_EMAIL environment variable).

The read_author_info() subroutine is taught to also understand
the user list in CVSROOT/users format.  This is primarily done
to ease migration for CVS users, who can use the -A option
to read from existing CVSROOT/users file.  write_author_info()
always writes in the git-cvsimport's native format ('='
delimited and value without quotes).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-01-15 21:13:22 -08:00
Andreas Ericsson
ffd97f3a35 git-cvsimport: Add -A <author-conv-file> option
This patch adds the option to specify an author name/email conversion
file in the format

	exon=Andreas Ericsson <ae@op5.se>
	spawn=Simon Pawn <spawn@frog-pond.org>

which will translate the ugly cvs authornames to the more informative
git style.

The info is saved in $GIT_DIR/cvs-authors, so that subsequent
incremental imports will use the same author-info even if no -A
option is specified. If an -A option *is* specified, the info in
$GIT_DIR/cvs-authors is appended/updated appropriately.
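
A Python sketch of parsing the "user=Name <email>" format shown above and of
merging a new -A file into previously saved author info. The function names
and the regular expression are assumptions for illustration; they are not
taken from the patch.

```python
import re

# user=Full Name <email>, with optional surrounding whitespace
AUTHOR_LINE = re.compile(r"^\s*([^=\s]+)\s*=\s*(.*?)\s*<(.*)>\s*$")


def parse_author_map(text):
    """Map CVS user name -> (full name, email); skip unparsable lines."""
    authors = {}
    for line in text.splitlines():
        match = AUTHOR_LINE.match(line)
        if match:
            authors[match.group(1)] = (match.group(2), match.group(3))
    return authors


def merge_author_maps(saved, new):
    """Entries from a new -A file are added to, or replace, saved ones."""
    merged = dict(saved)
    merged.update(new)
    return merged
```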

Docs updated accordingly.

Signed-off-by: Andreas Ericsson <ae@op5.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-01-15 21:13:22 -08:00
Joe English
34c99da2a4 Substitute "/" with $opt_s in tag names as well as branch names
'git cvsimport' changes "/" to "-" (or $opt_s) in branch names,
but not in tag names, which is inconsistent.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-01-06 14:27:47 -08:00
Andreas Ericsson
2c52a42dd7 cvsimport: Don't let local settings hinder cvs user-migration.
Avoid this by passing "--norc" to cvsps.

Signed-off-by: Andreas Ericsson <ae@op5.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-11-25 03:38:18 -08:00
Pavel Roskin
8366a10ab2 symref support for import scripts
Fix git import script not to assume that .git/HEAD is a symlink.

Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-11-16 13:19:18 -08:00
Martin Langhoff
9acb552d98 cvsimport: cvsps should be quiet too
Tell cvsps to be quiet, unless we've been told to be verbose.

Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-11-01 16:57:53 -08:00
Martin Langhoff
211dcac643 cvsimport: introduce -P <cvsps-output-file> option
-P:: <cvsps-output-file>
       Instead of calling cvsps, read the provided cvsps output file. Useful
       for debugging or when cvsps is being handled outside cvsimport.

Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-11-01 16:57:38 -08:00
Martin Langhoff
d44e8cf029 cvsimport: catch error condition where cvs host disappears
Add error handling for cases where the cvs server goes away unexpectedly.
While I don't know why the cvs server is so erratic, we should definitely
exit here before committing bogus files.

Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-11-01 16:57:14 -08:00
Johannes Schindelin
e175768954 Fix cvsimport warning when called without --no-cvs-direct
Perl was warning that $opt_p was undefined in that case.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-18 11:35:16 -07:00
Junio C Hamano
29504118f8 Merge branch 'svn' of http://netz.smurf.noris.de/git/git
[jc: I have my pre-commit hook enabled to catch trailing whitespaces,
 and fixed them up while merging.]

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-16 11:55:35 -07:00
Martin Langhoff
df73e9c62a [PATCH] cvsimport: don't pass --cvs-direct if user options contradict us
Detect whether the user passed --no-cvs-direct and, if so, don't force the
mode. This allows us to support all the protocols that the standard cvs
client supports, at the snail speed you should expect.

This only affects the rlog reading stage.

Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
2005-10-11 21:57:04 -07:00
Matthias Urlichs
89764f5d8b cvsimport: report merge parents
Matching and reporting merge parents happens in a subprocess.
Re-open stdout before redirecting stdout to the pipe, so that printing
verbose messages doesn't go to the wrong place.

Signed-Off-By: Matthias Urlichs <smurf@smurf.noris.de>
2005-10-10 11:15:09 +02:00
Junio C Hamano
94c23343dc Pass CVSps generated A U Thor <author@domain.xz> intact.
Alexey Nezhdanov updated CVSps to generate author-name and
author-email information in its output.

If the input looks like it has that already properly formatted,
use that without our own munging.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-01 23:15:23 -07:00
Junio C Hamano
215a7ad1ef Big tool rename.
As promised, this is the "big tool rename" patch.  The primary differences
since 0.99.6 are:

  (1) git-*-script are no more.  The commands installed do not
      have any such suffix so users do not have to remember if
      something is implemented as a shell script or not.

  (2) Many command names with 'cache' in them are renamed with
      'index' if that is what they mean.

There are backward compatibility symbolic links so that you and
Porcelains can keep using the old names, but the backward
compatibility support is expected to be removed in the near
future.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-07 17:45:20 -07:00