Commit Graph

67 Commits

Author SHA1 Message Date
Jonathan Nieder
2a48afe1c2 vcs-svn: Split off function for handling of individual properties
The handle_property function is the part of read_props that would be
interesting for most people: semantics of properties rather than the
algorithm for parsing them.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:53:58 -08:00
Jonathan Nieder
3f3e676d6e vcs-svn: Make source easier to read on small screens
Remove some newlines from handle_node() that are not needed for
clarity.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:53:58 -08:00
Jonathan Nieder
c7dbf35e91 vcs-svn: More dump format sanity checks
Node-action: change is not appropriate when switching between file and
directory or adding a new file.  Current svn-fe silently accepts such
nodes and the resulting tree has missing files in the "changed when
meant to add" case.

Node-action: add requires some content (text or directory); there is
no such thing as an "intent to add" node in svn dumps.  Current svn-fe
accepts such contentless adds but produces an invalid fast-import
stream that refers to nonexistent mark :0 in response.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:52:51 -08:00
Jonathan Nieder
414e569e45 vcs-svn: Reject path nodes without Node-action
It would be better to flag such errors and let the import proceed
anyway, but for now it is simpler not to worry about recovery
from such weird cases.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:52:47 -08:00
Jonathan Nieder
1c7bb31616 vcs-svn: Delay read of per-path properties
The mode for each file in an svn-format dump is kept in the properties
section.  The properties section is read as soon as possible to allow
the correct mode to be filled in when registering the file with the
repo_tree lib.

To support nodes with a missing properties section, svn-fe determines
the mode in three stages:

 - The kind (directory or file) of the node is read from the dump and
   used to make an initial estimate (040000 or 100644).
 - Properties are read in and allowed to override this for symlinks
   and executables.
 - If there is no properties section, the mode from the previous
   content of the path is left alone, overriding the above
   considerations.

This is a bit of a mess, and worse, it would get even more complicated
once we start to support property deltas.  If we could only register
the file with a provisional value for mode and then change it later
when properties say so, the procedure would be much simpler.

... oh, right, we can.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:51:43 -08:00
Jonathan Nieder
08c39b5c44 vcs-svn: Combine repo_replace and repo_modify functions
There are two functions to change the staged content for a path in the
svn importer's active commit: repo_replace, which changes the text and
returns the mode, and repo_modify, which changes the text and mode and
returns nothing.

Worse, there are more subtle differences:

 - A mark of 0 passed to repo_modify means "use the existing content".
   repo_replace uses it as mark :0 and produces a corrupt stream.

 - When passed a path that is not part of the active commit,
   repo_replace returns without doing anything.  repo_modify
   transparently adds a new directory entry.

Get rid of both and introduce a new function with the best features of
both: repo_modify_path modifies the mode, content, or both for a path,
depending on which arguments are zero.  If no such dirent already
exists, it does nothing and reports the error by returning 0.
Otherwise, the return value is the resulting mode.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:51:43 -08:00
Jonathan Nieder
6ee4a9be48 vcs-svn: Replace = Delete + Add
Simplify by reducing the "Node-action: replace" case to "Node-action:
add".  This way, the main part of handle_node() only has to deal with
"add" and "change" nodes.

Functional change: replacing a symlink or executable without setting
properties will reset the mode.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:51:43 -08:00
Jonathan Nieder
5af8fae2df vcs-svn: handle_node: Handle deletion case early
Take care of "Node-action: delete" as soon as possible, so we can stop
worrying about that case in the rest of the function.

Functional change: catch deletion nodes with features that would not
apply to them (text, properties, or origin data) and error out for
those cases.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:51:43 -08:00
Jonathan Nieder
462e1f51a5 vcs-svn: Use mark to indicate nodes with included text
Allocate a mark if needed as soon as possible so later code can use
"if (mark)" to check if this node has text attached rather than
explicitly checking for Text-content-length.

While at it, reject directory nodes with text attached; the presence
of such a node would indicate a bug in the dump generator or svn-fe's
understanding.  In the long term, it would be nice to be able to
continue parsing and save the error for later, but for now it is
simpler to error out right away.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:51:43 -08:00
Jonathan Nieder
d6e81a0315 vcs-svn: Unclutter handle_node by introducing have_props var
It is possible for a path node in an SVN-format dump file to leave out
the properties section.  svn-fe handles this by carrying over the
properties (in particular, file type) from the old version of that
node.

To support this, handle_node tests several times whether a
Prop-content-length field is present.  Ancient Subversion actually
leaves out the Prop-content-length field even for nodes with
properties, so that's not quite the right check.  Besides, this detail
of mechanism is distracting when the question at hand is instead what
content the new node should have.

So introduce a local have_props variable.  The semantics are the same
as before; the adaptations to support ancient streams that leave out
the prop-content-length can wait until someone needs them.

Signed-off-by: Jonathan Nieder <jrnieer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:51:42 -08:00
Jonathan Nieder
da3e217447 vcs-svn: Eliminate node_ctx.mark global
The mark variable is only used in handle_node().  Its life is
very short and simple: first, a new mark number is allocated if
this node has text attached, then that mark is recorded in the
in-core tree being built up, and lastly the mark is communicated
to fast-import in the stream along with the associated text.

A new reader may worry about interaction with other code, especially
since mark is not initialized to zero in handle_node() itself.
Disperse such worries by making it local.  No functional change
intended.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:51:42 -08:00
Jonathan Nieder
1d13e9f600 vcs-svn: Eliminate node_ctx.srcRev global
The srcRev variable is only used in handle_node(); its purpose
is to hold the old mode for a path, to only be used if properties
are not being changed.  Narrow its scope to make its meaningful
lifetime more obvious.

No functional change intended.  Add some tests as a sanity-check
for the simplest case (no renames).

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:51:42 -08:00
Jonathan Nieder
5c28a8b054 vcs-svn: Check for errors from open()
test-svn-fe segfaults when passed a bogus path.  Simplify debugging by
exiting with a meaningful error message instead.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:51:42 -08:00
David Barr
1f05d07c45 vcs-svn: Allow simple v3 dumps (no deltas yet)
Since the dumpfile version 1 days, the Subversion dump format
gained some new fields:

 - a unique identifier for the repository (version 2 format)
 - whether the text and properties for a node should be
   interpreted as deltas
 - checksums for a delta's preimage
 - SHA-1 sums as alternatives to the existing MD5 checksums for
   copy source and the payload (delta).

For now what is relevant to us is the Text-delta and Prop-delta
fields, since not noticing these causes a dump file to be
misinterpreted (see the previous commit).

[jn: with tests]

Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:48:54 -08:00
Jonathan Nieder
b3e5bce1aa vcs-svn: Error out for v3 dumps
By ignoring the Text-Delta and Prop-Delta node fields, current svn-fe
happily mistakes deltas for full text and instead of cleanly erroring
out, it produces a valid but semantically bogus fast-import stream
when fed a dump file in the modern "svnadmin dump --deltas" format.

Dump file parsers are supposed to ignore header fields they don't
understand (to allow for backward-compatible extensions), but they are
also supposed to check the SVN-fs-dump-format-version header to
prevent misinterpretation of non backward-compatible extensions.
Do so.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 14:48:52 -08:00
Ramsay Jones
5418d96ddc vcs-svn: Fix some printf format compiler warnings
In particular, on systems that define uint32_t as an unsigned long,
gcc complains as follows:

      CC vcs-svn/fast_export.o
  vcs-svn/fast_export.c: In function `fast_export_modify':
  vcs-svn/fast_export.c:28: warning: unsigned int format, uint32_t arg (arg 2)
  vcs-svn/fast_export.c:28: warning: int format, uint32_t arg (arg 3)
  vcs-svn/fast_export.c: In function `fast_export_commit':
  vcs-svn/fast_export.c:42: warning: int format, uint32_t arg (arg 5)
  vcs-svn/fast_export.c:62: warning: int format, uint32_t arg (arg 2)
  vcs-svn/fast_export.c: In function `fast_export_blob':
  vcs-svn/fast_export.c:72: warning: int format, uint32_t arg (arg 2)
  vcs-svn/fast_export.c:72: warning: int format, uint32_t arg (arg 3)
      CC vcs-svn/svndump.o
  vcs-svn/svndump.c: In function `svndump_read':
  vcs-svn/svndump.c:260: warning: int format, uint32_t arg (arg 3)

In order to suppress the warnings we use the C99 format specifier
macros PRIo32 and PRIu32 from <inttypes.h>.

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-12 10:24:55 -07:00
David Barr
21746aa34f SVN dump parser
svndump parses data that is in SVN dumpfile format produced by
`svnadmin dump` with the help of line_buffer and uses repo_tree and
fast_export to emit a git fast-import stream.

Based roughly on com.hydrografix.svndump 0.92 from the SvnToCCase
project at <http://svn2cc.sarovar.org/>, by Stefan Hegny and
others.

[rr: allow input from files other than stdin]
[jn: with test, more error reporting]

Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-14 19:35:38 -07:00