* db/svn-fe-code-purge:
vcs-svn: drop obj_pool
vcs-svn: drop treap
vcs-svn: drop string_pool
vcs-svn: pass paths through to fast-import
Conflicts:
vcs-svn/fast_export.c
vcs-svn/fast_export.h
vcs-svn/repo_tree.c
vcs-svn/repo_tree.h
vcs-svn/string_pool.c
vcs-svn/svndump.c
vcs-svn/trp.txt
This teaches svn-fe to incrementally import into an existing
repository (at last!) at the expense of less convenient UI. Think of
it as growing pains. This opens the door to many excellent things,
and it would be a bad idea to discourage people from building on it
for much longer.
* db/vcs-svn-incremental:
vcs-svn: avoid using ls command twice
vcs-svn: use mark from previous import for parent commit
vcs-svn: handle filenames with dq correctly
vcs-svn: quote paths correctly for ls command
vcs-svn: eliminate repo_tree structure
vcs-svn: add a comment before each commit
vcs-svn: save marks for imported commits
vcs-svn: use higher mark numbers for blobs
vcs-svn: set up channel to read fast-import cat-blob response
Conflicts:
t/t9010-svn-fe.sh
vcs-svn/fast_export.c
vcs-svn/fast_export.h
vcs-svn/repo_tree.c
vcs-svn/svndump.c
Currently there are two functions to retrieve the mode and content
at a path:
const char *repo_read_path(const uint32_t *path);
uint32_t repo_read_mode(const uint32_t *path)
Replace them with a single function with two return values. This
means we can use one round-trip to get the same information from
fast-import that previously took two.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Pass the log message by strbuf instead of as a C-style string and use
fwrite instead of printf to write it to fast-import so embedded '\0'
bytes can be preserved.
Currently "git log" doesn't show the embedded NULs but "git cat-file
commit" can.
While at it, stop including system headers from repo_tree.h. git
source files need to include git-compat-util.h (or cache.h or
builtin.h) sooner to ensure the appropriate feature test macros are
defined.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Now that there is no internal representation of the repo, it is not
necessary to tokenise paths. Use strbuf instead and bypass
string_pool.
This means svn-fe can handle arbitrarily long paths (as long as a
strbuf can fit them), with arbitrarily many path components.
While at it, since we now treat paths in their entirety, only quote
when necessary.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
* db/strbufs-for-metadata:
vcs-svn: use strbuf for author, UUID, and URL
vcs-svn: use strbuf for revision log
Conflicts:
vcs-svn/fast_export.c
vcs-svn/fast_export.h
vcs-svn/repo_tree.c
vcs-svn/svndump.c
Use strbufs and strings instead of interned strings for values of rev,
dump, and node fields that happen to be strings. After this change,
the only remaining string_pool use is for paths in the repo_tree API
and internals.
Functional change: treat an empty author, UUID, or URL as none at all.
So for example, in repos where the first revision has an empty
svn:author property, the first rev will be treated as by "nobody"
rather than by a person with empty name and email address created by
prepending an @ sign to the repository UUID.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Rely on fast-import for information about previous revs.
This requires always setting up backward flow of information, even for
v2 dumps. On the plus side, it simplifies the code by quite a bit and
opens the door to further simplifications.
[db: adjusted to support final version of the cat-blob patch]
[jn: avoiding hard-coding git's name for the empty tree for
portability to other backends]
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Restrict the repo_tree API to functions that are actually needed.
- decouple reading the mode and content of dirents from other
operations.
- remove repo_modify_path. It is only used to read the mode from
dirents.
- remove the ability to use repo_read_mode on a missing path. The
existing code only errors out in that case, anyway.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
The repo_tree structure remembers, for each path in each revision, a
mode (regular file, executable, symlink, or directory) and content
(blob mark or directory structure). Maintaining a second copy of all
this information when it's already in the target repository is
wasteful, it does not persist between svn-fe invocations, and most
importantly, there is no convenient way to transfer it from one
machine to another. So it would be nice to get rid of it.
As a first step, let's change the repo_tree API to match fast-import's
read commands more closely. Currently to read the mode for a path,
one uses
repo_modify_path(path, new_mode, new_content);
which changes the mode and content as a side effect. There is no
function to read the content at a path; add one.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
There are two functions to change the staged content for a path in the
svn importer's active commit: repo_replace, which changes the text and
returns the mode, and repo_modify, which changes the text and mode and
returns nothing.
Worse, there are more subtle differences:
- A mark of 0 passed to repo_modify means "use the existing content".
repo_replace uses it as mark :0 and produces a corrupt stream.
- When passed a path that is not part of the active commit,
repo_replace returns without doing anything. repo_modify
transparently adds a new directory entry.
Get rid of both and introduce a new function with the best features of
both: repo_modify_path modifies the mode, content, or both for a path,
depending on which arguments are zero. If no such dirent already
exists, it does nothing and reports the error by returning 0.
Otherwise, the return value is the resulting mode.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
repo_tree maintains the exporter's state and provides a facility to to
call fast_export, which writes objects to stdout suitable for
consumption by fast-import.
The exported functions roughly correspond to Subversion FS operations.
. repo_add, repo_modify, repo_copy, repo_replace, and repo_delete
update the current commit, based roughly on the corresponding
Subversion FS operation.
. repo_commit calls out to fast_export to write the current commit to
the fast-import stream in stdout.
. repo_diff is used by the fast_export module to write the changes
for a commit.
. repo_reset erases the exporter's state, so valgrind can be happy.
[rr: squelched compiler warnings]
[jn: removed support for maintaining state on-disk, though we may
want to add it back later]
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>