Some source material (e.g. Subversion dump files) perform directory
renames without telling us exactly which files in that subdirectory
were moved. This makes it hard for a frontend to convert such data
formats to a fast-import stream, as all the frontend has on hand
is "Rename a/ to b/" with no details about what files are in a/,
unless the frontend also kept track of all files.
The new 'R' subcommand within a commit allows the frontend to
rename either a file or an entire subdirectory, without needing to
know the object's SHA-1 or the specific files contained within it.
The rename is performed as efficiently as possible internally,
making it cheaper than a 'D'/'M' pair for a file rename.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When e8438420bb allowed us to reload
the marks table on subsequent runs of fast-import we really broke
things, as we set pack_id to MAX_PACK_ID for any objects we imported
into the marks table. Creating a branch from that mark should fail
as we attempt to read the object through a non-existant packed_git
pointer. Instead we have to use the normal Git object system to
locate the older commit, as we ourselves do not have a reference
to the packed_git it resides in.
This bug only occurred because t9300 was not complete enough.
When we added the --import-marks feature we didn't actually test
its implementation enough to verify the function worked as intended.
I have corrected that, and included the changes as part of this fix.
Prior versions of fast-import fail the new test(s); this commit
allows them to pass.
Credit for this bug find goes to Simon Hausmann <simon@lst.de> as
he recently identified a similiar bug in the tree lazy-loading path.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The Git tree sorting convention is more complex than just the name,
it needs to include the mode too to make sure trees sort as though
their name ends with "/".
This is a simple test case that verifies fast-import keeps the tree
ordering correct after editing the same tree twice in a single
input stream. A recent proposed patch series (that has not yet
been applied) will cause this test to fail, due to a bug in the
way the series handles sorting within the trees.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* js/diff-ni:
Get rid of the dependency to GNU diff in the tests
diff --no-index: support /dev/null as filename
diff-ni: fix the diff with standard input
diff: support reading a file from stdin via "-"
I'm giving fast-import a lesson on how to reload the marks table
using the same format it outputs with --export-marks. This way
a frontend can reload the marks table from a prior import, making
incremental imports less painful.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Now that "git diff" handles stdin and relative paths outside the
working tree correctly, we can convert all instances of "diff -u"
to "git diff".
This commit is really the result of
$ perl -pi.bak -e 's/diff -u/git diff/' $(git grep -l "diff -u" t/)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from commit c699a40d68215c7e44a5b26117a35c8a56fbd387)
It was suggested on the mailing list that being able to use `from`
in any commit to reset the current branch is useful in some types of
importers, such as a darcs importer.
We originally did not permit resetting an existing branch with a
new `from` command during a `commit` command, but this restriction
was only to help debug the hacked up cvs2svn that Jon Smirl was
developing in parallel with git-fast-import. It is probably more
of a problem to disallow it than to allow it. So now we permit a
`from` during any `commit`.
While making the changes required to permit multiple `from`
commands on the same branch, I discovered we no longer needed the
last_commit field to be set to 0 during a reset, so that was removed.
(Reset was originally setting the field to 0 to signal cmd_from()
that it was OK to execute on the branch.)
While poking around in this section of fast-import I also realized
the `reset` command was not working as intended if the corresponding
`from` command was omitted (as allowed by the BNF grammar and the
code). If `from` was omitted we cleared out the tree but we left
the tree SHA-1 and parent commit SHA-1 intact. This is not what
the user intended in this case. Instead they would be trying to
reset the branch to have no parent and to have no tree, making the
branch look new-born during the next commit. We now clear these
SHA-1 values during `reset`, ensuring the branch looks new-born if
`from` does not get supplied.
New test cases for these were also added.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Most users don't need the pack boundary information that fast-import
was printing to standard output, especially if they were calling
it with --quiet.
Those users who do want this information probably want it captured
so they can go back and use it to repack the imported repository.
So dumping the boundary commits to a log file makes more sense then
printing them to standard output.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Some frontends may not be able to (easily) keep track of which files
are included in the branch, and which aren't. Performing this
tracking can be tedious and error prone for the frontend to do,
especially if its foreign data source cannot supply the changed
path list on a per-commit basis.
fast-import now allows a frontend to request that a branch's tree
be wiped clean (reset to the empty tree) at the start of a commit,
allowing the frontend to feed in all paths which belong on the branch.
This is ideal for a tar-file importer frontend, for example, as
the frontend just needs to reformat the tar data stream into a gfi
data stream, which may be something a few Perl regexps can take
care of. :)
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If fast-import is being used to update an existing branch of
a repository, the user may not want to lose commits if another
process updates the same ref at the same time. For example, the
user might be using fast-import to make just one or two commits
against a live branch.
We now perform a fast-forward check during the ref updating process.
If updating a branch would cause commits in that branch to be lost,
we skip over it and display the new SHA1 to standard error.
This new default behavior can be overridden with `--force`, like
git-push and git-fetch.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Since some frontends may be working with source material where
the dates are only readily available as RFC 2822 strings, it is
more friendly if fast-import exposes Git's parse_date() function
to handle the conversion. This way the frontend doesn't need
to perform the parsing itself.
The new --date-format option to fast-import can be used by a
frontend to select which format it will supply date strings in.
The default is the standard `raw` Git format, which fast-import
has always supported. Format rfc2822 can be used to activate the
parse_date() function instead.
Because fast-import could also be useful for creating new, current
commits, the format `now` is also supported to generate the current
system timestamp. The implementation of `now` is a trivial call
to datestamp(), but is actually a whole whopping 3 lines so that
fast-import can verify the frontend really meant `now`.
As part of this change I have added validation of the `raw` date
format. Prior to this change fast-import would accept anything
in a `committer` command, even if it was seriously malformed.
Now fast-import requires the '> ' near the end of the string and
verifies the timestamp is formatted properly.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Its very annoying to need to specify the file content ahead of a
commit and use marks to connect the individual blobs to the commit's
file modification entry, especially if the frontend can't/won't
generate the blob SHA1s itself. Instead it would much easier to
use if we can accept the blob data at the same time as we receive
each file_change line.
Now fast-import accepts 'inline' instead of a mark idnum or blob
SHA1 within the 'M' type file_change command. If an inline is
detected the very next line must be a 'data n' command, supplying
the file data.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
It is error prone to list the value of each file twice, instead we
should list the value only once early in the script and reuse the
shell variable when we need to access it.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Now that its easier to craft test cases (thanks to 'data <<')
we should start to verify fast-import works as expected.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>