Add a 'done' command that causes fast-import to stop reading from the
stream and exit.
If the new --done command line flag was passed on the command line
(or a "feature done" declaration included at the start of the stream),
make the 'done' command mandatory. So "git fast-import --done"'s
input format will be prefix-free, making errors easier to detect when
they show up as early termination at some convenient time of the
upstream of a pipe writing to fast-import.
Another possible application of the 'done' command would to be allow a
fast-import stream that is only a small part of a larger encapsulating
stream to be easily parsed, leaving the file offset after the "done\n"
so the other application can pick up from there. This patch does not
teach fast-import to do that --- fast-import still uses buffered input
(stdio).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The SYNOPSIS sections of most commands that span several lines already
use [verse] to retain line breaks. Most commands that don't span
several lines seem not to use [verse]. In the HTML output, [verse]
does not only preserve line breaks, but also makes the section
indented, which causes a slight inconsistency between commands that
use [verse] and those that don't. Use [verse] in all SYNOPSIS sections
for consistency.
Also remove the blank lines from git-fetch.txt and git-rebase.txt to
align with the other man pages. In the case of git-rebase.txt, which
already uses [verse], the blank line makes the [verse] not apply to
the last line, so removing the blank line also makes the formatting
within the document more consistent.
While at it, add single quotes to 'git cvsimport' for consistency with
other commands.
Signed-off-by: Martin von Zweigbergk <martin.von.zweigbergk@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove spurious "=" after --relative-marks.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The point of these sections is generally to:
1. Give credit where it is due.
2. Give the reader an idea of where to ask questions or
file bug reports.
But they don't do a good job of either case. For (1), they
are out of date and incomplete. A much more accurate answer
can be gotten through shortlog or blame. For (2), the
correct contact point is generally git@vger, and even if you
wanted to cc the contact point, the out-of-date and
incomplete fields mean you're likely sending to somebody
useless.
So let's drop the fields entirely from all manpages except
git(1) itself. We already point people to the mailing list
for bug reports there, and we can update the Authors section
to give credit to the major contributors and point to
shortlog and blame for more information.
Each page has a "This is part of git" footer, so people can
follow that to the main git manpage.
Lazy fast-import frontend authors that want to rely on the backend to
keep track of the content of the imported trees _almost_ have what
they need in the 'cat-blob' command (v1.7.4-rc0~30^2~3, 2010-11-28).
But it is not quite enough, since
(1) cat-blob can be used to retrieve the content of files, but
not their mode, and
(2) using cat-blob requires the frontend to keep track of a name
(mark number or object id) for each blob to be retrieved
Introduce an 'ls' command to complement cat-blob and take care of the
remaining needs. The 'ls' command finds what is at a given path
within a given tree-ish (tag, commit, or tree):
'ls' SP <dataref> SP <path> LF
or in fast-import's active commit:
'ls' SP <path> LF
The response is a single line sent through the cat-blob channel,
imitating ls-tree output. So for example:
FE> ls :1 Documentation
gfi> 040000 tree 9e6c2b599341d28a2a375f8207507e0a2a627fe9 Documentation
FE> ls 9e6c2b599341d28a2a375f8207507e0a2a627fe9 git-fast-import.txt
gfi> 100644 blob 4f92954396e3f0f97e75b6838a5635b583708870 git-fast-import.txt
FE> ls :1 RelNotes
gfi> 120000 blob b942e49944 RelNotes
FE> cat-blob b942e49944
gfi> b942e49944 blob 32
gfi> Documentation/RelNotes/1.7.4.txt
The most interesting parts of the reply are the first word, which is
a 6-digit octal mode (regular file, executable, symlink, directory,
or submodule), and the part from the second space to the tab, which is
a <dataref> that can be used in later cat-blob, ls, and filemodify (M)
commands to refer to the content (blob, tree, or commit) at that path.
If there is nothing there, the response is "missing some/path".
The intent is for this command to be used to read files from the
active commit, so a frontend can apply patches to them, and to copy
files and directories from previous revisions.
For example, proposed updates to svn-fe use this command in place of
its internal representation of the repository directory structure.
This simplifies the frontend a great deal and means support for
resuming an import in a separate fast-import run (i.e., incremental
import) is basically free.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Sverre Rabbelier <srabbelier@gmail.com>
Here is a 'feature' command for streams to use to require support for
the notemodify (N) command.
When the 'feature' facility was introduced (v1.7.0-rc0~95^2~4,
2009-12-04), the notes import feature was old news (v1.6.6-rc0~21^2~8,
2009-10-09) and it was not obvious it deserved to be a named feature.
But now that is clear, since all major non-git fast-import backends
lack support for it.
Details: on git version with this patch applied, any "feature notes"
command in the features/options section at the beginning of a stream
will be treated as a no-op. On fast-import implementations without
the feature (and older git versions), the command instead errors out
with a message like
This version of fast-import does not support feature notes.
So by declaring use of notes at the beginning of a stream, frontends
can avoid wasting time and other resources when the backend does not
support notes. (This would be especially important for backends that
do not support rewinding history after a botched import.)
Improved-by: Thomas Rast <trast@student.ethz.ch>
Improved-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "feature" command allows streams to specify options for the import
that must not be ignored. Logically, they are part of the stream,
even though technically most supported features are synonyms to
command-line options.
Make this more obvious by being more explicit about how the analogy
between most "feature" commands and command-line options works. Treat
the feature (import-marks) that does not fit this analogy separately.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Omit needless words ("Additionally ... <path> may also" is redundant).
While at it, place the explanation of this special case after the
general rules for paths to provide the reader with some context.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a frontend uses a marks file to ensure its state persists between
runs, it may represent "clean slate" when bootstrapping with "no marks
yet". In such a case, feeding the last state with --import-marks and
saving the state after the current run with --export-marks would be a
natural thing to do.
The --import-marks option however errors out when the specified marks file
doesn't exist; this makes bootstrapping a bit difficult. The location of
the marks file becomes backend-dependent when --relative-marks is in
effect, and the frontend cannot check for the existence of the file in
such a case.
The --import-marks-if-exists option does the same thing as --import-marks
but does not flag an error if the named file does not exist yet to help
these frontends.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jn/fast-import-blob-access:
t9300: avoid short reads from dd
t9300: remove unnecessary use of /dev/stdin
fast-import: Allow cat-blob requests at arbitrary points in stream
fast-import: let importers retrieve blobs
fast-import: clarify documentation of "feature" command
fast-import: stricter parsing of integer options
Conflicts:
fast-import.c
The new rule: a "cat-blob" can be inserted wherever a comment is
allowed, which means at the start of any line except in the middle of
a "data" command.
This saves frontends from having to loop over everything they want to
commit in the next commit and cat-ing the necessary objects in
advance.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
New objects written by fast-import are not available immediately.
Until a checkpoint has been started and finishes writing the pack
index, any new blobs will not be accessible using standard git tools.
So introduce a new way to access them: a "cat-blob" command in the
command stream requests for fast-import to print a blob to stdout or a
file descriptor specified by the argument to --cat-blob-fd. The value
for cat-blob-fd cannot be specified in the stream because that would
be a layering violation: the decision of where to direct a stream has
to be made when fast-import is started anyway, so we might as well
make the stream format is independent of that detail.
Output uses the same format as "git cat-file --batch".
Thanks to Sverre Rabbelier and Sam Vilain for guidance in designing
the protocol.
Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "feature" command allows streams to specify options for the import
that must not be ignored. Logically, they are part of the stream,
even though technically most supported features are synonyms to
command-line options.
Make this more obvious by being more explicit about how the analogy
between most "feature" commands and command-line options works. Treat
the feature (import-marks) that does not fit this analogy separately.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It can be tedious to wait for a multi-million-revision import.
Unfortunately it is hard to spy on the import because fast-import
works by continuously streaming out objects, without updating the pack
index or refs until a checkpoint command or the end of the stream.
So allow the impatient operator to request checkpoints by sending a
signal, like so:
killall -USR1 git-fast-import
When receiving such a signal, fast-import would schedule a checkpoint
to take place after the current top-level command (usually a "commit"
or "blob" request) finishes.
Caveats: just like ordinary checkpoint commands, such requests slow
down the import. Switching to a new pack at a suboptimal moment is
also likely to result in a less dense initial collection of packs.
That's the price.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Better advice on using topic branches for kernel development
Documentation: update implicit "--no-index" behavior in "git diff"
Documentation: expand 'git diff' SEE ALSO section
Documentation: diff can compare blobs
Documentation: gitrevisions is in section 7
shell portability: no "export VAR=VAL"
CodingGuidelines: reword parameter expansion section
Documentation: update-index: -z applies also to --index-info
Documentation: No argument of ALLOC_GROW should have side-effects
Fix references to gitrevisions(1) in the manual pages and HTML
documentation.
In practice, this will not matter much unless someone tries to use a
hard copy of the git reference manual.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
v1.7.3-rc0~75^2 (Teach fast-import to import subtrees named by tree id,
2010-06-30) has a shortcoming - it doesn't allow the root to be set.
Extend this behaviour by allowing the root to be referenced as the
empty path, "".
For a command (like filter-branch --subdirectory-filter) that wants
to commit a lot of trees that already exist in the object db, writing
undeltified objects as loose files only to repack them later can
involve a significant amount of overhead.
(23% slow-down observed on Linux 2.6.35, worse on Mac OS X 10.6)
Fortunately we have fast-import (which is one of the only git commands
that will write to a pack directly) but there is not an advertised way
to tell fast-import to commit a given tree without unpacking it.
This patch changes that, by allowing
M 040000 <tree id> ""
as a filemodify line in a commit to reset to a particular tree without
any need to parse it. For example,
M 040000 4b825dc642 ""
is a synonym for the deleteall command and the fast-import equivalent of
git read-tree 4b825dc642
Signed-off-by: David Barr <david.barr@cordelta.com>
Commit-message-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Sverre Rabbelier <srabbelier@gmail.com>
Tested-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, whenever we need documentation for revisions and ranges, we
link to the git-rev-parse man page, i.e. a plumbing man page, which has
this along with the documentation of all rev-parse modes.
Link to the new gitrevisions man page instead in all cases except
- when the actual git-rev-parse command is referred to or
- in very technical context (git-send-pack).
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To simulate the svn cp command, it would be very useful to be
replace an arbitrary file in the current revision by an
arbitrary directory from a previous one. Modify the filemodify
command to allow that:
M 040000 <tree id> pathname
This would be most useful in combination with a facility to
print the commit ids for new revisions as they are written.
Cc: Shawn O. Pearce <spearce@spearce.org>
Cc: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that fast-import is creating packs with index version 2, there is
no point limiting the pack size by default. A pack split will still
happen if off_t is not sufficiently large to hold large offsets.
While updating the doc, let's remove the "packfiles fit on CDs"
suggestion. Pack files created by fast-import are still suboptimal and
a 'git repack -a -f -d' or even 'git gc --aggressive' would be a pretty
good idea before considering storage on CDs.
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar in spirit to 07cf0f2 (make --max-pack-size argument to 'git
pack-object' count in bytes, 2010-02-03) which made the option by the same
name to pack-objects, this counts the pack size limit in bytes.
In order not to cause havoc with people used to the previous megabyte
scale an integer smaller than 8192 is interpreted in megabytes but the
user gets a warning. Also a minimum size of 1 MiB is enforced to avoid an
explosion of pack files.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
* sp/maint-fast-import-large-blob:
fast-import: Stream very large blobs directly to pack
bash: don't offer remote transport helpers as subcommands
Conflicts:
fast-import.c
If a blob is larger than the configured big-file-threshold, instead
of reading it into a single buffer obtained from malloc, stream it
onto the end of the current pack file. Streaming the larger objects
into the pack avoids the 4+ GiB memory footprint that occurs when
fast-import is processing 2+ GiB blobs.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* remotes/trast-doc/for-next:
Documentation: spell 'git cmd' without dash throughout
Documentation: format full commands in typewriter font
Documentation: warn prominently against merging with dirty trees
Documentation/git-merge: reword references to "remote" and "pull"
Conflicts:
Documentation/config.txt
Documentation/git-config.txt
Documentation/git-merge.txt
* sr/gfi-options:
fast-import: add (non-)relative-marks feature
fast-import: allow for multiple --import-marks= arguments
fast-import: test the new option command
fast-import: add option command
fast-import: add feature command
fast-import: put marks reading in its own function
fast-import: put option parsing code in separate functions
The documentation was quite inconsistent when spelling 'git cmd' if it
only refers to the program, not to some specific invocation syntax:
both 'git-cmd' and 'git cmd' spellings exist.
The current trend goes towards dashless forms, and there is precedent
in 647ac70 (git-svn.txt: stop using dash-form of commands.,
2009-07-07) to actively eliminate the dashed variants.
Replace 'git-cmd' with 'git cmd' throughout, except where git-shell,
git-cvsserver, git-upload-pack, git-receive-pack, and
git-upload-archive are concerned, because those really live in the
$PATH.
The fast-import parser does not validate that the author, committer
or tagger name component contains both a name and an email address.
Therefore the name component has always been optional. Correct the
documentation to match the implementation.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After specifying 'feature relative-marks' the paths specified with
'feature import-marks' and 'feature export-marks' are relative to an
internal directory in the current repository.
In git-fast-import this means that the paths are relative to the
'.git/info/fast-import' directory. However, other importers may use a
different location.
Add 'feature non-relative-marks' to disable this behavior, this way
it is possible to, for example, specify the import-marks location as
relative, and the export-marks location as non-relative.
Also add tests to verify this behavior.
Cc: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --import-marks= option may be specified multiple times on the
commandline and should result in all marks being read in. Only one
import-marks feature may be specified in the stream, which is
overriden by any --import-marks= commandline options.
If one wishes to specify import-marks files in addition to the one
specified in the stream, it is easy to repeat the stream option as a
--import-marks= commandline option.
Also verify this behavior with tests.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows the frontend to specify any of the supported options as
long as no non-option command has been given. This way the
user does not have to include any frontend-specific options, but
instead she can rely on the frontend to tell fast-import what it
needs.
Also factor out parsing of argv and have it execute when we reach the
first non-option command, or after all commands have been read and
no non-option command has been encountered.
Non-git options are ignored, unrecognised options result in an error.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows the fronted to require a specific feature to be supported
by the backend, or abort.
Also add support for four initial feature, date-format=, force=,
import-marks=, export-marks=.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand
is similar to 'filemodify', except that no mode is supplied (all notes have
mode 0644), and the path is set to the hex SHA1 of the given "comittish".
This enables fast import of note objects along with their associated commits,
since the notes can now be named using the mark references of their
corresponding commits.
The patch also includes a test case of the added functionality.
Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently fast-import/export cannot be used for
repositories with submodules. This patch extends
the relevant programs to make them correctly
process gitlinks.
Links can be represented by two forms of the
Modify command:
M 160000 SHA1 some/path
which sets the link target explicitly, or
M 160000 :mark some/path
where the mark refers to a commit. The latter
form can be used by importing tools to build
all submodules simultaneously in one physical
repository, and then simply fetch them apart.
Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The names of git commands are not meant to be entered at the
commandline; they are just names. So we render them in italics,
as is usual for command names in manpages.
Using
doit () {
perl -e 'for (<>) { s/\`(git-[^\`.]*)\`/'\''\1'\''/g; print }'
}
for i in git*.txt config.txt diff*.txt blame*.txt fetch*.txt i18n.txt \
merge*.txt pretty*.txt pull*.txt rev*.txt urls*.txt
do
doit <"$i" >"$i+" && mv "$i+" "$i"
done
git diff
.
Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Following what appears to be the predominant style, format
names of commands and commandlines both as `teletype text`.
While we're at it, add articles ("a" and "the") in some
places, italicize the name of the command in the manual page
synopsis line, and add a comma or two where it seems appropriate.
Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the git-* commands are not installed in $(bindir), using
"git-command <parameters>" in examples in the documentation is
not a good idea. On the other hand, it is nice to be able to
refer to each command using one hyphenated word. (There is no
escaping it, anyway: man page names cannot have spaces in them.)
This patch retains the dash in naming an operation, command,
program, process, or action. Complete command lines that can
be entered at a shell (i.e., without options omitted) are
made to use the dashless form.
The changes consist only of replacing some spaces with hyphens
and vice versa. After a "s/ /-/g", the unpatched and patched
versions are identical.
Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As the "git" man page describes the "git" command at the end-user
level, it seems better to move it to man section 1.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fast-import documentation currently does not document the behaviour
of "merge" when there is no "from" in a commit. This patch adds a
description of what happens: the commit is created with a parent, but
no files. This behaviour is equivalent to "from" followed by
"filedeleteall".
Signed-off-by: Eyvind Bernhardsen <eyvind-git@orakel.ntnu.no>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recent versions of fast-import will now dump information out upon
crashing, making it possible for the frontend developer to review
some state information and possibly restart the import from the
point where it crashed.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Between AsciiDoc 8.2.2 and 8.2.3, the following change was made to the stock
Asciidoc configuration:
@@ -149,7 +153,10 @@
# Inline macros.
# Backslash prefix required for escape processing.
# (?s) re flag for line spanning.
-(?su)[\\]?(?P<name>\w(\w|-)*?):(?P<target>\S*?)(\[(?P<attrlist>.*?)\])=
+
+# Explicit so they can be nested.
+(?su)[\\]?(?P<name>(http|https|ftp|file|mailto|callto|image|link)):(?P<target>\S*?)(\[(?P<attrlist>.*?)\])=
+
# Anchor: [[[id]]]. Bibliographic anchor.
(?su)[\\]?\[\[\[(?P<attrlist>[\w][\w-]*?)\]\]\]=anchor3
# Anchor: [[id,xreflabel]]
This default regex now matches explicit values, and unfortunately in this
case gitlink was being matched by just 'link', causing the wrong inline
macro template to be applied. By renaming the macro, we can avoid being
matched by the wrong regex.
Signed-off-by: Dan McGee <dpmcgee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing checkpoint command is very useful to force fast-import
to dump the branches out to disk so that standard Git tools can
access them and the objects they refer to. However there was not a
way to know when fast-import had finished executing the checkpoint
and it was safe to read those refs.
The progress command can be used to make fast-import output any
message of the frontend's choosing to standard out. The frontend
can scan for these messages using select() or poll() to monitor a
pipe connected to the standard output of fast-import.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
For the same reasons as the prior change we want to allow frontends
to omit the trailing LF that usually delimits commands. In some
cases these just make the input stream more verbose looking than
it needs to be, and its just simpler for the frontend developer to
get started if our parser is slightly more lenient about where an
LF is required and where it isn't.
To make this optional LF feature work we now have to buffer up to one
line of input in command_buf. This buffering can happen if we look
at the current input command but don't recognize it at this point
in the code. In such a case we need to "unget" the entire line,
but we cannot depend upon the stdio library to let us do ungetc()
for that many characters at once.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
A few fast-import frontend developers have found it odd that we
require the LF following a `data` command, especially in the exact
byte count format. Technically we don't need this LF to parse
the stream properly, but having it here does make the stream more
readable to humans. We can easily make the LF optional by peeking
at the next byte available from the stream and pushing it back into
the buffer if its not LF.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Several frontend developers have asked that some form of stream
comments be permitted within a fast-import data stream. This way
they can include information from their own frontend program about
where specific data was taken from in the source system, or about
a decision that their frontend may have made while creating the
fast-import data stream.
This change introduces comments in the Bourne-shell/Tcl/Perl style.
Lines starting with '#' are ignored, up to and including the LF.
Unlike the above mentioned three languages however we do not look for
and ignore leading whitespace. This just simplifies the definition
of the comment format and the code that parses them.
To make comments work we had to stop using read_next_command() within
cmd_data() and directly invoke read_line() during the inline variant
of the function. This is necessary to retain any lines of the
input data that might otherwise look like a comment to fast-import.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Some source material (e.g. Subversion dump files) perform directory
renames by telling us the directory was copied, then deleted in the
same revision. This makes it difficult for a frontend to convert
such data formats to a fast-import stream, as all the frontend has
on hand is "Copy a/ to b/; Delete a/" with no details about what
files are in a/, unless the frontend also kept track of all files.
The new 'C' subcommand within a commit allows the frontend to make a
recursive copy of one path to another path within the branch, without
needing to keep track of the individual file paths. The metadata
copy is performed in memory efficiently, but is implemented as a
copy-immediately operation, rather than copy-on-write.
With this new 'C' subcommand frontends could obviously implement an
'R' (rename) on their own as a combination of 'C' and 'D' (delete),
but since we have already offered up 'R' in the past and it is a
trivial thing to keep implemented I'm not going to deprecate it.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Some source material (e.g. Subversion dump files) perform directory
renames without telling us exactly which files in that subdirectory
were moved. This makes it hard for a frontend to convert such data
formats to a fast-import stream, as all the frontend has on hand
is "Rename a/ to b/" with no details about what files are in a/,
unless the frontend also kept track of all files.
The new 'R' subcommand within a commit allows the frontend to
rename either a file or an entire subdirectory, without needing to
know the object's SHA-1 or the specific files contained within it.
The rename is performed as efficiently as possible internally,
making it cheaper than a 'D'/'M' pair for a file rename.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The 'D' subcommand within a commit can also delete a directory
recursively. This wasn't clear in the prior version of the
documentation, leading to a question on the mailing list.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time. There are a few files that need
to have trailing whitespaces (most notably, test vectors). The results
still passes the test, and build result in Documentation/ area is unchanged.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
git.el: Retrieve commit log information from .dotest directory.
git.el: Avoid appending a signoff line that is already present.
setup_git_directory_gently: fix off-by-one error
user-manual: install user manual stylesheet with other web documents
user-manual: fix rendering of history diagrams
user-manual: fix missing colon in git-show example
user-manual: fix inconsistent use of pull and merge
user-manual: fix inconsistent example
glossary: fix overoptimistic automatic linking of defined terms
Documentation: s/seperator/separator/
Adjust reflog filemode in shared repository
I'm giving fast-import a lesson on how to reload the marks table
using the same format it outputs with --export-marks. This way
a frontend can reload the marks table from a prior import, making
incremental imports less painful.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
It was suggested on the mailing list that being able to use `from`
in any commit to reset the current branch is useful in some types of
importers, such as a darcs importer.
We originally did not permit resetting an existing branch with a
new `from` command during a `commit` command, but this restriction
was only to help debug the hacked up cvs2svn that Jon Smirl was
developing in parallel with git-fast-import. It is probably more
of a problem to disallow it than to allow it. So now we permit a
`from` during any `commit`.
While making the changes required to permit multiple `from`
commands on the same branch, I discovered we no longer needed the
last_commit field to be set to 0 during a reset, so that was removed.
(Reset was originally setting the field to 0 to signal cmd_from()
that it was OK to execute on the branch.)
While poking around in this section of fast-import I also realized
the `reset` command was not working as intended if the corresponding
`from` command was omitted (as allowed by the BNF grammar and the
code). If `from` was omitted we cleared out the tree but we left
the tree SHA-1 and parent commit SHA-1 intact. This is not what
the user intended in this case. Instead they would be trying to
reset the branch to have no parent and to have no tree, making the
branch look new-born during the next commit. We now clear these
SHA-1 values during `reset`, ensuring the branch looks new-born if
`from` does not get supplied.
New test cases for these were also added.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Most users don't need the pack boundary information that fast-import
was printing to standard output, especially if they were calling
it with --quiet.
Those users who do want this information probably want it captured
so they can go back and use it to repack the imported repository.
So dumping the boundary commits to a log file makes more sense then
printing them to standard output.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Its spelled 'fast-import', not 'gfi'. Linus and Dscho have both
recently pointed this out to me on the mailing list.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Minor documentation improvements, as suggested on the Git mailing
list by Horst H. von Brand and Karl Hasselström.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
I wrote this documentation with asciidoc 7.1.2, but apparently
asciidoc 8 assumes ^ means superscript. The solution was already
documented in rev-parse's manpage and is to use {caret} instead.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
There has been some informative lessons learned in the gfi user
community, and these really should be written down and documented
for future generations of frontend developers.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If the frontend asks us to checkpoint (via the explicit checkpoint
command) its probably because they are afraid the current import
will crash/fail/whatever and want to make sure they can pickup from
the last checkpoint. To do that sort of recovery, we will need the
current tip of every branch and tag available at the next startup.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Often users will be running fast-import from within a larger frontend
process, and this may be a frequent periodic tool such as a future
edition of `git-svn fetch`. We don't want to bombard users with our
large stats output if they won't be interested in it, so `--quiet`
is now an option to make gfi be more silent.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Some frontends may not be able to (easily) keep track of which files
are included in the branch, and which aren't. Performing this
tracking can be tedious and error prone for the frontend to do,
especially if its foreign data source cannot supply the changed
path list on a per-commit basis.
fast-import now allows a frontend to request that a branch's tree
be wiped clean (reset to the empty tree) at the start of a commit,
allowing the frontend to feed in all paths which belong on the branch.
This is ideal for a tar-file importer frontend, for example, as
the frontend just needs to reformat the tar data stream into a gfi
data stream, which may be something a few Perl regexps can take
care of. :)
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
As discussed on the mailing list, the documentation used here was
not quite accurate. Improve upon it.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If fast-import is being used to update an existing branch of
a repository, the user may not want to lose commits if another
process updates the same ref at the same time. For example, the
user might be using fast-import to make just one or two commits
against a live branch.
We now perform a fast-forward check during the ref updating process.
If updating a branch would cause commits in that branch to be lost,
we skip over it and display the new SHA1 to standard error.
This new default behavior can be overridden with `--force`, like
git-push and git-fetch.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Since some frontends may be working with source material where
the dates are only readily available as RFC 2822 strings, it is
more friendly if fast-import exposes Git's parse_date() function
to handle the conversion. This way the frontend doesn't need
to perform the parsing itself.
The new --date-format option to fast-import can be used by a
frontend to select which format it will supply date strings in.
The default is the standard `raw` Git format, which fast-import
has always supported. Format rfc2822 can be used to activate the
parse_date() function instead.
Because fast-import could also be useful for creating new, current
commits, the format `now` is also supported to generate the current
system timestamp. The implementation of `now` is a trivial call
to datestamp(), but is actually a whole whopping 3 lines so that
fast-import can verify the frontend really meant `now`.
As part of this change I have added validation of the `raw` date
format. Prior to this change fast-import would accept anything
in a `committer` command, even if it was seriously malformed.
Now fast-import requires the '> ' near the end of the string and
verifies the timestamp is formatted properly.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Corrected a couple of header markup lines which were shorter than the
actual header, and made the `data` commands two formats into a named
list, which matches how we document the two formats of the `M` command
within a commit.
Also tried to simplify the language about our decimal integer format;
Linus pointed out I was probably being too specific at the cost of
reduced readability.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Andy Parkins and Linus Torvalds both noticed that the description
of the timezone was incorrect. Its not expressed in minutes.
Its more like "hhmm", where "hh" is the number of hours and "mm"
is the number of minutes shifted from GMT/UTC.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The --branch-log option and its associated code hasn't been used in
several months, as its not really very useful for debugging fast-import
or a frontend. I don't plan on supporting it in this state long-term,
so I'm killing it now before it gets distributed to a wider audience.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This is a first pass at the manpage for git-fast-import.
I have tried to cover the input format in extreme detail, creating a
reference which is more detailed than the BNF grammar appearing in
the header of fast-import.c. I have also covered some details about
gfi's performance and memory utilization, as well as the average
learning curve required to create a gfi frontend application (as it
is far lower than it might appear on first glance).
The documentation still lacks real example input streams, which may
turn out to be difficult to format in asciidoc due to the blank lines
which carry meaning within the format.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>