It is unfortunate to have to issue thousands of one-byte read calls to
work around dd's refusal to buffer input that would fill a block after
a short read (a3a6f4, 2010-12-13). We could do better by using
"head -c", if it were available on all platforms we cared about.
Replace it with some simple perl.
While doing so, restructure 9300.114 to use a subshell instead of a
script. Subshells can inherit functions (like the new head_c) from
the parent shell while external scripts cannot.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jn/fast-import-blob-access:
t9300: avoid short reads from dd
t9300: remove unnecessary use of /dev/stdin
fast-import: Allow cat-blob requests at arbitrary points in stream
fast-import: let importers retrieve blobs
fast-import: clarify documentation of "feature" command
fast-import: stricter parsing of integer options
Conflicts:
fast-import.c
dd is a thin wrapper around read(2). As open group Issue 7 explains:
It shall read the input one block at a time, using the specified
input block size; it shall then process the block of data
actually returned, which could be smaller than the requested
block size.
Any short read --- for example from a pipe whose capacity cannot fill
a block --- results in that block being truncated. As a result, the
first cat-blob test (9300.114) fails on Mac OS X, where the pipe
capacity is around 8 KiB.
Fix the test by using a block size of 1. Each read will block until
the next byte of input is available.
It would be even nicer to use head -c which expresses the intention
more clearly. Alas, IRIX "head" does not support the -c option.
Reported-by: Brian Gernhardt <brian@gernhardtsoftware.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We really shouldn't be using these funny /dev/* files that did not exist
in V7 UNIX in our tests when we do not have to.
Output from
$ git grep -n -e /dev/ --and --not -e /dev/null t/
tells us that, aside from use of /dev/urandom in apache.conf used in http
tests, "dd if=/dev/stdin" added recently to t/t9300-fast-import.sh are the
only offenders, and "dd" reads from the standard input by default, so
removing them should be straightforward.
Reported-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new rule: a "cat-blob" can be inserted wherever a comment is
allowed, which means at the start of any line except in the middle of
a "data" command.
This saves frontends from having to loop over everything they want to
commit in the next commit and cat-ing the necessary objects in
advance.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
New objects written by fast-import are not available immediately.
Until a checkpoint has been started and finishes writing the pack
index, any new blobs will not be accessible using standard git tools.
So introduce a new way to access them: a "cat-blob" command in the
command stream requests for fast-import to print a blob to stdout or a
file descriptor specified by the argument to --cat-blob-fd. The value
for cat-blob-fd cannot be specified in the stream because that would
be a layering violation: the decision of where to direct a stream has
to be made when fast-import is started anyway, so we might as well
make the stream format is independent of that detail.
Output uses the same format as "git cat-file --batch".
Thanks to Sverre Rabbelier and Sam Vilain for guidance in designing
the protocol.
Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Check the result from strtoul to avoid accepting arguments like
--depth=-1 and --active-branches=foo,bar,baz.
Requested-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jn/fast-import-fix:
fast-import: do not clear notes in do_change_note_fanout()
t9300 (fast-import): another test for the "replace root" feature
fast-import: tighten M 040000 syntax
fast-import: filemodify after M 040000 <tree> "" crashes
Breaks in a test assertion's && chain can potentially hide
failures from earlier commands in the chain.
Commands intended to fail should be marked with !, test_must_fail, or
test_might_fail. The examples in this patch do not require that.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Another test for the replace root feature. One can imagine an
implementation for which R "some/subdir" "" would free some state
associated to the subdir and leave fast-import confused.
Luckily, git's is not such an implementation.
While at it, change the previous test to use C "some/subdir" ""
instead of R (i.e., test both syntaxes).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When tree_content_set() is asked to modify the path "foo/bar/",
it first recurses like so:
tree_content_set(root, "foo/bar/", sha1, S_IFDIR) ->
tree_content_set(root:foo, "bar/", ...) ->
tree_content_set(root:foo/bar, "", ...)
And as a side-effect of 2794ad5 (fast-import: Allow filemodify to set
the root, 2010-10-10), this last call is accepted and changes
the tree entry for root:foo/bar to refer to the specified tree.
That seems safe enough but let's reject the new syntax (we never meant
to support it) and make it harder for frontends to introduce pointless
incompatibilities with git fast-import 1.7.3.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Until M 040000 <tree> "" syntax was introduced in commit 2794ad5
(fast-import: Allow filemodify to set the root, 2010-10-10), it
was impossible for the root entry to refer to an unloaded tree.
Update various functions to take that possibility into account.
Otherwise
M 040000 <tree> ""
M 100644 :1 "foo"
and similar commands (using D, C, or R after resetting the root
tree) segfault.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
v1.7.3-rc0~75^2 (Teach fast-import to import subtrees named by tree id,
2010-06-30) has a shortcoming - it doesn't allow the root to be set.
Extend this behaviour by allowing the root to be referenced as the
empty path, "".
For a command (like filter-branch --subdirectory-filter) that wants
to commit a lot of trees that already exist in the object db, writing
undeltified objects as loose files only to repack them later can
involve a significant amount of overhead.
(23% slow-down observed on Linux 2.6.35, worse on Mac OS X 10.6)
Fortunately we have fast-import (which is one of the only git commands
that will write to a pack directly) but there is not an advertised way
to tell fast-import to commit a given tree without unpacking it.
This patch changes that, by allowing
M 040000 <tree id> ""
as a filemodify line in a commit to reset to a particular tree without
any need to parse it. For example,
M 040000 4b825dc642 ""
is a synonym for the deleteall command and the fast-import equivalent of
git read-tree 4b825dc642
Signed-off-by: David Barr <david.barr@cordelta.com>
Commit-message-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Sverre Rabbelier <srabbelier@gmail.com>
Tested-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fixed all places where it was a straightforward change from cd'ing into a
directory and back via "cd .." to a cd inside a subshell.
Found these places with "git grep -w "cd \.\.".
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dump_marks_helper() has a bug when dumping marks larger than 2^20-1,
i.e., when the sparse array has more than two levels. The bug was
that the 'base' counter was being shifted by 20 bits at level 3, and
then again by 10 bits at level 2, rather than a total shift of 20 bits
in this argument to the recursive call:
(base + k) << m->shift
There are two ways to fix this correctly, the elegant:
(base + k) << 10
and the one I chose due to edit distance:
base + (k << m->shift)
Signed-off-by: Raja R Harinath <harinath@hurrynot.org>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To simulate the svn cp command, it would be very useful to be
replace an arbitrary file in the current revision by an
arbitrary directory from a previous one. Modify the filemodify
command to allow that:
M 040000 <tree id> pathname
This would be most useful in combination with a facility to
print the commit ids for new revisions as they are written.
Cc: Shawn O. Pearce <spearce@spearce.org>
Cc: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* sp/maint-fast-import-large-blob:
fast-import: Stream very large blobs directly to pack
bash: don't offer remote transport helpers as subcommands
Conflicts:
fast-import.c
If a blob is larger than the configured big-file-threshold, instead
of reading it into a single buffer obtained from malloc, stream it
onto the end of the current pack file. Streaming the larger objects
into the pack avoids the 4+ GiB memory footprint that occurs when
fast-import is processing 2+ GiB blobs.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'jh/notes' (early part):
Add more testcases to test fast-import of notes
Rename t9301 to t9350, to make room for more fast-import tests
fast-import: Proper notes tree manipulation
This patch teaches 'git fast-import' to automatically organize note objects
in a fast-import stream into an appropriate fanout structure. The notes API
in notes.h is NOT used to accomplish this, because trying to keep the
fast-import and notes data structures in sync would yield a significantly
larger patch with higher complexity.
Note objects are added with the 'N' command, and accounted for with a
per-branch counter, which is used to trigger fanout restructuring when
needed. Note that when restructuring the branch tree, _any_ entry whose
path consists of 40 hex chars (not including directory separators) will
be recognized as a note object. It is therefore not advisable to
manipulate note entries with M/D/R/C commands.
Since note objects are stored in the same tree structure as other objects,
the unloading and reloading of a fast-import branches handle note objects
transparently.
This patch has been improved by the following contributions:
- Shawn O. Pearce: Several style- and logic-related improvements
Cc: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After specifying 'feature relative-marks' the paths specified with
'feature import-marks' and 'feature export-marks' are relative to an
internal directory in the current repository.
In git-fast-import this means that the paths are relative to the
'.git/info/fast-import' directory. However, other importers may use a
different location.
Add 'feature non-relative-marks' to disable this behavior, this way
it is possible to, for example, specify the import-marks location as
relative, and the export-marks location as non-relative.
Also add tests to verify this behavior.
Cc: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --import-marks= option may be specified multiple times on the
commandline and should result in all marks being read in. Only one
import-marks feature may be specified in the stream, which is
overriden by any --import-marks= commandline options.
If one wishes to specify import-marks files in addition to the one
specified in the stream, it is easy to repeat the stream option as a
--import-marks= commandline option.
Also verify this behavior with tests.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test the quiet option and verify that the commandline options
override it.
Also make sure that an unknown option command is rejected and that
non-git options are ignored.
Lastly, show that unknown options are rejected when parsed on the
commandline.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows the fronted to require a specific feature to be supported
by the backend, or abort.
Also add support for four initial feature, date-format=, force=,
import-marks=, export-marks=.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand
is similar to 'filemodify', except that no mode is supplied (all notes have
mode 0644), and the path is set to the hex SHA1 of the given "comittish".
This enables fast import of note objects along with their associated commits,
since the notes can now be named using the mark references of their
corresponding commits.
The patch also includes a test case of the added functionality.
Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though newer Porcelain tools always record the tagger information
when creating new tags, export/import pair should be able to faithfully
reproduce ancient tag objects that lack tagger information.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
* js/maint-fetch-update-head:
pull: allow "git pull origin $something:$current_branch" into an unborn branch
Fix fetch/pull when run without --update-head-ok
Conflicts:
t/t5510-fetch.sh
Some confusing tutorials suggested that it would be a good idea to fetch
into the current branch with something like this:
git fetch origin master:master
(or even worse: the same command line with "pull" instead of "fetch").
While it might make sense to store what you want to pull, it typically is
plain wrong when the current branch is "master". This should only be
allowed when (an incorrect) "git pull origin master:master" tries to work
around by giving --update-head-ok to underlying "git fetch", and otherwise
we should refuse it, but somewhere along the lines we lost that behavior.
The check for the current branch is now _only_ performed in non-bare
repositories, which is an improvement from the original behaviour.
Some newer tests were depending on the broken behaviour of "git fetch"
this patch fixes, and have been adjusted.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Acked-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also use "git hash-object" and "git rev-parse" without dash.
Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many test scripts assumed that they will start in a 'trash' subdirectory
that is a single level down from the t/ directory, and referred to their
test vector files by asking for files like "../t9999/expect". This will
break if we move the 'trash' subdirectory elsewhere.
To solve this, we earlier introduced "$TEST_DIRECTORY" so that they can
refer to t/ directory reliably. This finally makes all the tests use
it to refer to the outside environment.
With this patch, and a one-liner not included here (because it would
contradict with what Dscho really wants to do):
| diff --git a/t/test-lib.sh b/t/test-lib.sh
| index 70ea7e0..60e69e4 100644
| --- a/t/test-lib.sh
| +++ b/t/test-lib.sh
| @@ -485,7 +485,7 @@ fi
| . ../GIT-BUILD-OPTIONS
|
| # Test repository
| -test="trash directory"
| +test="trash directory/another level/yet another"
| rm -fr "$test" || {
| trap - exit
| echo >&5 "FATAL: Cannot prepare test area"
all the tests still pass, but we would want extra sets of eyeballs on this
type of change to really make sure.
[jc: with help from Stephan Beyer on http-push tests I do not run myself;
credits for locating silly quoting errors go to Olivier Marin.]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently fast-import/export cannot be used for
repositories with submodules. This patch extends
the relevant programs to make them correctly
process gitlinks.
Links can be represented by two forms of the
Modify command:
M 160000 SHA1 some/path
which sets the link target explicitly, or
M 160000 :mark some/path
where the mark refers to a commit. The latter
form can be used by importing tools to build
all submodules simultaneously in one physical
repository, and then simply fetch them apart.
Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch changes every occurrence of "! git" -- with the meaning
that a git call has to gracefully fail -- into "test_must_fail git".
This is useful to
- make sure the test does not fail because of a signal,
e.g. SIGSEGV, and
- advertise the use of "test_must_fail" for new tests.
Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As a general principle, we should not use "git diff" to validate the
results of what git command that is being tested has done. We would not
know if we are testing the command in question, or locating a bug in the
cute hack of "git diff --no-index".
Rather use test_cmp for that purpose.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
GIT 1.5.4.4
ident.c: reword error message when the user name cannot be determined
Fix dcommit, rebase when rewriteRoot is in use
Really make the LF after reset in fast-import optional
cmd_from() ends with a call to read_next_command(), which is needed
when using cmd_from() from commands where from is not the last element.
With reset, however, "from" is the last command, after which the flow
returns to the main loop, which calls read_next_command() again.
Because of this, always set unread_command_buf in cmd_reset_branch(),
even if cmd_from() was successful.
Add a test case for this in t9300-fast-import.sh.
Signed-off-by: Adeodato Simó <dato@net.com.org.es>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally, test_expect_failure was designed to be the opposite
of test_expect_success, but this was a bad decision. Most tests
run a series of commands that leads to the single command that
needs to be tested, like this:
test_expect_{success,failure} 'test title' '
setup1 &&
setup2 &&
setup3 &&
what is to be tested
'
And expecting a failure exit from the whole sequence misses the
point of writing tests. Your setup$N that are supposed to
succeed may have failed without even reaching what you are
trying to test. The only valid use of test_expect_failure is to
check a trivial single command that is expected to fail, which
is a minority in tests of Porcelain-ish commands.
This large-ish patch rewrites all uses of test_expect_failure to
use test_expect_success and rewrites the condition of what is
tested, like this:
test_expect_success 'test title' '
setup1 &&
setup2 &&
setup3 &&
! this command should fail
'
test_expect_failure is redefined to serve as a reminder that
that test *should* succeed but due to a known breakage in git it
currently does not pass. So if git-foo command should create a
file 'bar' but you discovered a bug that it doesn't, you can
write a test like this:
test_expect_failure 'git-foo should create bar' '
rm -f bar &&
git foo &&
test -f bar
'
This construct acts similar to test_expect_success, but instead
of reporting "ok/FAIL" like test_expect_success does, the
outcome is reported as "FIXED/still broken".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing checkpoint command is very useful to force fast-import
to dump the branches out to disk so that standard Git tools can
access them and the objects they refer to. However there was not a
way to know when fast-import had finished executing the checkpoint
and it was safe to read those refs.
The progress command can be used to make fast-import output any
message of the frontend's choosing to standard out. The frontend
can scan for these messages using select() or poll() to monitor a
pipe connected to the standard output of fast-import.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
For the same reasons as the prior change we want to allow frontends
to omit the trailing LF that usually delimits commands. In some
cases these just make the input stream more verbose looking than
it needs to be, and its just simpler for the frontend developer to
get started if our parser is slightly more lenient about where an
LF is required and where it isn't.
To make this optional LF feature work we now have to buffer up to one
line of input in command_buf. This buffering can happen if we look
at the current input command but don't recognize it at this point
in the code. In such a case we need to "unget" the entire line,
but we cannot depend upon the stdio library to let us do ungetc()
for that many characters at once.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
A few fast-import frontend developers have found it odd that we
require the LF following a `data` command, especially in the exact
byte count format. Technically we don't need this LF to parse
the stream properly, but having it here does make the stream more
readable to humans. We can easily make the LF optional by peeking
at the next byte available from the stream and pushing it back into
the buffer if its not LF.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Several frontend developers have asked that some form of stream
comments be permitted within a fast-import data stream. This way
they can include information from their own frontend program about
where specific data was taken from in the source system, or about
a decision that their frontend may have made while creating the
fast-import data stream.
This change introduces comments in the Bourne-shell/Tcl/Perl style.
Lines starting with '#' are ignored, up to and including the LF.
Unlike the above mentioned three languages however we do not look for
and ignore leading whitespace. This just simplifies the definition
of the comment format and the code that parses them.
To make comments work we had to stop using read_next_command() within
cmd_data() and directly invoke read_line() during the inline variant
of the function. This is necessary to retain any lines of the
input data that might otherwise look like a comment to fast-import.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Michael Haggerty <mhagger@alum.mit.edu> noticed while debugging a
Git backend for cvs2svn that fast-import was barfing when he tried
to use "TAG_FIXUP" as a branch name for temporary work needed to
cleanup the tree prior to creating an annotated tag object.
The reason we were rejecting the branch name was check_ref_format()
returns -2 when there are less than 2 '/' characters in the input
name. TAG_FIXUP has 0 '/' characters, but is technically just as
valid of a ref as HEAD and MERGE_HEAD, so we really should permit it
(and any other similar looking name) during import.
New test cases have been added to make sure we still detect very
wrong branch names (e.g. containing [ or starting with .) and yet
still permit reasonable names (e.g. TAG_FIXUP).
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The tree recursion behavior of git-diff may appear
inconsistent to the user because it depends on the format of
the patch as well as whether one is diffing between trees or
against the index.
Since git-diff is a porcelain wrapper for low-level diff
commands, it makes sense for its behavior to be consistent
no matter what is being diffed. This patch turns on
recursion in all cases.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some source material (e.g. Subversion dump files) perform directory
renames by telling us the directory was copied, then deleted in the
same revision. This makes it difficult for a frontend to convert
such data formats to a fast-import stream, as all the frontend has
on hand is "Copy a/ to b/; Delete a/" with no details about what
files are in a/, unless the frontend also kept track of all files.
The new 'C' subcommand within a commit allows the frontend to make a
recursive copy of one path to another path within the branch, without
needing to keep track of the individual file paths. The metadata
copy is performed in memory efficiently, but is implemented as a
copy-immediately operation, rather than copy-on-write.
With this new 'C' subcommand frontends could obviously implement an
'R' (rename) on their own as a combination of 'C' and 'D' (delete),
but since we have already offered up 'R' in the past and it is a
trivial thing to keep implemented I'm not going to deprecate it.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Some source material (e.g. Subversion dump files) perform directory
renames without telling us exactly which files in that subdirectory
were moved. This makes it hard for a frontend to convert such data
formats to a fast-import stream, as all the frontend has on hand
is "Rename a/ to b/" with no details about what files are in a/,
unless the frontend also kept track of all files.
The new 'R' subcommand within a commit allows the frontend to
rename either a file or an entire subdirectory, without needing to
know the object's SHA-1 or the specific files contained within it.
The rename is performed as efficiently as possible internally,
making it cheaper than a 'D'/'M' pair for a file rename.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>