"git archive" learned to handle files that are larger than 8GB and
commits far in the future than expressible by the traditional US-TAR
format.
* jk/big-and-future-archive-tar:
archive-tar: drop return value
archive-tar: write extended headers for far-future mtime
archive-tar: write extended headers for file sizes >= 8GB
t5000: test tar files that overflow ustar headers
t9300: factor out portable "head -c" replacement
It is sometimes useful to be able to read exactly N bytes from a
pipe. Doing this portably turns out to be surprisingly difficult
in shell scripts.
We want a solution that:
- is portable
- never reads more than N bytes due to buffering (which
would mean those bytes are not available to the next
program to read from the same pipe)
- handles partial reads by looping until N bytes are read
(or we see EOF)
- is resilient to stray signals giving us EINTR while
trying to read (even though we don't send them, things
like SIGWINCH could cause apparently-random failures)
Some possible solutions are:
- "head -c" is not portable, and implementations may
buffer (though GNU head does not)
- "read -N" is a bash-ism, and thus not portable
- "dd bs=$n count=1" does not handle partial reads. GNU dd
has iflags=fullblock, but that is not portable
- "dd bs=1 count=$n" fixes the partial read problem (all
reads are 1-byte, so there can be no partial response).
It does make a lot of write() calls, but for our tests
that's unlikely to matter. It's fairly portable. We
already use it in our tests, and it's unlikely that
implementations would screw up any of our criteria. The
most unknown one would be signal handling.
- perl can do a sysread() loop pretty easily. On my Linux
system, at least, it seems to restart the read() call
automatically. If that turns out not to be portable,
though, it would be easy for us to handle it.
That makes the perl solution the least bad (because we
conveniently omitted "length of code" as a criterion).
It's also what t9300 is currently using, so we can just pull
the implementation from there.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fast-import" learned the same performance trick to avoid
creating too small a packfile as "git fetch" and "git push" have,
using *.unpackLimit configuration.
* ew/fast-import-unpack-limit:
fast-import: invalidate pack_id references after loosening
fast-import: implement unpack limit
Certain lines of the marks file might be corrupted (or the objects
missing due to a garbage collection), but that's no reason to truncate
the file and essentially destroy the rest of it.
Ideally missing objects should not cause a crash, we could just skip
them, but that's another patch.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With many incremental imports, small packs become highly
inefficient due to the need to readdir scan and load many
indices to locate even a single object. Frequent repacking and
consolidation may be prohibitively expensive in terms of disk
I/O, especially in large repositories where the initial packs
were aggressively optimized and marked with .keep files.
In those cases, users may be better served with loose objects
and relying on "git gc --auto".
This changes the default behavior of fast-import for small
imports found in test cases, so adjustments to t9300 were
necessary.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
strbuf_getwholeline() did not NUL-terminate the buffer on certain
corner cases in its error codepath.
* jk/getwholeline-getdelim-empty:
strbuf_getwholeline: NUL-terminate getdelim buffer on error
Commit 0cc30e0 (strbuf_getwholeline: use getdelim if it is
available, 2015-04-16) tries to clean up after getdelim()
returns EOF, but gets one case wrong, which can lead in some
obscure cases to us reading uninitialized memory.
After getdelim() returns -1, we re-initialize the strbuf
only if sb->buf is NULL. The thinking was that either:
1. We fed an existing allocated buffer to getdelim(), and
at most it would have realloc'd, leaving our NUL in
place.
2. We didn't have a buffer to feed, so we gave getdelim()
NULL; sb->buf will remain NULL, and we just want to
restore the empty slopbuf.
But that second case isn't quite right. getdelim() may
allocate a buffer, write nothing into it, and then return
EOF. The resulting strbuf rightfully has sb->len set to "0",
but is missing the NUL terminator in the first byte.
Most call-sites are fine with this. They see the EOF and
don't bother looking at the strbuf. Or they notice that
sb->len is empty, and don't look at the contents. But
there's at least one case that does neither, and relies on
parsing the resulting (possibly zero-length) string:
fast-import. You can see this in action with the new test
(though we probably only notice failure there when run with
--valgrind or ASAN).
We can fix this by unconditionally resetting the strbuf when
we have a buffer after getdelim(). That fixes case 2 above.
Case 1 is probably already fine in practice, but it does not
hurt for us to re-assert our invariants (especially because
we are relying on whatever getdelim() happens to do, which
may vary from platform to platform). Our fix covers that
case, too.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Git CodingGuidelines prefer the $(...) construct for command
substitution instead of using the backquotes `...`.
The backquoted form is the traditional method for command
substitution, and is supported by POSIX. However, all but the
simplest uses become complicated quickly. In particular, embedded
command substitutions and/or the use of double quotes require
careful escaping with the backslash character.
The patch was generated by:
for _f in $(find . -name "*.sh")
do
perl -i -pe 'BEGIN{undef $/;} s/`(.+?)`/\$(\1)/smg' "${_f}"
done
and then carefully proof-read.
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our usual style these days is to execute everything inside
test_expect_success. Make it so.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Jeff King <peff@peff.net>
In the next commit, we will indent test case preparations. This will
require that here-documents ignore the tab indentation. Prepare for
this change by marking the here-doc words accordingly. This does not
have an effect now, but will remove some noise from the git diff -b
output of the next commit.
The change here is entirely automated with this perl command:
perl -i -lpe 's/(cat.*<<) *((EOF|(EXPECT|INPUT)_END).*$)/$1-$2 &&/' t/t9300-fast-import.sh
i.e., inserts a dash between << and the EOF word (and removes blanks
that our style guide abhors) and appends the && that will become
necessary.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Jeff King <peff@peff.net>
A number of clean-ups of test cases are performed outside of
test_expect_success. Replace these cases by using test_when_finished.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Jeff King <peff@peff.net>
It is customary to have each command in test snippets on its own line.
Fix those instances that do not follow this guideline.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Jeff King <peff@peff.net>
Instead of comparing actual output to an empty file, use
test_must_be_empty. In addition to the better error message provided by
the helper, allocation of an empty file during the setup sequence can be
avoided.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Jeff King <peff@peff.net>
One test case open-codes a test for an expected failure. Replace it by
test_must_fail.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Jeff King <peff@peff.net>
Many test cases do not follow our modern style that places the
single-quotes that surround the shell code snippets before and after
the shell code. Make it so.
Many of the lines changed in this way are indented other than by a
single tab. Change them (and some additional lines) to be indented
with a tab.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Jeff King <peff@peff.net>
It is sometimes useful for importers to be able to read the SHA-1
corresponding to a mark that they have created via fast-import. For
example, they might want to embed the SHA-1 into the commit message of
a later commit. Or it might be useful for internal bookkeeping uses,
or for logging.
Add a "get-mark" command to "git fast-import" that allows the importer
to ask for the value of a mark that has been created earlier.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These test scripts likely predate test_must_fail, and can be
made simpler by using it (in addition to making them pass
--chain-lint).
The case in t6036 loses some verbosity in the failure case,
but it is so tied to a specific failure mode that it is not
worth keeping around (and the outcome of the test is not
affected at all).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These are tests which are missing a link in their &&-chain,
but during a setup phase. We may fail to notice failure in
commands that build the test environment, but these are
typically not expected to fail at all (but it's still good
to double-check that our test environment is what we
expect).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These are tests which are missing a link in their &&-chain,
in a location which causes a significant portion of the test
to be missed (e.g., the test effectively does nothing, or
consists of a long string of actions and output comparisons,
and we throw away the exit code of at least one part of the
string).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test clean-up.
* jc/diff-test-updates:
test_ln_s_add: refresh stat info of fake symbolic links
t4008: modernise style
t/diff-lib: check exact object names in compare_diff_raw
tests: do not borrow from COPYING and README from the real source
t4010: correct expected object names
t9300: correct expected object names
t4008: correct stale comments
The output the test #36 expects is bogus. There are no blob objects
whose names are 36a590... or 046d037... when this test was run.
It was left unnoticed only because compare_diff_raw, which only
cares about the add/delete/rename/copy was used to check the result.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no straightforward way to grep for all tests dealing with
invalid refs. Put them in a single test script so it is easy to see
what functionality has not been exercised with bad ref names yet.
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An attempt to remove the entire tree in the "git fast-import" input
stream caused it to misbehave.
* mb/fast-import-delete-root:
fast-import: fix segfault in store_tree()
t9300: test filedelete command
test_cmp is intended to produce diff output for human consumption. The
input in one instance in t9300-fast-import.sh are binary files, however.
Use test_cmp_bin to compare the files.
This was noticed because on Windows we have a special implementation of
test_cmp in pure bash code (to ignore differences due to intermittent CR
in actual output), and bash runs into an infinite loop due to the binary
nature of the input.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Branch tree is NULLified by filedelete command if we are trying
to delete root tree. Add sanity check and use load_tree() in that case.
Signed-off-by: Maxim Bublis <satori@yandex-team.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add new fast-import test series for filedelete command.
Signed-off-by: Maxim Bublis <satori@yandex-team.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Support for Back when bdccd3c1 (test-lib: allow negation of
prerequisites, 2012-11-14) introduced negated predicates
(e.g. "!MINGW,!CYGWIN"), we already had 5 test files that use
NOT_MINGW (and a few MINGW) as prerequisites.
Let's not add NOT_FOO and rewrite existing ones as !FOO for both
MINGW and CYGWIN.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As of the last commit, we can use "perl" instead of
"$PERL_PATH" when running tests, as the former is now a
function which uses the latter. As the shorter "perl" is
easier on the eyes, let's switch to using it everywhere.
This is not quite a mechanical s/$PERL_PATH/perl/
replacement, though. There are some places where we invoke
perl from a script we generate on the fly, and those scripts
do not have access to our internal shell functions. The
result can be double-checked by running:
ln -s /bin/false bin-wrappers/perl
make test
which continues to pass even after this patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We liberally use "committish" and "commit-ish" (and "treeish" and
"tree-ish"); as these are non-words, let's unify these terms to
their dashed form. More importantly, clarify the documentation on
object peeling using these terms.
* rh/ishes-doc:
glossary: fix and clarify the definition of 'ref'
revisions.txt: fix and clarify <rev>^{<type>}
glossary: more precise definition of tree-ish (a.k.a. treeish)
use 'commit-ish' instead of 'committish'
use 'tree-ish' instead of 'treeish'
glossary: define commit-ish (a.k.a. committish)
glossary: mention 'treeish' as an alternative to 'tree-ish'
Replace 'committish' in documentation and comments with 'commit-ish'
to match gitglossary(7) and to be consistent with 'tree-ish'.
The only remaining instances of 'committish' are:
* variable, function, and macro names
* "(also committish)" in the definition of commit-ish in
gitglossary[7]
Signed-off-by: Richard Hansen <rhansen@bbn.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because fast-import.c::tree_content_remove does not check for the empty
path, it is not possible to move the root tree to a subdirectory.
Instead the error "Path not in branch" is produced (note the double
space where the empty path has been inserted).
Fix this by explicitly checking for the empty path and handling it.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 178e1de (fast-import: don't allow 'ls' of path with empty
components, 2012-03-09) restricted paths which:
. contain an empty directory component (e.g. foo//bar is invalid),
. end with a directory separator (e.g. foo/ is invalid),
. start with a directory separator (e.g. /foo is invalid).
However, the implementation also caught the empty path, which should
represent the root tree. Relax this restriction so that the empty path
is explicitly allowed and refers to the root tree.
Reported-by: Dave Abrahams <dave@boostpro.com>
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When given an empty path, fast-import sometimes reports "missing"
instead of using the root tree object. On top of this, for "ls" and
file copy (but not move) it dies with "Empty path component found in
input".
Document this behaviour with failing test cases.
Reported-by: Dave Abrahams <dave@boostpro.com>
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'PIPE' test prerequisite was already defined identically by t9010
and t9300, therefore it makes sense to make it a predefined
prerequisite.
Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some implementations of Perl terminates "lines" with CRLF even when
the script is operating on just a sequence of bytes. Make sure to
use "$PERL_PATH", the version of Perl the user told Git to use, in
our tests to avoid unnecessary breakages in tests.
* vr/use-our-perl-in-tests:
t/README: add a bit more Don'ts
tests: enclose $PERL_PATH in double quotes
t/test-lib.sh: export PERL_PATH for use in scripts
t: Replace 'perl' by $PERL_PATH
Otherwise it will be split at a space after "Program" when it is set
to "\\Program Files\perl" or something silly like that.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GIT-BUILD-OPTIONS defines PERL_PATH to be used in the test suite. Only a
few tests already actually use this variable when perl is needed. The
other test just call 'perl' and it might happen that the wrong perl
interpreter is used.
This becomes problematic on Windows, when the perl interpreter that is
compiled and installed on the Windows system is used, because this perl
interpreter might introduce some unexpected LF->CRLF conversions.
This patch makes sure that $PERL_PATH is used everywhere in the test suite
and that the correct perl interpreter is used.
Signed-off-by: Vincent van Ravesteijn <vfr@lyx.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We fairly consistently say "superproject" and never "supermodule" these
days. But there are seven occurrences of "supermodule" left in the current
work tree. Three appear in Release Notes for 1.5.3 and 1.7.7, three in
test names and one in a C-code comment.
Replace all occurrences of "supermodule" outside of the Release Notes
(which shouldn't be changed after the fact) with "superproject" for
consistency.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Exiting from a for-loop early using '|| break' does not propagate the
failure code, and for this reason, the tests used just 'exit'. But this
ends the test script with 'FATAL: Unexpected exit code 1' in the case of
a failed test.
Fix this by moving the loop into a shell function, from which we can
simply return early.
While at it, modernize the style of the affected test cases.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The syntax for the use of mark references in fast-import
demands either a SP (space) or LF (end-of-line) after
a mark reference. Fast-import does not complain when garbage
appears after a mark reference in some cases.
Factor out parsing of mark references and complain if
errant characters are found. Also be a little more careful
when parsing "inline" and SHA1s, complaining if extra
characters appear or if the form of the dataref is unrecognized.
Buggy input can cause fast-import to produce the wrong output,
silently, without error. This makes it difficult to track
down buggy generators of fast-import streams. An example is
seen in the last line of this commit command:
commit refs/heads/S2
committer Name <name@example.com> 1112912893 -0400
data <<COMMIT
commit message
COMMIT
from :1M 100644 :103 hello.c
It is missing a newline and should be:
[...]
from :1
M 100644 :103 hello.c
What fast-import does is to produce a commit with the same
contents for hello.c as in refs/heads/S2^. What the buggy
program was expecting was the contents of blob :103. While
the resulting commit graph looked correct, the contents in
some commits were wrong.
Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrew Sayers noticed that the svn-fe | git fast-import pipeline
mishandles a subversion history that copies the root directory to a
sub-directory (e.g. doing `svn cp . trunk` to standardise your
layout). As David Barr explained, the bug arises when the following
command is sent to git fast-import:
'ls' SP ':1' SP LF
Instead of reading back what is at the root of r1, it unconditionally
reports the path as missing.
After sleeping on it, here are two patches for 'maint'. One plugs a
memory leak. The other ensures that trying to pass an empty path to
the 'ls' command results in an error message that can help the
frontend author instead of the silently broken conversion Andrew
found.
Then we can carefully add 'ls ""' support in 1.7.11.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIbBAABCAAGBQJPWtgPAAoJEN/Gce6zM/olpU4P92DnurNCuUOsCIOVvEHT7YC+
JdBou7wDKug2U0XH4yvTCglJZQ+zWpoobPoJlGu4c+RPWYGoUnOvc2jUmhFGD2r4
jLlNhq3Uob4CSdEA8JLz8IOvgW41kc2LuUsZrr19dC3eS/19U72NQpD5nA958wZw
xmvcu1DjhrsYHSdh4azg0Ccsp3PzesssCGJSckKoYpOZSlbznK3Dyn2KnLPZvAmx
fF6uyPAZXKsKDOyVUPczIQ+mitb5YhOa3eFDcZk/EBM1LUCqbE0qmFyPhFqtE2pP
f0BcG2MwXc6S8FIRTSkzxV3WP8Cl5+3Y3LX+aB9Y5Zl7Vx1LBdDEhEZS0DkJFW1u
c/64Ge89FD/mSKpJfu8iZdga+qalyF1u5fPPFvOKV3p6UO1ou1RuIVEDMUhMzPhi
244i7VnaPq232aMr8Jn4eAnpg+mZpOdDuqsJn4Q/BOpJ2D5UmKzja4LUZk2tkmgh
EzGYLbs//TZXKjeNGH1rBdbW9lO0fU8oGRaTKiRwWTQVAJ48hUlOYfMBsWvUnvI6
A5ar/iT8UthEc54OZIkBefjpFrwIXl3wG1iafDY8Z1rcnGsbH9MpNFLFxcoGQTBJ
9A6ZyqL7dEKg2SWfESXf93xkF/fFZYl+0jFm+VNuJUzkrsvoi4tkxs/3qxaE74Kb
zQtYbp/z+KWFh4eg830=
=wrup
-----END PGP SIGNATURE-----
Merge "two fixes for fast-import's 'ls' command" from Jonathan
Andrew Sayers noticed that the svn-fe | git fast-import pipeline
mishandles a subversion history that copies the root directory to a
sub-directory (e.g. doing `svn cp . trunk` to standardise your
layout). As David Barr explained, the bug arises when the following
command is sent to git fast-import:
'ls' SP ':1' SP LF
Instead of reading back what is at the root of r1, it unconditionally
reports the path as missing.
After sleeping on it, here are two patches for 'maint'. One plugs a
memory leak. The other ensures that trying to pass an empty path to
the 'ls' command results in an error message that can help the
frontend author instead of the silently broken conversion Andrew
found.
Then we can carefully add 'ls ""' support in 1.7.11.
* commit 'refs/pull-request-tags/jn/maint-fast-import-empty-ls':
fast-import: don't allow 'ls' of path with empty components
fast-import: leakfix for 'ls' of dirty trees
As the fast-import manual explains:
The value of <path> must be in canonical form. That is it must
not:
. contain an empty directory component (e.g. foo//bar is invalid),
. end with a directory separator (e.g. foo/ is invalid),
. start with a directory separator (e.g. /foo is invalid),
Unfortunately the "ls" command accepts these invalid syntaxes and
responds by declaring that the indicated path is missing. This is too
subtle and causes importers to silently misbehave; better to error out
so the operator knows what's happening.
The C, R, and M commands already error out for such paths.
Reported-by: Andrew Sayers <andrew-git@pileofstuff.org>
Analysis-by: David Barr <davidbarr@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
As diagnosed by Johannes Sixt, msys.dll does not hand through file
descriptors > 2 to child processes, so these test cases cannot passes when
run through an MSys bash.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'reset' command makes fast-import start a branch from scratch. It's name
is kept in lookup table but it's sha1 is null_sha1 (special value).
'notemodify' command can be used to add a note on branch head given it's
name. lookup_branch() is used it that case and it doesn't check for
null_sha1. So fast-import writes a note for null_sha1 object instead of
giving a error.
Add a check to deny adding a note on empty branch and add a corresponding
test.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'reset' command makes fast-import start a branch from scratch. It's name
is kept in lookup table but it's sha1 is null_sha1 (special value).
'tag' command can be used to tag a branch by it's name. lookup_branch()
is used it that case and it doesn't check for null_sha1. So fast-import
writes a tag for null_sha1 object instead of giving a error.
Add a check to deny tagging an empty branch and add a corresponding test.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* di/fast-import-ident:
fsck: improve committer/author check
fsck: add a few committer name tests
fast-import: check committer name more strictly
fast-import: don't fail on omitted committer name
fast-import: add input format tests
fast-import allows to tag objects by sha1 and to query sha1 of objects
being imported. So it should allow to tag these objects, make it do so.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-import allows to create an annotated tag that annotates a blob,
via mark or direct sha1 specification.
For mark it works, for sha1 it tries to read the object. It tries to
do so via read_sha1_file, and then checks the size to be at least 46.
That's weird, let's just allow to (annotated) tag any object referenced
by sha1. If the object originates from our packfile, we still fail though.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-import command-line option --import-marks-if-exists was introduced
in commit dded4f1 (fast-import: Introduce --import-marks-if-exists, 2011-01-15)
--import-marks option can be set via a "feature" command in a fast-import
stream and --import-marks-if-exists had support for such specification
from the very beginning too due to some shared codebase. Though the
documentation for this feature wasn't written in dded4f1.
Add the documentation for "feature import-marks-if-exists=<file>". Also add
a minimalistic test for it.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To produce deltas for tree objects fast-import tracks two versions
of tree's entries - base and current one. Base version stands both
for a delta base of this tree, and for a entry inside a delta base
of a parent tree. So care should be taken to keep it in sync.
tree_content_set cuts away a whole subtree and replaces it with a
new one (or NULL for lazy load of a tree with known sha1). It
keeps a base sha1 for this subtree (needed for parent tree). And
here is the problem, 'subtree' tree root doesn't have the implied
base version entries.
Adjusting the subtree to include them would mean a deep rewrite of
subtree. Invalidating the subtree base version would mean recursive
invalidation of parents' base versions. So just mark this tree as
do-not-delta me. Abuse setuid bit for this purpose.
tree_content_replace is the same as tree_content_set except that is
is used to replace the root, so just clearing base sha1 here (instead
of setting the bit) is fine.
[di: log message]
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-import is able to write imported tree objects in delta format.
It holds a tree structure in memory where each tree entry may have
a delta base sha1 assigned. When delta base data is needed it is
reconstructed from this in-memory structure. Though sometimes the
delta base data doesn't match the delta base sha1 so wrong or even
corrupt pack is produced.
Add a small test that produces a corrupt pack. It uses just tree
copy and file modification commands aside from the very basic commit
and blob commands.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation declares following identity format:
(<name> SP)? LT <email> GT
where name is any string without LF and LT characters.
But fast-import just accepts any string up to first GT
instead of checking the whole format, and moreover just
writes it as is to the commit object.
git-fsck checks for [^<\n]* <[^<>\n]*> format. Note that the
space is mandatory. And the space quirk is already handled via
extending the string to the left when needed.
Modify fast-import input identity format to a slightly stricter
one - deny LF, LT and GT in both <name> and <email>. And check
for it.
This is stricter then git-fsck as fsck accepts "Name> <email>"
currently, but soon fsck check will be adjusted likewise.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-import format declares 'committer_name SP' to be optional in
'committer_name SP LT email GT'. But for a (commit) object SP is
obligatory while zero length committer_name is ok. git-fsck checks
that SP is present, so fast-import must prepend it if the name SP
part is omitted. It doesn't do so and thus for "LT email GT" ident
it writes a bad object.
Name cannot contain LT or GT, ident always comes after SP in fast-import.
So if ident starts with LT reuse the SP as if a valid 'SP LT email GT'
ident was passed.
This fixes a ident parsing bug for a well-formed fast-import input.
Though the parsing is still loose and can accept a ill-formed input.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation/git-fast-import.txt says that git-fast-import is strict
about it's input format. But committer/author field parsing is a bit
loose. Invalid values can be unnoticed and written out to the commit,
either with format-conforming input or with non-format-conforming one.
Add one passing and one failing test for empty/absent committer name
with well-formed input. And a failed test with unnoticed ill-formed
input.
Reported-by: SASAKI Suguru <sss.sonik@gmail.com>
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* sr/transport-helper-fix: (21 commits)
transport-helper: die early on encountering deleted refs
transport-helper: implement marks location as capability
transport-helper: Use capname for refspec capability too
transport-helper: change import semantics
transport-helper: update ref status after push with export
transport-helper: use the new done feature where possible
transport-helper: check status code of finish_command
transport-helper: factor out push_update_refs_status
fast-export: support done feature
fast-import: introduce 'done' command
git-remote-testgit: fix error handling
git-remote-testgit: only push for non-local repositories
remote-curl: accept empty line as terminator
remote-helpers: export GIT_DIR variable to helpers
git_remote_helpers: push all refs during a non-local export
transport-helper: don't feed bogus refs to export push
git-remote-testgit: import non-HEAD refs
t5800: document some non-functional parts of remote helpers
t5800: use skip_all instead of prereq
t5800: factor out some ref tests
...
Add a 'done' command that causes fast-import to stop reading from the
stream and exit.
If the new --done command line flag was passed on the command line
(or a "feature done" declaration included at the start of the stream),
make the 'done' command mandatory. So "git fast-import --done"'s
input format will be prefix-free, making errors easier to detect when
they show up as early termination at some convenient time of the
upstream of a pipe writing to fast-import.
Another possible application of the 'done' command would to be allow a
fast-import stream that is only a small part of a larger encapsulating
stream to be easily parsed, leaving the file offset after the "done\n"
so the other application can pick up from there. This patch does not
teach fast-import to do that --- fast-import still uses buffered input
(stdio).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs.c had a error message "Trying to write ref with nonexistant object".
And no tests relied on the wrong spelling.
Also typo was present in some test scripts internals, these tests still pass.
Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lazy fast-import frontend authors that want to rely on the backend to
keep track of the content of the imported trees _almost_ have what
they need in the 'cat-blob' command (v1.7.4-rc0~30^2~3, 2010-11-28).
But it is not quite enough, since
(1) cat-blob can be used to retrieve the content of files, but
not their mode, and
(2) using cat-blob requires the frontend to keep track of a name
(mark number or object id) for each blob to be retrieved
Introduce an 'ls' command to complement cat-blob and take care of the
remaining needs. The 'ls' command finds what is at a given path
within a given tree-ish (tag, commit, or tree):
'ls' SP <dataref> SP <path> LF
or in fast-import's active commit:
'ls' SP <path> LF
The response is a single line sent through the cat-blob channel,
imitating ls-tree output. So for example:
FE> ls :1 Documentation
gfi> 040000 tree 9e6c2b599341d28a2a375f8207507e0a2a627fe9 Documentation
FE> ls 9e6c2b599341d28a2a375f8207507e0a2a627fe9 git-fast-import.txt
gfi> 100644 blob 4f92954396e3f0f97e75b6838a5635b583708870 git-fast-import.txt
FE> ls :1 RelNotes
gfi> 120000 blob b942e49944 RelNotes
FE> cat-blob b942e49944
gfi> b942e49944 blob 32
gfi> Documentation/RelNotes/1.7.4.txt
The most interesting parts of the reply are the first word, which is
a 6-digit octal mode (regular file, executable, symlink, directory,
or submodule), and the part from the second space to the tab, which is
a <dataref> that can be used in later cat-blob, ls, and filemodify (M)
commands to refer to the content (blob, tree, or commit) at that path.
If there is nothing there, the response is "missing some/path".
The intent is for this command to be used to read files from the
active commit, so a frontend can apply patches to them, and to copy
files and directories from previous revisions.
For example, proposed updates to svn-fe use this command in place of
its internal representation of the repository directory structure.
This simplifies the frontend a great deal and means support for
resuming an import in a separate fast-import run (i.e., incremental
import) is basically free.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Sverre Rabbelier <srabbelier@gmail.com>
* maint:
rebase -i: clarify in-editor documentation of "exec"
tests: sanitize more git environment variables
fast-import: treat filemodify with empty tree as delete
rebase: give a better error message for bogus branch
rebase: use explicit "--" with checkout
Conflicts:
t/t9300-fast-import.sh
Normal git processes do not allow one to build a tree with an empty
subtree entry without trying hard at it. This is in keeping with the
general UI philosophy: git tracks content, not empty directories.
v1.7.3-rc0~75^2 (2010-06-30) changed that by making it easy to include
an empty subtree in fast-import's active commit:
M 040000 4b825dc642 subdir
One can trigger this by reading an empty tree (for example, the tree
corresponding to an empty root commit) and trying to move it to a
subtree. It is better and more closely analogous to 'git read-tree
--prefix' to treat such commands as requests to remove the subtree.
Noticed-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a frontend uses a marks file to ensure its state persists between
runs, it may represent "clean slate" when bootstrapping with "no marks
yet". In such a case, feeding the last state with --import-marks and
saving the state after the current run with --export-marks would be a
natural thing to do.
The --import-marks option however errors out when the specified marks file
doesn't exist; this makes bootstrapping a bit difficult. The location of
the marks file becomes backend-dependent when --relative-marks is in
effect, and the frontend cannot check for the existence of the file in
such a case.
The --import-marks-if-exists option does the same thing as --import-marks
but does not flag an error if the named file does not exist yet to help
these frontends.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is unfortunate to have to issue thousands of one-byte read calls to
work around dd's refusal to buffer input that would fill a block after
a short read (a3a6f4, 2010-12-13). We could do better by using
"head -c", if it were available on all platforms we cared about.
Replace it with some simple perl.
While doing so, restructure 9300.114 to use a subshell instead of a
script. Subshells can inherit functions (like the new head_c) from
the parent shell while external scripts cannot.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jn/fast-import-blob-access:
t9300: avoid short reads from dd
t9300: remove unnecessary use of /dev/stdin
fast-import: Allow cat-blob requests at arbitrary points in stream
fast-import: let importers retrieve blobs
fast-import: clarify documentation of "feature" command
fast-import: stricter parsing of integer options
Conflicts:
fast-import.c
dd is a thin wrapper around read(2). As open group Issue 7 explains:
It shall read the input one block at a time, using the specified
input block size; it shall then process the block of data
actually returned, which could be smaller than the requested
block size.
Any short read --- for example from a pipe whose capacity cannot fill
a block --- results in that block being truncated. As a result, the
first cat-blob test (9300.114) fails on Mac OS X, where the pipe
capacity is around 8 KiB.
Fix the test by using a block size of 1. Each read will block until
the next byte of input is available.
It would be even nicer to use head -c which expresses the intention
more clearly. Alas, IRIX "head" does not support the -c option.
Reported-by: Brian Gernhardt <brian@gernhardtsoftware.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We really shouldn't be using these funny /dev/* files that did not exist
in V7 UNIX in our tests when we do not have to.
Output from
$ git grep -n -e /dev/ --and --not -e /dev/null t/
tells us that, aside from use of /dev/urandom in apache.conf used in http
tests, "dd if=/dev/stdin" added recently to t/t9300-fast-import.sh are the
only offenders, and "dd" reads from the standard input by default, so
removing them should be straightforward.
Reported-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new rule: a "cat-blob" can be inserted wherever a comment is
allowed, which means at the start of any line except in the middle of
a "data" command.
This saves frontends from having to loop over everything they want to
commit in the next commit and cat-ing the necessary objects in
advance.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
New objects written by fast-import are not available immediately.
Until a checkpoint has been started and finishes writing the pack
index, any new blobs will not be accessible using standard git tools.
So introduce a new way to access them: a "cat-blob" command in the
command stream requests for fast-import to print a blob to stdout or a
file descriptor specified by the argument to --cat-blob-fd. The value
for cat-blob-fd cannot be specified in the stream because that would
be a layering violation: the decision of where to direct a stream has
to be made when fast-import is started anyway, so we might as well
make the stream format is independent of that detail.
Output uses the same format as "git cat-file --batch".
Thanks to Sverre Rabbelier and Sam Vilain for guidance in designing
the protocol.
Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Check the result from strtoul to avoid accepting arguments like
--depth=-1 and --active-branches=foo,bar,baz.
Requested-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jn/fast-import-fix:
fast-import: do not clear notes in do_change_note_fanout()
t9300 (fast-import): another test for the "replace root" feature
fast-import: tighten M 040000 syntax
fast-import: filemodify after M 040000 <tree> "" crashes
Breaks in a test assertion's && chain can potentially hide
failures from earlier commands in the chain.
Commands intended to fail should be marked with !, test_must_fail, or
test_might_fail. The examples in this patch do not require that.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Another test for the replace root feature. One can imagine an
implementation for which R "some/subdir" "" would free some state
associated to the subdir and leave fast-import confused.
Luckily, git's is not such an implementation.
While at it, change the previous test to use C "some/subdir" ""
instead of R (i.e., test both syntaxes).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When tree_content_set() is asked to modify the path "foo/bar/",
it first recurses like so:
tree_content_set(root, "foo/bar/", sha1, S_IFDIR) ->
tree_content_set(root:foo, "bar/", ...) ->
tree_content_set(root:foo/bar, "", ...)
And as a side-effect of 2794ad5 (fast-import: Allow filemodify to set
the root, 2010-10-10), this last call is accepted and changes
the tree entry for root:foo/bar to refer to the specified tree.
That seems safe enough but let's reject the new syntax (we never meant
to support it) and make it harder for frontends to introduce pointless
incompatibilities with git fast-import 1.7.3.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Until M 040000 <tree> "" syntax was introduced in commit 2794ad5
(fast-import: Allow filemodify to set the root, 2010-10-10), it
was impossible for the root entry to refer to an unloaded tree.
Update various functions to take that possibility into account.
Otherwise
M 040000 <tree> ""
M 100644 :1 "foo"
and similar commands (using D, C, or R after resetting the root
tree) segfault.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
v1.7.3-rc0~75^2 (Teach fast-import to import subtrees named by tree id,
2010-06-30) has a shortcoming - it doesn't allow the root to be set.
Extend this behaviour by allowing the root to be referenced as the
empty path, "".
For a command (like filter-branch --subdirectory-filter) that wants
to commit a lot of trees that already exist in the object db, writing
undeltified objects as loose files only to repack them later can
involve a significant amount of overhead.
(23% slow-down observed on Linux 2.6.35, worse on Mac OS X 10.6)
Fortunately we have fast-import (which is one of the only git commands
that will write to a pack directly) but there is not an advertised way
to tell fast-import to commit a given tree without unpacking it.
This patch changes that, by allowing
M 040000 <tree id> ""
as a filemodify line in a commit to reset to a particular tree without
any need to parse it. For example,
M 040000 4b825dc642 ""
is a synonym for the deleteall command and the fast-import equivalent of
git read-tree 4b825dc642
Signed-off-by: David Barr <david.barr@cordelta.com>
Commit-message-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Sverre Rabbelier <srabbelier@gmail.com>
Tested-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fixed all places where it was a straightforward change from cd'ing into a
directory and back via "cd .." to a cd inside a subshell.
Found these places with "git grep -w "cd \.\.".
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dump_marks_helper() has a bug when dumping marks larger than 2^20-1,
i.e., when the sparse array has more than two levels. The bug was
that the 'base' counter was being shifted by 20 bits at level 3, and
then again by 10 bits at level 2, rather than a total shift of 20 bits
in this argument to the recursive call:
(base + k) << m->shift
There are two ways to fix this correctly, the elegant:
(base + k) << 10
and the one I chose due to edit distance:
base + (k << m->shift)
Signed-off-by: Raja R Harinath <harinath@hurrynot.org>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To simulate the svn cp command, it would be very useful to be
replace an arbitrary file in the current revision by an
arbitrary directory from a previous one. Modify the filemodify
command to allow that:
M 040000 <tree id> pathname
This would be most useful in combination with a facility to
print the commit ids for new revisions as they are written.
Cc: Shawn O. Pearce <spearce@spearce.org>
Cc: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* sp/maint-fast-import-large-blob:
fast-import: Stream very large blobs directly to pack
bash: don't offer remote transport helpers as subcommands
Conflicts:
fast-import.c
If a blob is larger than the configured big-file-threshold, instead
of reading it into a single buffer obtained from malloc, stream it
onto the end of the current pack file. Streaming the larger objects
into the pack avoids the 4+ GiB memory footprint that occurs when
fast-import is processing 2+ GiB blobs.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'jh/notes' (early part):
Add more testcases to test fast-import of notes
Rename t9301 to t9350, to make room for more fast-import tests
fast-import: Proper notes tree manipulation
This patch teaches 'git fast-import' to automatically organize note objects
in a fast-import stream into an appropriate fanout structure. The notes API
in notes.h is NOT used to accomplish this, because trying to keep the
fast-import and notes data structures in sync would yield a significantly
larger patch with higher complexity.
Note objects are added with the 'N' command, and accounted for with a
per-branch counter, which is used to trigger fanout restructuring when
needed. Note that when restructuring the branch tree, _any_ entry whose
path consists of 40 hex chars (not including directory separators) will
be recognized as a note object. It is therefore not advisable to
manipulate note entries with M/D/R/C commands.
Since note objects are stored in the same tree structure as other objects,
the unloading and reloading of a fast-import branches handle note objects
transparently.
This patch has been improved by the following contributions:
- Shawn O. Pearce: Several style- and logic-related improvements
Cc: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After specifying 'feature relative-marks' the paths specified with
'feature import-marks' and 'feature export-marks' are relative to an
internal directory in the current repository.
In git-fast-import this means that the paths are relative to the
'.git/info/fast-import' directory. However, other importers may use a
different location.
Add 'feature non-relative-marks' to disable this behavior, this way
it is possible to, for example, specify the import-marks location as
relative, and the export-marks location as non-relative.
Also add tests to verify this behavior.
Cc: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --import-marks= option may be specified multiple times on the
commandline and should result in all marks being read in. Only one
import-marks feature may be specified in the stream, which is
overriden by any --import-marks= commandline options.
If one wishes to specify import-marks files in addition to the one
specified in the stream, it is easy to repeat the stream option as a
--import-marks= commandline option.
Also verify this behavior with tests.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test the quiet option and verify that the commandline options
override it.
Also make sure that an unknown option command is rejected and that
non-git options are ignored.
Lastly, show that unknown options are rejected when parsed on the
commandline.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows the fronted to require a specific feature to be supported
by the backend, or abort.
Also add support for four initial feature, date-format=, force=,
import-marks=, export-marks=.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>