174 Commits

Author SHA1 Message Date
Nicolas Pitre
abeb40e5aa improve reliability of fixup_pack_header_footer()
Currently, this function has the potential to read corrupted pack data
from disk and give it a valid SHA1 checksum.  Let's add the ability to
validate SHA1 checksum of existing data along the way, including before
and after any arbitrary point in the pack.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-29 21:51:27 -07:00
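The checksum idea behind abeb40e5aa can be illustrated with a small sketch. It relies on one known property of the pack format: a packfile ends with a 20-byte SHA-1 of everything that precedes it. This is Python rather than git's C, it checks only the whole-file trailer (not the partial checksums at arbitrary points that the commit mentions), and `pack_trailer_ok` is a hypothetical name; the bytes used are a toy payload, not a real pack.

```python
import hashlib

def pack_trailer_ok(pack_bytes: bytes) -> bool:
    """Return True if the last 20 bytes are the SHA-1 of everything before them."""
    if len(pack_bytes) < 20:
        return False
    body, trailer = pack_bytes[:-20], pack_bytes[-20:]
    return hashlib.sha1(body).digest() == trailer

# Toy payload standing in for pack data (not a real packfile).
body = b"PACK" + b"\x00" * 8 + b"example object data"
good = body + hashlib.sha1(body).digest()
corrupt = b"X" + good[1:]          # flip the first byte, keep the old trailer

print(pack_trailer_ok(good), pack_trailer_ok(corrupt))
```

Recomputing the trailer over corrupted data without first checking the old one is exactly how bad data could end up with a valid checksum.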
Alexander Gavrilov
03db4525d3 Support gitlinks in fast-import.
Currently fast-import/export cannot be used for
repositories with submodules. This patch extends
the relevant programs to make them correctly
process gitlinks.

Links can be represented by two forms of the
Modify command:

M 160000 SHA1 some/path

which sets the link target explicitly, or

M 160000 :mark some/path

where the mark refers to a commit. The latter
form can be used by importing tools to build
all submodules simultaneously in one physical
repository, and then simply fetch them apart.

Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-19 11:25:51 -07:00
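The two Modify forms described in 03db4525d3 could be produced by a frontend roughly as follows. This is a minimal sketch: `gitlink_modify` is a hypothetical helper, and it ignores path quoting, which a real frontend would need for unusual path names.

```python
def gitlink_modify(path: str, target: str) -> str:
    """Format a Modify command for a gitlink (mode 160000).

    `target` is either a 40-hex commit SHA-1 or a mark reference such as ':5'.
    """
    is_mark = target.startswith(":") and target[1:].isdigit()
    is_sha1 = len(target) == 40 and all(c in "0123456789abcdef" for c in target)
    if not (is_mark or is_sha1):
        raise ValueError(f"not a SHA-1 or mark: {target!r}")
    return f"M 160000 {target} {path}\n"

by_mark = gitlink_modify("some/path", ":3")
by_sha1 = gitlink_modify("some/path", "a" * 40)
```

The mark form is the one that lets an importer build all submodules in one repository before splitting them apart.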
Stephan Beyer
1b1dd23f2d Make usage strings dash-less
When you misuse a git command, you are shown the usage string.
But this is currently shown in the dashed form.  So if you just
copy what you see, it will not work, when the dashed form
is no longer supported.

This patch makes git commands show the dash-less version.

For shell scripts that do not specify OPTIONS_SPEC, git-sh-setup.sh
generates a dash-less usage string now.

Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-13 14:12:48 -07:00
Linus Torvalds
4c81b03e30 Make pack creation always fsync() the result
This means that we can depend on packs always being stable on disk,
simplifying a lot of the object serialization worries.  And unlike loose
objects, serializing pack creation IO isn't going to be a performance
killer.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-31 14:46:57 -07:00
Junio C Hamano
9bd81e4249 Merge branch 'js/config-cb'
* js/config-cb:
  Provide git_config with a callback-data parameter

Conflicts:

	builtin-add.c
	builtin-cat-file.c
2008-05-25 14:25:02 -07:00
Miklos Vajna
b30317819d git-fast-import: rename cmd_*() functions to parse_*()
There is a cmd_merge() function in fast-import that will conflict with
builtin-merge's cmd_merge() function. To keep it consistent, rename all
cmd_*() functions to parse_*().

Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-16 12:40:09 -07:00
Johannes Schindelin
ef90d6d420 Provide git_config with a callback-data parameter
git_config() only had a function parameter, but no callback data
parameter.  This assumes that all callback functions only modify
global variables.

With this patch, every callback gets a void * parameter, and it is hoped
that this will help the libification effort.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-14 12:34:44 -07:00
Eyvind Bernhardsen
198724ad4e fast-import: Allow "reset" to delete a new branch without error
Creating a branch in fast-import and then resetting it without making
any further commits to it currently causes an error message at the
end of the import.

This error is triggered by cvs2svn's git backend, which uses a
temporary fixup branch when it creates tags, because the fixup branch
is reset after each tag.

This patch prevents the error, allowing "reset" to be used to delete
temporary branches.

Signed-off-by: Eyvind Bernhardsen <eyvind-git@orakel.ntnu.no>
Acked-by: Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-16 14:24:32 -07:00
Junio C Hamano
ad416ed433 Merge branch 'maint' to sync with 1.5.4.4
* maint:
  GIT 1.5.4.4
  ident.c: reword error message when the user name cannot be determined
  Fix dcommit, rebase when rewriteRoot is in use
  Really make the LF after reset in fast-import optional
2008-03-08 20:07:57 -08:00
Adeodato Simó
655e8515f2 Really make the LF after reset in fast-import optional
cmd_from() ends with a call to read_next_command(), which is needed
when using cmd_from() from commands where from is not the last element.

With reset, however, "from" is the last command, after which the flow
returns to the main loop, which calls read_next_command() again.

Because of this, always set unread_command_buf in cmd_reset_branch(),
even if cmd_from() was successful.

Add a test case for this in t9300-fast-import.sh.

Signed-off-by: Adeodato Simó <dato@net.com.org.es>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-08 10:46:10 -08:00
Jean-Luc Herren
733ee2b7a9 fast-import: exit with proper message if not a git dir
git fast-import expects to be run from an existing (possibly
empty) repository.  It was dying with a suboptimal message if that
wasn't the case.

Signed-off-by: Jean-Luc Herren <jlh@gmx.ch>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
2008-03-02 16:07:41 -08:00
Shawn O. Pearce
118805b920 Finish current packfile during fast-import crash handler
If fast-import is in the middle of crashing due to a protocol error
or something like that then it can be very useful to have the mark
table and all objects up until that point be available for a new
import to resume from.

Currently we just close the active packfile, unkeep all of our
newly created packfiles (so they can be deleted), and dump the
marks table to a temporary file.

We don't attempt to update the refs/tags that the process has in
memory as much of that data can be found in the crash report and I'm
not sure it would be the right thing to do under every type of crash.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-16 00:47:07 -08:00
Shawn O. Pearce
3b08e5b8c9 Include the fast-import marks table in crash reports
If fast-import was not run with --export-marks but we are crashing
the frontend application developer may still benefit from having
that information available to them.  We now include the marks table
as part of the crash report if --export-marks was not supplied on
the command line.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-16 00:47:07 -08:00
Shawn O. Pearce
fbc63ea694 Include annotated tags in fast-import crash reports
If annotated tags were created they exist in a different namespace
within the fast-import process' internal memory tables so we did
not export them in the inactive branch table.  Now they are written
out after the branches, in the order that they were defined by the
frontend process.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-16 00:47:07 -08:00
Shawn O. Pearce
e8b32e0610 fast-import: check return value from unpack_entry()
If the tree object we have asked for is deltified in the packfile and
the delta did not apply correctly or was not able to be decompressed
from the packfile then we can get back NULL instead of the tree data.
This is (part of) the reason why read_sha1_file() can return NULL, so
we need to also handle it the same way.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-15 20:11:51 -08:00
Shawn O. Pearce
7422bac441 Document the hairy gfi_unpack_entry part of fast-import
Junio pointed out this part of fast-import wasn't very clear on
initial read, and it took some time for someone who was new to
fast-import's "dirty little tricks" to understand how this was
even working.  So a little bit of commentary in the proper place
may help future readers.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-21 01:04:12 -08:00
Shawn O. Pearce
bb23fdfa6c Teach fast-import to honor pack.compression and pack.depth
We now use the configured pack.compression and pack.depth values
within fast-import because, like builtin-pack-objects, fast-import is
generating a packfile for consumption by the Git tools.

We use the same behavior as builtin-pack-objects does for these
options, allowing core.compression to supply the default value
for pack.compression.

The default setting for pack.depth within fast-import is still 10
as users will generally repack fast-import generated packfiles by
`repack -f`.  A large delta depth within the fast-import packfile
can significantly slow down such a later repack.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-21 01:04:10 -08:00
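The fallback order described in bb23fdfa6c can be sketched as a small lookup. The dictionary-based `config` and the function name are illustrative only; the default depth of 10 comes from the message, while using -1 to mean "zlib's default level" is an assumption of this sketch.

```python
ZLIB_DEFAULT = -1  # assumed sentinel for "let zlib pick its default level"

def effective_pack_settings(config: dict) -> dict:
    """Resolve pack settings the way the commit message describes."""
    # pack.compression wins; core.compression only supplies its default.
    compression = config.get(
        "pack.compression", config.get("core.compression", ZLIB_DEFAULT)
    )
    # fast-import keeps a shallow default depth of 10.
    depth = config.get("pack.depth", 10)
    return {"compression": compression, "depth": depth}

only_core = effective_pack_settings({"core.compression": 3})
both = effective_pack_settings({"core.compression": 3, "pack.compression": 9})
```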
Jim Meyering
5a7b1b571e fast-import: Don't use a maybe-clobbered errno value
Without this change, each diagnostic could use an errno value
clobbered by the close or unlink in rollback_lock_file.

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-18 13:19:37 -08:00
Shawn O. Pearce
c9ced051c3 Fix random fast-import errors when compiled with NO_MMAP
fast-import was relying on the fact that on most systems mmap() and
write() are synchronized by the filesystem's buffer cache.  We were
relying on the ability to mmap() 20 bytes beyond the current end
of the file, then later fill in those bytes with a future write()
call, then read them through the previously obtained mmap() address.

This isn't always true with some implementations of NFS, but it is
especially not true with our NO_MMAP=YesPlease build time option used
on some platforms.  If fast-import was built with NO_MMAP=YesPlease
we used the malloc()+pread() emulation and the subsequent write()
call does not update the trailing 20 bytes of a previously obtained
"mmap()" (aka malloc'd) address.

Under NO_MMAP that behavior causes unpack_entry() in sha1_file.c to
be unable to read an object header (or data) that has been unlucky
enough to be written to the packfile at a location such that it
is in the trailing 20 bytes of a window previously opened on that
same packfile.

This bug has gone unnoticed for a very long time as it is highly data
dependent.  Not only does the object have to be placed at the right
position, but it also needs to be positioned behind some other object
that has been accessed due to a branch cache invalidation.  In other
words the stars had to align just right, and if you did run into
this bug you probably should also have purchased a lottery ticket.

Fortunately the workaround is a lot easier than the bug explanation.

Before we allow unpack_entry() to read data from a pack window
that has also (possibly) been modified through write() we force
all existing windows on that packfile to be closed.  By closing
the windows we ensure that any new access via the emulated mmap()
will reread the packfile, updating to the current file content.

This comes at a slight performance degradation as we cannot reuse
previously cached windows when we update the packfile.  But it
is a fairly minor difference as the window closes happen at only
two points:

 - When the packfile is finalized and its .idx is generated:

   At this stage we are getting ready to update the refs and any
   data access into the packfile is going to be random, and is
   going after only the branch tips (to ensure they are valid).
   Our existing windows (if any) are not likely to be positioned
   at useful locations to access those final tip commits so we
   probably were closing them before anyway.

 - When the branch cache missed and we need to reload:

   At this point fast-import is getting change commands for the next
   commit and it needs to go re-read a tree object it previously
   had written out to the packfile.  What windows we had (if any)
   are not likely to cover the tree in question so we probably were
   closing them before anyway.

We do try to avoid unnecessarily closing windows in the second case
by checking to see if the packfile size has increased since the
last time we called unpack_entry() on that packfile.  If the size
has not changed then we have not written additional data, and any
existing window is still valid.  This nicely handles the cases where
fast-import is going through a branch cache reload and needs to read
many trees at once.  During such an event we are not likely to be
updating the packfile so we do not cycle the windows between reads.

With this change in place t9301-fast-export.sh (which was broken
by c3b0dec509) finally works again.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-17 22:39:20 -08:00
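The size check described in c9ced051c3 can be illustrated with a toy model. The sketch below is Python, not the pack-window code in git: `SnapshotReader` is a hypothetical class that keeps a private copy of a file (standing in for the malloc()+pread() emulation) and discards it only when the file size has changed since the last read.

```python
import os
import tempfile

class SnapshotReader:
    """Keeps a private copy of a file, like an emulated mmap() window."""

    def __init__(self, path):
        self.path = path
        self.window = None       # cached copy of the file contents
        self.window_size = -1    # file size when the copy was taken
        self.reloads = 0

    def read_at(self, offset, length):
        size = os.path.getsize(self.path)
        if self.window is None or size != self.window_size:
            # The file changed size since the last read: drop the stale copy.
            with open(self.path, "rb") as f:
                self.window = f.read()
            self.window_size = size
            self.reloads += 1
        return self.window[offset:offset + length]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.pack")
    with open(path, "wb") as f:
        f.write(b"abc")
    reader = SnapshotReader(path)
    first = reader.read_at(0, 3)
    again = reader.read_at(1, 2)      # same size: the cached copy is reused
    with open(path, "ab") as f:
        f.write(b"def")               # the "packfile" grows
    tail = reader.read_at(3, 3)       # size changed: the copy is refreshed
    reloads = reader.reloads
```

Without the size check, the last read would return stale (empty) data from the old copy, which is the failure mode the commit describes.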
Brandon Casey
fb54abd604 fast-import.c: don't try to commit marks file if write failed
We also move the assignment of -1 to the lock file descriptor
up, so that rollback_lock_file() can be called safely after a
possible attempt to fclose(). This matches the contents of
the 'if' statement just above testing success of fdopen().

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-17 22:11:42 -08:00
Brandon Casey
4ed7cd3ab0 Improve use of lockfile API
Remove remaining double close(2)'s.  i.e. close() before
commit_locked_index() or commit_lock_file().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-16 15:35:35 -08:00
Jim Meyering
95693d45ee bundle, fast-import: detect write failure
I noticed some unchecked writes.  This fixes them.

* bundle.c (create_bundle): Die upon write failure.
* fast-import.c (keep_pack): Die upon write or close failure.

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-10 01:08:11 -08:00
Junio C Hamano
257f3020f6 Update callers of check_ref_format()
This updates send-pack and fast-import to use symbolic constants
for checking the return values from check_ref_format(), and also
futureproof the logic in lock_any_ref_for_update() to explicitly
name the case that is usually considered an error but is Ok for
this particular use.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-02 11:20:09 -08:00
David S. Miller
69ae517541 fast-import: fix unaligned allocation and access
The specialized pool allocator fast-import uses aligned objects to the
size of a pointer, which was not sufficient at least on Sparc.  Instead,
make the alignment that of objects of type uintmax_t.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-14 20:39:16 -08:00
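The effect of the alignment change in 69ae517541 can be seen with plain arithmetic. The sketch assumes a 4-byte pointer and an 8-byte uintmax_t, which is typical of a 32-bit target but is an assumption here, and `align_up` is a hypothetical helper rather than the allocator's actual code.

```python
def align_up(n: int, alignment: int) -> int:
    """Round n up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (n + alignment - 1) & ~(alignment - 1)

POINTER_SIZE = 4    # assumed for illustration
UINTMAX_SIZE = 8    # assumed for illustration

# After a 20-byte allocation, where does the next object start?
next_with_pointer_alignment = align_up(20, POINTER_SIZE)
next_with_uintmax_alignment = align_up(20, UINTMAX_SIZE)
```

With pointer alignment the next object starts at offset 20, which is not a multiple of 8, so an 8-byte member placed there would be misaligned.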
Junio C Hamano
fb5fd01148 Merge branch 'maint'
* maint:
  git-clean: honor core.excludesfile
  Documentation: Fix man page breakage with DocBook XSL v1.72
  git-remote.txt: fix typo
  core-tutorial.txt: Fix argument mistake in an example.
  replace reference to git-rm with git-reset in git-commit doc
  Grammar fixes for gitattributes documentation
  Don't allow fast-import tree delta chains to exceed maximum depth
  revert/cherry-pick: allow starting from dirty work tree.
  t/t3404: fix test for a bogus todo file.

Conflicts:

	fast-import.c
2007-11-14 03:37:18 -08:00
Shawn O. Pearce
436e7a74c6 Don't allow fast-import tree delta chains to exceed maximum depth
Brian Downing noticed fast-import can produce tree depths of up
to 6,035 objects and even deeper.  Long delta chains can create
very small packfiles but cause problems during repacking as git
needs to unpack each tree to count the reachable blobs.

What's happening here is the active branch cache isn't big enough.
We're swapping out the branch and thus recycling the tree information
(struct tree_content) back into the free pool.  When we later reload
the tree we set the delta_depth to 0 but we kept the tree we just
reloaded as a delta base.

So if the tree we reloaded was already at the maximum depth we
wouldn't know it and make the new tree a delta.  Multiply the
number of times the branch cache has to swap out the tree times
max_depth (10) and you get the maximum delta depth of a tree created
by fast-import.  In Brian's case above the active branch cache had
to swap the branch out 603/604 times during this import to produce
a tree with a delta depth of 6035.

Acked-by: Brian Downing <bdowning@lavos.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-13 21:57:53 -08:00
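The arithmetic in 436e7a74c6 is easy to check. The sketch below only restates the multiplication from the message, with max_depth taken as 10; the function name is hypothetical.

```python
MAX_DEPTH = 10  # fast-import's delta depth limit, per the message above

def worst_case_depth(swaps: int, max_depth: int = MAX_DEPTH) -> int:
    """Upper bound on chain depth if every reload resets the counter to 0."""
    return swaps * max_depth

low = worst_case_depth(603)
high = worst_case_depth(604)
print(low, high)
```

The observed depth of 6035 falls between the two bounds, which is why the message says the branch was swapped out 603/604 times.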
Pierre Habouzit
c2e6b6d0d1 fast-import.c: fix regression due to strbuf conversion
Without this strbuf_detach(), a double free occurs later.  The command
is in fact stashed, so detaching it is not a memory leak.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-10-26 15:28:09 -07:00
Shawn O. Pearce
8a37e21dab Merge branch 'maint'
* maint:
  Describe more 1.5.3.5 fixes in release notes
  Fix diffcore-break total breakage
  Fix directory scanner to correctly ignore files without d_type
  Improve receive-pack error message about funny ref creation
  fast-import: Fix argument order to die in file_change_m
  git-gui: Don't display CR within console windows
  git-gui: Handle progress bars from newer gits
  git-gui: Correctly report failures from git-write-tree
  gitk.txt: Fix markup.
  send-pack: respect '+' on wildcard refspecs
  git-gui: accept versions containing text annotations, like 1.5.3.mingw.1
  git-gui: Don't crash when starting gitk from a browser session
  git-gui: Allow gitk to be started on Cygwin with native Tcl/Tk
  git-gui: Ensure .git/info/exclude is honored in Cygwin workdirs
  git-gui: Handle starting on mapped shares under Cygwin
  git-gui: Display message box when we cannot find git in $PATH
  git-gui: Avoid using bold text in entire gui for some fonts
2007-10-21 02:11:45 -04:00
Julian Phillips
2005dbe2a4 fast-import: Fix argument order to die in file_change_m
The arguments to the "Not a blob" die call in file_change_m were
transposed, so that the command was printed as the type, and the type
as the command.  Switch them around so that the error message comes
out correctly.

Signed-off-by: Julian Phillips <julian@quantumfyre.co.uk>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-10-20 21:43:35 -04:00
Pierre Habouzit
b315c5c081 strbuf change: be sure ->buf is never ever NULL.
For that purpose, the ->buf is always initialized with a char * buf living
in the strbuf module. It is made a char * so that we can sloppily accept
things that perform: sb->buf[0] = '\0', and because you can't pass "" as an
initializer for ->buf without making gcc unhappy for very good reasons.

strbuf_init/_detach/_grow have been fixed to trust ->alloc and not ->buf
anymore.

As a consequence, strbuf_detach is _mandatory_ to detach a buffer; copying
->buf isn't an option anymore if ->buf is going to escape from the scope
and eventually be free'd.

API changes:
  * strbuf_setlen now always works, so just make strbuf_reset a convenience
    macro.
  * strbuf_detach takes a size_t* optional argument (meaning it can be
    NULL) to copy the buffer's len, as it was needed for this refactor to
    make the code more readable, and working like the callers.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-29 02:13:33 -07:00
Pierre Habouzit
7fb1011e61 Rework unquote_c_style to work on a strbuf.
If the gain is not obvious in the diffstat, the resulting code is more
readable, _and_ in checkout-index/update-index we now reuse the same buffer
to unquote strings instead of always freeing/mallocing.

This also is more coherent with the next patch that reworks quoting
functions.

The quoting function is also made more efficient scanning for backslashes
and treating portions of strings without a backslash at once.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
2007-09-20 23:32:18 -07:00
Pierre Habouzit
c76689df6c strbuf API additions and enhancements.
Add strbuf_remove, change strbuf_insert:
  As both are special cases of strbuf_splice, implement them as such.
  gcc is able to do the math and generate almost optimal code this way.

Add strbuf_swap:
  Exchange the values of its arguments.
  Use it in fast-import.c

Also fix spacing issues in strbuf.h

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
2007-09-20 23:17:40 -07:00
Pierre Habouzit
182af8343c Use xmemdupz() in many places.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-18 17:42:17 -07:00
Pierre Habouzit
0557656930 fast-import optimization:
Now that cmd_data acts on a strbuf, make last_object stashed buffer be a
strbuf as well. On new stash, don't free the last stashed buffer, rather
swap it with the one you will stash, this way, callers of store_object can
act on static strbufs, and at some point, fast-import won't allocate new
memory for objects buffers.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-18 00:55:25 -07:00
Pierre Habouzit
eec813cfc6 fast-import was using dbuf's, replace them with strbuf's.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-18 00:55:15 -07:00
Pierre Habouzit
e6c019d0b0 Drop strbuf's 'eof' marker, and make read_line a first class citizen.
read_line is now strbuf_getline, and is a first class citizen; it returns 0
when reading a line worked, EOF otherwise.

The ->eof marker was used non-locally by fast-import.c; mimic the same
behaviour using a static int in "read_next_command", which now returns -1 on
EOF and avoids calling strbuf_getline when it's in EOF state.

Also no longer automagically strbuf_release the buffer; it's
counterintuitive and breaks fast-import in a very subtle way.

Note: being at EOF implies that command_buf.len == 0.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-18 00:55:10 -07:00
Pierre Habouzit
ba3ed09728 Now that cache.h needs strbuf.h, remove useless includes.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-16 17:30:03 -07:00
Pierre Habouzit
f1696ee398 Strbuf API extensions and fixes.
  * Add strbuf_rtrim to remove trailing spaces.
  * Add strbuf_insert to insert data at a given position.
  * Off-by-one fix in strbuf_addf: strbuf_avail() does not count the final
    \0 so the overflow test for snprintf is the strict comparison. This is
    not critical as the growth mechanism chosen will always allocate _more_
    memory than asked, so the second test will not fail. It's some kind of
    miracle though.
  * Add size extension hints for strbuf_init and strbuf_read. If 0, default
    applies, else:
      + initial buffer has the given size for strbuf_init.
      + first growth checks it has at least this size rather than the
        default 8192.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-10 12:48:24 -07:00
Pierre Habouzit
4a241d79c9 fast-import: Use strbuf API, and simplify cmd_data()
This patch features the use of strbuf_detach, and prevents the programmer
from messing with allocation directly. The code is as efficient as before, just
more concise and more straightforward.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-06 23:57:44 -07:00
Pierre Habouzit
b449f4cfc9 Rework strbuf API and semantics.
The gory details are explained in strbuf.h. The change of semantics this
patch enforces is that the embedded buffer always has a '\0' character after
its last byte, to always make it a C-string. The off-by-one changes are all
related to that very change.

  A strbuf can be used to store byte arrays, or as an extended string
library. The `buf' member can be passed to any C legacy string function,
because strbuf operations always ensure there is a terminating \0 at the end
of the buffer, not accounted in the `len' field of the structure.

  A strbuf can be used to generate a string/buffer whose final size is not
really known, and then "strbuf_detach" can be used to get the built buffer,
and keep the wrapping "strbuf" structure usable for further work again.

  Other interesting feature: strbuf_grow(sb, size) ensures that there is
enough allocated space in `sb' to put `size' new octets of data in the
buffer. It helps avoid reallocating data for nothing when the problem the
strbuf helps to solve has a known typical size.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-06 23:57:44 -07:00
Alex Riesen
4bf53833db Avoid using va_copy in fast-import: it seems to be unportable.
[sp: minor change to use fputs, thus reducing the patch size]

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-20 21:57:50 -07:00
Junio C Hamano
7e5dcea831 fast-import pull request
* skip_optional_lf() decl is old-style -- please say

	static void skip_optional_lf(void)
	{
		...
	}

* t9300 #14 fails, like this:

* expecting failure: git-fast-import <input
fatal: Branch name doesn't conform to GIT standards: .badbranchname
fast-import: dumping crash report to .git/fast_import_crash_14354
./test-lib.sh: line 143: 14354 Segmentation fault      git-fast-import <input

-- >8 --
Subject: [PATCH] fastimport: Fix re-use of va_list

The va_list is designed to be used only once. The current code
reuses its va_list argument, which may cause a segmentation fault.  Copy and
release the arguments to avoid this problem.

While we are at it, fix old-style function declaration of
skip_optional_lf().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 13:11:01 -04:00
Shawn O. Pearce
904b194151 Include recent command history in fast-import crash reports
When we crash the frontend developer (or end-user) may need to know
roughly around what part of the input stream we had a problem with
and aborted on.  Because line numbers aren't very useful in this
sort of application we instead just keep the last 100 commands in
a FIFO queue and print them as part of the crash report.

Currently one problem with this design is a commit that has
more than 100 modified files in it will flood the FIFO and any
context regarding branch/from/committer/mark/comments will be lost.
We really should save only the last few (10?) file changes for the
current commit, ensuring we have some prior higher level commands
in the FIFO when we crash on a file M/D/C/R command.

Another issue with this approach is the FIFO only includes the
commands, it does not include the commit messages.  Yet having a
commit message may be useful to help locate the relevant change in
the source material.  In practice I don't think this is going to be a
major concern as the frontend can always embed its own source change
set identifier as a comment (which will appear in the crash report)
and the commit message(s) for the most recent commits of any given
branch should be obtainable from the (packed) commit objects.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 03:42:41 -04:00
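The bounded history described in 904b194151, and the flooding problem the message warns about, can be sketched with a fixed-size queue. This is an illustrative Python model, not fast-import's C implementation; `CommandHistory` is a hypothetical name and the command strings are made up.

```python
from collections import deque

class CommandHistory:
    """Remembers only the most recent `limit` commands."""

    def __init__(self, limit: int = 100):
        self.recent = deque(maxlen=limit)  # oldest entries fall off the front

    def record(self, command: str) -> None:
        self.recent.append(command)

    def dump(self) -> list:
        return list(self.recent)

history = CommandHistory()
history.record("commit refs/heads/master")
for n in range(150):                       # one commit touching 150 files
    history.record(f"M 100644 :{n + 1} file_{n}")

report = history.dump()
```

After 150 file changes the `commit` line has already been pushed out, so the report holds only M commands, which is the loss of context the second paragraph of the message describes.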
Shawn O. Pearce
8acb3297f3 Generate crash reports on die in fast-import
As fast-import is quite strict about its input and die()'s anytime
something goes wrong it can be difficult for a frontend developer
to troubleshoot why fast-import rejected their input, or to even
determine what input command it rejected.

This change introduces a custom handler for Git's die() routine.
When we receive a die() for any reason (fast-import or a lower level
core Git routine we called) the error is first dumped onto stderr
and then a more extensive crash report file is prepared in GIT_DIR.
Finally we exit the process with status 128, just like the stock
builtin die handler.

An internal flag is set to prevent any further die()'s that may be
invoked during the crash report generator from causing us to enter
into an infinite loop.  We shouldn't die() from our crash report
handler, but just in case someone makes a future code change we are
prepared to guard against small mistakes turning into huge problems
for the end-user.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 03:42:41 -04:00
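The re-entrancy guard described in 8acb3297f3 can be modelled in a few lines. This is a sketch in Python with hypothetical names (`Importer`, `report_writer`); it mirrors the order in the message (error to stderr, then the report, then exit status 128) but is not the actual die handler.

```python
import sys

class Importer:
    def __init__(self, report_writer):
        self.in_crash_handler = False      # set once a crash report has started
        self.report_writer = report_writer

    def die(self, message):
        print(f"fatal: {message}", file=sys.stderr)
        if not self.in_crash_handler:
            self.in_crash_handler = True
            self.report_writer(self, message)
        raise SystemExit(128)

calls = []

def faulty_writer(importer, message):
    calls.append(message)
    importer.die("report failed")  # would recurse forever without the flag

try:
    Importer(faulty_writer).die("bad input")
except SystemExit as exc:
    exit_code = exc.code
```

Even though the report writer itself dies, the writer runs only once and the process still exits with status 128.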
Shawn O. Pearce
ac053c0202 Allow frontends to bidirectionally communicate with fast-import
The existing checkpoint command is very useful to force fast-import
to dump the branches out to disk so that standard Git tools can
access them and the objects they refer to.  However there was not a
way to know when fast-import had finished executing the checkpoint
and it was safe to read those refs.

The progress command can be used to make fast-import output any
message of the frontend's choosing to standard out.  The frontend
can scan for these messages using select() or poll() to monitor a
pipe connected to the standard output of fast-import.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 03:38:36 -04:00
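A frontend could use the checkpoint and progress commands from ac053c0202 roughly as sketched below. The two helpers are hypothetical, and in-memory buffers stand in for the pipes to and from fast-import so that the example runs without git. It assumes that fast-import echoes the progress line unchanged on standard output. A real frontend might use select() or poll() instead of the blocking loop shown.

```python
import io

def request_checkpoint(to_importer, token: str) -> None:
    """Ask for a checkpoint, then a progress line we can recognise later."""
    to_importer.write(b"checkpoint\n")
    to_importer.write(b"progress " + token.encode() + b"\n")
    to_importer.flush()

def wait_for(from_importer, token: str) -> bool:
    """Read output lines until the matching progress line appears."""
    expected = b"progress " + token.encode()
    for line in from_importer:
        if line.rstrip(b"\n") == expected:
            return True
    return False

# Stand-ins for the pipes connected to a fast-import process.
stdin_pipe = io.BytesIO()
request_checkpoint(stdin_pipe, "cp-1")
sent = stdin_pipe.getvalue()

stdout_pipe = io.BytesIO(b"progress starting\nprogress cp-1\n")
seen = wait_for(stdout_pipe, "cp-1")
```

Because commands are processed in order, seeing the progress line means the preceding checkpoint has finished and the refs are safe to read.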
Shawn O. Pearce
1fdb649c6a Make trailing LF optional for all fast-import commands
For the same reasons as the prior change we want to allow frontends
to omit the trailing LF that usually delimits commands.  In some
cases these just make the input stream more verbose looking than
it needs to be, and it's just simpler for the frontend developer to
get started if our parser is slightly more lenient about where an
LF is required and where it isn't.

To make this optional LF feature work we now have to buffer up to one
line of input in command_buf.  This buffering can happen if we look
at the current input command but don't recognize it at this point
in the code.  In such a case we need to "unget" the entire line,
but we cannot depend upon the stdio library to let us do ungetc()
for that many characters at once.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 03:38:35 -04:00
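The one-line "unget" buffer described in 1fdb649c6a can be sketched as follows. `CommandReader` and its methods are hypothetical Python stand-ins for the command_buf logic, operating on text lines for simplicity.

```python
import io

class CommandReader:
    def __init__(self, stream):
        self.stream = stream
        self.pending = None  # at most one line that was read but not consumed

    def next_command(self):
        if self.pending is not None:
            line, self.pending = self.pending, None
            return line
        line = self.stream.readline()
        return line.rstrip("\n") if line else None   # None signals EOF

    def unread(self, line):
        if self.pending is not None:
            raise RuntimeError("only one line can be buffered")
        self.pending = line

    def skip_optional_lf(self):
        """Consume a blank line if present; otherwise put the line back."""
        line = self.next_command()
        if line is not None and line != "":
            self.unread(line)

reader = CommandReader(io.StringIO("reset refs/heads/x\n\ncheckpoint\nprogress a\n"))
seen = [reader.next_command()]
reader.skip_optional_lf()            # blank line present: consumed
seen.append(reader.next_command())
reader.skip_optional_lf()            # no blank line: 'progress a' is put back
seen.append(reader.next_command())
seen.append(reader.next_command())   # end of input
```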
Shawn O. Pearce
2c570cde98 Make trailing LF following fast-import data commands optional
A few fast-import frontend developers have found it odd that we
require the LF following a `data` command, especially in the exact
byte count format.  Technically we don't need this LF to parse
the stream properly, but having it here does make the stream more
readable to humans.  We can easily make the LF optional by peeking
at the next byte available from the stream and pushing it back into
the buffer if it's not LF.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 03:38:35 -04:00
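The peek-and-push-back approach from 2c570cde98 can be approximated in Python with a buffered reader, whose `peek()` looks at upcoming bytes without consuming them. `read_data` is a hypothetical helper that handles only the exact byte count form of the `data` command.

```python
import io

def read_data(stream: io.BufferedReader) -> bytes:
    """Read one 'data <count>' block and swallow a trailing LF if present."""
    header = stream.readline()                 # e.g. b"data 5\n"
    if not header.startswith(b"data "):
        raise ValueError(f"expected a data command, got {header!r}")
    length = int(header[5:].strip())
    payload = stream.read(length)
    if len(payload) != length:
        raise EOFError("stream ended inside data payload")
    # Peek at the next byte; consume it only when it is the optional LF.
    if stream.peek(1)[:1] == b"\n":
        stream.read(1)
    return payload

raw = b"data 5\nhello\ndata 3\nabccheckpoint\n"
stream = io.BufferedReader(io.BytesIO(raw))
first = read_data(stream)      # followed by the optional LF
second = read_data(stream)     # followed directly by the next command
following = stream.readline()
```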
Shawn O. Pearce
401d53fa35 Teach fast-import to ignore lines starting with '#'
Several frontend developers have asked that some form of stream
comments be permitted within a fast-import data stream.  This way
they can include information from their own frontend program about
where specific data was taken from in the source system, or about
a decision that their frontend may have made while creating the
fast-import data stream.

This change introduces comments in the Bourne-shell/Tcl/Perl style.
Lines starting with '#' are ignored, up to and including the LF.
Unlike the above mentioned three languages however we do not look for
and ignore leading whitespace.  This just simplifies the definition
of the comment format and the code that parses them.

To make comments work we had to stop using read_next_command() within
cmd_data() and directly invoke read_line() during the inline variant
of the function.  This is necessary to retain any lines of the
input data that might otherwise look like a comment to fast-import.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 03:38:35 -04:00
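The comment rule from 401d53fa35 is simple enough to state as a filter. The sketch below is a hypothetical Python helper for command lines only; as the message notes, inline data payloads must bypass such a filter so that payload lines starting with '#' are kept.

```python
def strip_comments(lines):
    """Yield command lines, skipping those whose first character is '#'."""
    for line in lines:
        if line.startswith("#"):
            continue             # comment: ignored entirely
        yield line

stream = [
    "# exported from revision 42",
    "commit refs/heads/master",
    "  # leading whitespace, so not a comment",
    "mark :1",
]
commands = list(strip_comments(stream))
```

The indented line survives because leading whitespace is not skipped before testing for '#'.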
Shawn O. Pearce
3149007475 Use handy ALLOC_GROW macro in fast-import when possible
Instead of growing our buffer by hand during the inline variant of
cmd_data() we can save a few lines of code and just use the nifty
new ALLOC_GROW macro already available to us.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 03:38:34 -04:00
Shawn O. Pearce
ea08a6fd19 Actually allow TAG_FIXUP branches in fast-import
Michael Haggerty <mhagger@alum.mit.edu> noticed while debugging a
Git backend for cvs2svn that fast-import was barfing when he tried
to use "TAG_FIXUP" as a branch name for temporary work needed to
clean up the tree prior to creating an annotated tag object.

The reason we were rejecting the branch name was that check_ref_format()
returns -2 when there are fewer than 2 '/' characters in the input
name.  TAG_FIXUP has 0 '/' characters, but is technically just as
valid of a ref as HEAD and MERGE_HEAD, so we really should permit it
(and any other similar looking name) during import.

New test cases have been added to make sure we still detect very
wrong branch names (e.g. containing [ or starting with .) and yet
still permit reasonable names (e.g. TAG_FIXUP).

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 03:38:34 -04:00