Commit Graph

141 Commits

Author SHA1 Message Date
Pierre Habouzit
0557656930 fast-import optimization:
Now that cmd_data acts on a strbuf, make last_object stashed buffer be a
strbuf as well. On new stash, don't free the last stashed buffer, rather
swap it with the one you will stash, this way, callers of store_object can
act on static strbufs, and at some point, fast-import won't allocate new
memory for objects buffers.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-18 00:55:25 -07:00
Pierre Habouzit
eec813cfc6 fast-import was using dbuf's, replace them with strbuf's.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-18 00:55:15 -07:00
Pierre Habouzit
e6c019d0b0 Drop strbuf's 'eof' marker, and make read_line a first class citizen.
read_line is now strbuf_getline, and is a first class citizen, it returns 0
when reading a line worked, EOF else.

The ->eof marker was used non-locally by fast-import.c, mimic the same
behaviour using a static int in "read_next_command", that now returns -1 on
EOF, and avoids to call strbuf_getline when it's in EOF state.

Also no longer automagically strbuf_release the buffer, it's counter
intuitive and breaks fast-import in a very subtle way.

Note: being at EOF implies that command_buf.len == 0.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-18 00:55:10 -07:00
Pierre Habouzit
ba3ed09728 Now that cache.h needs strbuf.h, remove useless includes.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-16 17:30:03 -07:00
Pierre Habouzit
f1696ee398 Strbuf API extensions and fixes.
* Add strbuf_rtrim to remove trailing spaces.
  * Add strbuf_insert to insert data at a given position.
  * Off-by one fix in strbuf_addf: strbuf_avail() does not counts the final
    \0 so the overflow test for snprintf is the strict comparison. This is
    not critical as the growth mechanism chosen will always allocate _more_
    memory than asked, so the second test will not fail. It's some kind of
    miracle though.
  * Add size extension hints for strbuf_init and strbuf_read. If 0, default
    applies, else:
      + initial buffer has the given size for strbuf_init.
      + first growth checks it has at least this size rather than the
        default 8192.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-10 12:48:24 -07:00
Pierre Habouzit
4a241d79c9 fast-import: Use strbuf API, and simplify cmd_data()
This patch features the use of strbuf_detach, and prevent the programmer
to mess with allocation directly. The code is as efficent as before, just
more concise and more straightforward.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-06 23:57:44 -07:00
Pierre Habouzit
b449f4cfc9 Rework strbuf API and semantics.
The gory details are explained in strbuf.h. The change of semantics this
patch enforces is that the embeded buffer has always a '\0' character after
its last byte, to always make it a C-string. The offs-by-one changes are all
related to that very change.

  A strbuf can be used to store byte arrays, or as an extended string
library. The `buf' member can be passed to any C legacy string function,
because strbuf operations always ensure there is a terminating \0 at the end
of the buffer, not accounted in the `len' field of the structure.

  A strbuf can be used to generate a string/buffer whose final size is not
really known, and then "strbuf_detach" can be used to get the built buffer,
and keep the wrapping "strbuf" structure usable for further work again.

  Other interesting feature: strbuf_grow(sb, size) ensure that there is
enough allocated space in `sb' to put `size' new octets of data in the
buffer. It helps avoiding reallocating data for nothing when the problem the
strbuf helps to solve has a known typical size.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-06 23:57:44 -07:00
Alex Riesen
4bf53833db Avoid using va_copy in fast-import: it seems to be unportable.
[sp: minor change to use fputs, thus reducing the patch size]

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-20 21:57:50 -07:00
Junio C Hamano
7e5dcea831 fast-import pull request
* skip_optional_lf() decl is old-style -- please say

	static skip_optional_lf(void)
        {
        	...
	}

* t9300 #14 fails, like this:

* expecting failure: git-fast-import <input
fatal: Branch name doesn't conform to GIT standards: .badbranchname
fast-import: dumping crash report to .git/fast_import_crash_14354
./test-lib.sh: line 143: 14354 Segmentation fault      git-fast-import <input

-- >8 --
Subject: [PATCH] fastimport: Fix re-use of va_list

The va_list is designed to be used only once. The current code
reuses va_list argument may cause segmentation fault.  Copy and
release the arguments to avoid this problem.

While we are at it, fix old-style function declaration of
skip_optional_lf().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 13:11:01 -04:00
Shawn O. Pearce
904b194151 Include recent command history in fast-import crash reports
When we crash the frontend developer (or end-user) may need to know
roughly around what part of the input stream we had a problem with
and aborted on.  Because line numbers aren't very useful in this
sort of application we instead just keep the last 100 commands in
a FIFO queue and print them as part of the crash report.

Currently one problem with this design is a commit that has
more than 100 modified files in it will flood the FIFO and any
context regarding branch/from/committer/mark/comments will be lost.
We really should save only the last few (10?) file changes for the
current commit, ensuring we have some prior higher level commands
in the FIFO when we crash on a file M/D/C/R command.

Another issue with this approach is the FIFO only includes the
commands, it does not include the commit messages.  Yet having a
commit message may be useful to help locate the relevant change in
the source material.  In practice I don't think this is going to be a
major concern as the frontend can always embed its own source change
set identifier as a comment (which will appear in the crash report)
and the commit message(s) for the most recent commits of any given
branch should be obtainable from the (packed) commit objects.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 03:42:41 -04:00
Shawn O. Pearce
8acb3297f3 Generate crash reports on die in fast-import
As fast-import is quite strict about its input and die()'s anytime
something goes wrong it can be difficult for a frontend developer
to troubleshoot why fast-import rejected their input, or to even
determine what input command it rejected.

This change introduces a custom handler for Git's die() routine.
When we receive a die() for any reason (fast-import or a lower level
core Git routine we called) the error is first dumped onto stderr
and then a more extensive crash report file is prepared in GIT_DIR.
Finally we exit the process with status 128, just like the stock
builtin die handler.

An internal flag is set to prevent any further die()'s that may be
invoked during the crash report generator from causing us to enter
into an infinite loop.  We shouldn't die() from our crash report
handler, but just in case someone makes a future code change we are
prepared to gaurd against small mistakes turning into huge problems
for the end-user.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 03:42:41 -04:00
Shawn O. Pearce
ac053c0202 Allow frontends to bidirectionally communicate with fast-import
The existing checkpoint command is very useful to force fast-import
to dump the branches out to disk so that standard Git tools can
access them and the objects they refer to.  However there was not a
way to know when fast-import had finished executing the checkpoint
and it was safe to read those refs.

The progress command can be used to make fast-import output any
message of the frontend's choosing to standard out.  The frontend
can scan for these messages using select() or poll() to monitor a
pipe connected to the standard output of fast-import.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 03:38:36 -04:00
Shawn O. Pearce
1fdb649c6a Make trailing LF optional for all fast-import commands
For the same reasons as the prior change we want to allow frontends
to omit the trailing LF that usually delimits commands.  In some
cases these just make the input stream more verbose looking than
it needs to be, and its just simpler for the frontend developer to
get started if our parser is slightly more lenient about where an
LF is required and where it isn't.

To make this optional LF feature work we now have to buffer up to one
line of input in command_buf.  This buffering can happen if we look
at the current input command but don't recognize it at this point
in the code.  In such a case we need to "unget" the entire line,
but we cannot depend upon the stdio library to let us do ungetc()
for that many characters at once.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 03:38:35 -04:00
Shawn O. Pearce
2c570cde98 Make trailing LF following fast-import data commands optional
A few fast-import frontend developers have found it odd that we
require the LF following a `data` command, especially in the exact
byte count format.  Technically we don't need this LF to parse
the stream properly, but having it here does make the stream more
readable to humans.  We can easily make the LF optional by peeking
at the next byte available from the stream and pushing it back into
the buffer if its not LF.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 03:38:35 -04:00
Shawn O. Pearce
401d53fa35 Teach fast-import to ignore lines starting with '#'
Several frontend developers have asked that some form of stream
comments be permitted within a fast-import data stream.  This way
they can include information from their own frontend program about
where specific data was taken from in the source system, or about
a decision that their frontend may have made while creating the
fast-import data stream.

This change introduces comments in the Bourne-shell/Tcl/Perl style.
Lines starting with '#' are ignored, up to and including the LF.
Unlike the above mentioned three languages however we do not look for
and ignore leading whitespace.  This just simplifies the definition
of the comment format and the code that parses them.

To make comments work we had to stop using read_next_command() within
cmd_data() and directly invoke read_line() during the inline variant
of the function.  This is necessary to retain any lines of the
input data that might otherwise look like a comment to fast-import.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 03:38:35 -04:00
Shawn O. Pearce
3149007475 Use handy ALLOC_GROW macro in fast-import when possible
Instead of growing our buffer by hand during the inline variant of
cmd_data() we can save a few lines of code and just use the nifty
new ALLOC_GROW macro already available to us.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 03:38:34 -04:00
Shawn O. Pearce
ea08a6fd19 Actually allow TAG_FIXUP branches in fast-import
Michael Haggerty <mhagger@alum.mit.edu> noticed while debugging a
Git backend for cvs2svn that fast-import was barfing when he tried
to use "TAG_FIXUP" as a branch name for temporary work needed to
cleanup the tree prior to creating an annotated tag object.

The reason we were rejecting the branch name was check_ref_format()
returns -2 when there are less than 2 '/' characters in the input
name.  TAG_FIXUP has 0 '/' characters, but is technically just as
valid of a ref as HEAD and MERGE_HEAD, so we really should permit it
(and any other similar looking name) during import.

New test cases have been added to make sure we still detect very
wrong branch names (e.g. containing [ or starting with .) and yet
still permit reasonable names (e.g. TAG_FIXUP).

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 03:38:34 -04:00
Alex Riesen
c905e09006 Fix whitespace in "Format of STDIN stream" of fast-import
Something probably assumed that HT indentation is 4 characters.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-08-19 03:38:34 -04:00
Luiz Fernando N. Capitulino
7647b17f1d Use xmkstemp() instead of mkstemp()
xmkstemp() performs error checking and prints a standard error message when
an error occur.

Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-14 22:20:26 -07:00
Shawn O. Pearce
b6f3481bb4 Teach fast-import to recursively copy files/directories
Some source material (e.g. Subversion dump files) perform directory
renames by telling us the directory was copied, then deleted in the
same revision.  This makes it difficult for a frontend to convert
such data formats to a fast-import stream, as all the frontend has
on hand is "Copy a/ to b/; Delete a/" with no details about what
files are in a/, unless the frontend also kept track of all files.

The new 'C' subcommand within a commit allows the frontend to make a
recursive copy of one path to another path within the branch, without
needing to keep track of the individual file paths.  The metadata
copy is performed in memory efficiently, but is implemented as a
copy-immediately operation, rather than copy-on-write.

With this new 'C' subcommand frontends could obviously implement an
'R' (rename) on their own as a combination of 'C' and 'D' (delete),
but since we have already offered up 'R' in the past and it is a
trivial thing to keep implemented I'm not going to deprecate it.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-07-15 01:41:23 -04:00
Shawn O. Pearce
f39a946a1f Support wholesale directory renames in fast-import
Some source material (e.g. Subversion dump files) perform directory
renames without telling us exactly which files in that subdirectory
were moved.  This makes it hard for a frontend to convert such data
formats to a fast-import stream, as all the frontend has on hand
is "Rename a/ to b/" with no details about what files are in a/,
unless the frontend also kept track of all files.

The new 'R' subcommand within a commit allows the frontend to
rename either a file or an entire subdirectory, without needing to
know the object's SHA-1 or the specific files contained within it.
The rename is performed as efficiently as possible internally,
making it cheaper than a 'D'/'M' pair for a file rename.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-07-09 23:06:16 -04:00
Junio C Hamano
98ee8187e4 Merge branch 'maint'
* maint:
  Fix possible coredump with fast-import --import-marks
  Refactor fast-import branch creation from existing commit
  fast-import: Fix crash when referencing already existing objects
  fast-import: Fix uninitialized variable
  Documentation: fix git-config.xml generation
2007-05-23 22:37:23 -07:00
Shawn O. Pearce
aac65ed1bc Fix possible coredump with fast-import --import-marks
When e8438420bb allowed us to reload
the marks table on subsequent runs of fast-import we really broke
things, as we set pack_id to MAX_PACK_ID for any objects we imported
into the marks table.  Creating a branch from that mark should fail
as we attempt to read the object through a non-existant packed_git
pointer.  Instead we have to use the normal Git object system to
locate the older commit, as we ourselves do not have a reference
to the packed_git it resides in.

This bug only occurred because t9300 was not complete enough.
When we added the --import-marks feature we didn't actually test
its implementation enough to verify the function worked as intended.
I have corrected that, and included the changes as part of this fix.
Prior versions of fast-import fail the new test(s); this commit
allows them to pass.

Credit for this bug find goes to Simon Hausmann <simon@lst.de> as
he recently identified a similiar bug in the tree lazy-loading path.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-24 00:50:19 -04:00
Shawn O. Pearce
654aaa37ab Refactor fast-import branch creation from existing commit
To resolve a corner case uncovered by Simon Hausmann I need to
reuse the logic for the SHA-1 expression version of the 'from '
command within the mark version of the 'from ' command.  This change
doesn't alter any functionality, but is merely breaking the common
code out to a function that I can reuse.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-24 00:11:48 -04:00
Simon Hausmann
20f546a86c fast-import: Fix crash when referencing already existing objects
Commit a5c1780a03 sets the pack_id of existing
objects to MAX_PACK_ID. When the same object is referenced later again it is
found in the local object hash. With such a pack_id fast-import should not try
to locate that object in the newly created pack(s).

Signed-off-by: Simon Hausmann <simon@lst.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-23 23:36:47 -04:00
Simon Hausmann
b259157f3c fast-import: Fix uninitialized variable
Fix uninitialized last_object->no_free variable that is accessed in
store_object.

Signed-off-by: Simon Hausmann <simon@lst.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-23 23:36:47 -04:00
Sven Verdoolaege
68db31cc28 git-update-ref: add --no-deref option for overwriting/detaching ref
git-checkout is also adapted to make use of this new option
instead of the handcrafted command sequence.

Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-10 15:24:44 -07:00
Dana L. How
8b0eca7c7b Create pack-write.c for common pack writing code
Include a generalized fixup_pack_header_footer() in this new file.
Needed by git-repack --max-pack-size feature in a later patchset.

[sp: Moved close(pack_fd) to callers, to support index-pack, and
     changed name to better indicate it is for packfiles.]

Signed-off-by: Dana L. How <danahow@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-02 13:24:18 -04:00
Junio C Hamano
39231b1c32 Merge branch 'maint'
* maint:
  http.c: Fix problem with repeated calls of http_init
  Add missing reference to GIT_COMMITTER_DATE in git-commit-tree documentation
  Fix import-tars fix.
  Update .mailmap with "Michael"
  Do not barf on too long action description
  Catch empty pathnames in trees during fsck
  Don't allow empty pathnames in fast-import
  import-tars: be nice to wrong directory modes
  git-svn: Added 'find-rev' command
  git shortlog documentation: add long options and fix a typo
2007-04-29 01:52:43 -07:00
Shawn O. Pearce
475d1b333a Don't allow empty pathnames in fast-import
riddochc on #git noticed corruption caused by import-tars.  This
was fixed in the prior commit by Dscho, but fast-import was wrong
to have allowed a tree to be created with an empty string as the
filename.  No operating system allows this, and Git itself doesn't
accept this into the index.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-04-28 20:03:25 -04:00
Sami Farin
00be8dcc1a fast-import: size_t vs ssize_t
size_t is unsigned, so (n < 0) is never true.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-04-24 16:14:48 -04:00
Shawn O. Pearce
a5c1780a03 Don't repack existing objects in fast-import
Some users of fast-import have been trying to use it to rewrite
commits and trees, an activity where the all of the relevant blobs
are already available from the existing packfiles.  In such a case
we don't want to repack a blob, even if the frontend application
has supplied us the raw data rather than a mark or a SHA-1 name.

I'm intentionally only checking the packfiles that existed when
fast-import started and am always ignoring all loose object files.

We ignore loose objects because fast-import tends to operate on a
very large number of objects in a very short timespan, and it is
usually creating new objects, not reusing existing ones.  In such
a situtation the majority of the objects will not be found in the
existing packfiles, nor will they be loose object files.  If the
frontend application really wants us to look at loose object files,
then they can just repack the repository before running fast-import.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-04-20 11:23:45 -04:00
Theodore Ts'o
46efd2d93c Rename warn() to warning() to fix symbol conflicts on BSD and Mac OS
This fixes a problem reported by Randal Schwartz:

>I finally tracked down all the (albeit inconsequential) errors I was getting
>on both OpenBSD and OSX.  It's the warn() function in usage.c.  There's
>warn(3) in BSD-style distros.  It'd take a "great rename" to change it, but if
>someone with better C skills than I have could do that, my linker and I would
>appreciate it.

It was annoying to me, too, when I was doing some mergetool testing on
Mac OS X, so here's a fix.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: "Randal L. Schwartz" <merlyn@stonehenge.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-31 01:11:11 -07:00
Nicolas Pitre
0e55181f29 make it more obvious that temporary files are temporary files
When some operations are interrupted (or "die()'d" or crashed) then the
partial object/pack/index file may remain around.  Make it more obvious
in their name that those files are temporary stuff and can be cleaned up
if no operation is in progress.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-24 22:32:39 -07:00
Shawn O. Pearce
061e35c581 Remove unnecessary casts from fast-import
Jeff King pointed out that these casts are quite unnecessary, as
the compiler should be doing them anyway, and may cause problems
in the future if the size of the argument for to_atom were to ever
be increased.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-12 15:48:37 -04:00
Shawn O. Pearce
7f09ac4714 Merge branch 'maint'
* maint:
  fast-import: grow tree storage more aggressively
2007-03-12 15:04:46 -04:00
Jeff King
f022f85f6d fast-import: grow tree storage more aggressively
When building up a tree for a commit, fast-import
dynamically allocates memory for the tree entries. When more
space is needed, the allocated memory is increased by a
constant amount. For very large trees, this means
re-allocating and memcpy()ing the memory O(n) times.

To compound this problem, releasing the previous tree
resource does not free the memory; it is kept in a pool
for future trees. This means that each of the O(n)
allocations will consume increasing amounts of memory,
giving O(n^2) memory consumption.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-12 15:01:44 -04:00
Junio C Hamano
f45fa2a073 Merge branch 'master' of git://repo.or.cz/git/fastimport
* 'master' of git://repo.or.cz/git/fastimport:
  Allow fast-import frontends to reload the marks table
  Use atomic updates to the fast-import mark file
  Preallocate memory earlier in fast-import
2007-03-07 23:10:05 -08:00
Shawn O. Pearce
e8438420bb Allow fast-import frontends to reload the marks table
I'm giving fast-import a lesson on how to reload the marks table
using the same format it outputs with --export-marks.  This way
a frontend can reload the marks table from a prior import, making
incremental imports less painful.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-07 18:07:26 -05:00
Shawn O. Pearce
60b9004cdb Use atomic updates to the fast-import mark file
When we allow fast-import frontends to reload a mark file from a
prior session we want to let them use the same file as they exported
the marks to.  This makes it very simple for the frontend to save
state across incremental imports.

But we don't want to lose the old marks table if anything goes wrong
while writing our current marks table.  So instead of truncating and
overwriting the path specified to --export-marks we use the standard
lockfile code to write the current marks out to a temporary file,
then rename it over the old marks table.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-07 18:05:38 -05:00
Shawn O. Pearce
93e72d8d8f Preallocate memory earlier in fast-import
I'm about to teach fast-import how to reload the marks file created
by a prior session.  The general approach that I want to use is to
immediately parse the marks file when the specific argument is found
in argv, thereby allowing the caller to supply multiple marks files,
as the mark space can be sparsely populated.

To make that work out we need to allocate our object tables before
we parse the command line options.  Since none of these tables
depend on the command line options, we can easily relocate them.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-07 17:11:02 -05:00
Shawn O. Pearce
6777a59fcd Use off_t in pack-objects/fast-import when we mean an offset
Always use an off_t value in pack-objects anytime we are dealing
with an offset to some data within a packfile.

Also fixed a minor uintmax_t that was incorrectly defined before.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 11:06:33 -08:00
Shawn O. Pearce
c4001d92be Use off_t when we really mean a file offset.
Not all platforms have declared 'unsigned long' to be a 64 bit value,
but we want to support a 64 bit packfile (or close enough anyway)
in the near future as some projects are getting large enough that
their packed size exceeds 4 GiB.

By using off_t, the POSIX type that is declared to mean an offset
within a file, we support whatever maximum file size the underlying
operating system will handle.  For most modern systems this is up
around 2^60 or higher.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 11:06:25 -08:00
Shawn O. Pearce
3a55602eec General const correctness fixes
We shouldn't attempt to assign constant strings into char*, as the
string is not writable at runtime.  Likewise we should always be
treating unsigned values as unsigned values, not as signed values.

Most of these are very straightforward.  The only exception is the
(unnecessary) xstrdup/free in builtin-branch.c for the detached
head case.  Since this is a user-level interactive type program
and that particular code path is executed no more than once, I feel
that the extra xstrdup call is well worth the easy elimination of
this warning.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 10:47:10 -08:00
Shawn O. Pearce
6b4318e604 Merge branch 'maint'
* maint:
  fast-import: Fail if a non-existant commit is used for merge
  fast-import: Avoid infinite loop after reset

[sp: Minor evil merge to deal with type_names array moving
 to be private in 'master'.]
2007-03-05 12:50:29 -05:00
Shawn O. Pearce
2f6dc35d2a fast-import: Fail if a non-existant commit is used for merge
Johannes Sixt noticed during one of his own imports that fast-import
did not fail if a non-existant commit is referenced by SHA-1 value
as an argument to the 'merge' command.  This allowed the user to
unknowingly create commits that would fail in fsck, as the commit
contents would not be completely reachable.

A side effect of this bug was that a frontend process could mark
any SHA-1 object (blob, tree, tag) as a parent of a merge commit.
This should also fail in fsck, as the commit is not a valid commit.

We now use the same rule as the 'from' command.  If a commit is
referenced in the 'merge' command by hex formatted SHA-1 then the
SHA-1 must be a commit or a tag that can be peeled back to a commit,
the commit must already exist, and must be readable by the core Git
infrastructure code.  This requirement means that the commit must
have existed prior to fast-import starting, or the commit must have
been flushed out by a prior 'checkpoint' command.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-05 12:43:14 -05:00
Shawn O. Pearce
734c91f9e2 fast-import: Avoid infinite loop after reset
Johannes Sixt noticed that a 'reset' command applied to a branch that
is already active in the branch LRU cache can cause fast-import to
relink the same branch into the LRU cache twice.  This will cause
the LRU cache to contain a cycle, making unload_one_branch run in an
infinite loop as it tries to select the oldest branch for eviction.

I have trivially fixed the problem by adding an active bit to
each branch object; this bit indicates if the branch is already
in the LRU and allows us to avoid trying to add it a second time.
Converting the pack_id field into a bitfield makes this change take
up no additional memory.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-05 12:31:09 -05:00
Nicolas Pitre
21666f1aae convert object type handling from a string to a number
We currently have two parallel notation for dealing with object types
in the code: a string and a numerical value.  One of them is obviously
redundent, and the most used one requires more stack space and a bunch
of strcmp() all over the place.

This is an initial step for the removal of the version using a char array
found in object reading code paths.  The patch is unfortunately large but
there is no sane way to split it in smaller parts without breaking the
system.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-27 01:34:21 -08:00
Nicolas Pitre
df8436622f formalize typename(), and add its reverse type_from_string()
Sometime typename() is used, sometimes type_names[] is accessed directly.
Let's enforce typename() all the time which allows for validating the
type.

Also let's add a function to go from a name to a type and use it instead
of manual memcpy() when appropriate.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-27 01:34:21 -08:00
Junio C Hamano
599065a3bb prefixcmp(): fix-up mechanical conversion.
Previous step converted use of strncmp() with literal string
mechanically even when the result is only used as a boolean:

    if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo")))

This step manually cleans them up to read:

    if (!prefixcmp(arg, "foo"))

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-20 22:03:15 -08:00