Commit Graph

205 Commits

Author SHA1 Message Date
Thomas Rast
0721c314a5 Use die_errno() instead of die() when checking syscalls
Lots of die() calls did not actually report the kind of error, which
can leave the user confused as to the real problem.  Use die_errno()
where we check a system/library call that sets errno on failure, or
one of the following that wrap such calls:

  Function              Passes on error from
  --------              --------------------
  odb_pack_keep         open
  read_ancestry         fopen
  read_in_full          xread
  strbuf_read           xread
  strbuf_read_file      open or strbuf_read_file
  strbuf_readlink       readlink
  write_in_full         xwrite

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-27 11:14:53 -07:00
Thomas Rast
d824cbba02 Convert existing die(..., strerror(errno)) to die_errno()
Change calls to die(..., strerror(errno)) to use the new die_errno().

In the process, also make slight style adjustments: at least state
_something_ about the function that failed (instead of just printing
the pathname), and put paths in single quotes.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-27 11:14:53 -07:00
Junio C Hamano
36587681b4 Merge branch 'ar/unlink-err'
* ar/unlink-err:
  print unlink(2) errno in copy_or_link_directory
  replace direct calls to unlink(2) with unlink_or_warn
  Introduce an unlink(2) wrapper which gives warning if unlink failed
2009-05-18 09:01:06 -07:00
Felipe Contreras
4b25d091ba Fix a bunch of pointer declarations (codestyle)
Essentially; s/type* /type */ as per the coding guidelines.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-01 15:17:31 -07:00
Alex Riesen
691f1a28bf replace direct calls to unlink(2) with unlink_or_warn
This helps to notice when something's going wrong, especially on
systems which lock open files.

I used the following criteria when selecting the code for replacement:
- it was already printing a warning for the unlink failures
- it is in a function which already printing something or is
  called from such a function
- it is in a static function, returning void and the function is only
  called from a builtin main function (cmd_)
- it is in a function which handles emergency exit (signal handlers)
- it is in a function which is obvously cleaning up the lockfiles

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-29 18:37:41 -07:00
Michael J Gruber
b18cc5a3b2 Fix more typos/spelling in comments
A few more fixes on top of the automatic spell checker generated ones.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-22 19:03:39 -07:00
Mike Ralphson
3ea3c215c0 Fix typos / spelling in comments
Signed-off-by: Mike Ralphson <mike@abacus.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-22 19:02:12 -07:00
Junio C Hamano
03a39a9184 Merge branch 'jc/shared-literally'
* jc/shared-literally:
  t1301: loosen test for forced modes
  set_shared_perm(): sometimes we know what the final mode bits should look like
  move_temp_to_file(): do not forget to chmod() in "Coda hack" codepath
  Move chmod(foo, 0444) into move_temp_to_file()
  "core.sharedrepository = 0mode" should set, not loosen
2009-04-06 00:42:52 -07:00
Johan Herland
fb8b193670 Move chmod(foo, 0444) into move_temp_to_file()
When writing out a loose object or a pack (index), move_temp_to_file() is
called to finalize the resulting file. These files (loose files and packs)
should all have permission mode 0444 (modulo adjust_shared_perm()).
Therefore, instead of doing chmod(foo, 0444) explicitly from each callsite
(or even forgetting to chmod() at all), do the chmod() call from within
move_temp_to_file().

Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-27 22:10:58 -07:00
Elijah Newren
98e1a4186a Correct missing SP characters in grammar comment at top of fast-import.c
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-25 16:36:16 -07:00
Benjamin Kramer
eb3a9dd327 Remove unused function scope local variables
These variables were unused and can be removed safely:

  builtin-clone.c::cmd_clone(): use_local_hardlinks, use_separate_remote
  builtin-fetch-pack.c::find_common(): len
  builtin-remote.c::mv(): symref
  diff.c::show_stats():show_stats(): total
  diffcore-break.c::should_break(): base_size
  fast-import.c::validate_raw_date(): date, sign
  fsck.c::fsck_tree(): o_sha1, sha1
  xdiff-interface.c::parse_num(): read_some

Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07 20:52:17 -08:00
Junio C Hamano
bb0cebd7d0 Merge branch 'jc/maint-1.6.0-pack-directory'
* jc/maint-1.6.0-pack-directory:
  Make sure objects/pack exists before creating a new pack
2009-02-25 14:50:05 -08:00
Junio C Hamano
6e180cdcec Make sure objects/pack exists before creating a new pack
In a repository created with git older than f49fb35 (git-init-db: create
"pack" subdirectory under objects, 2005-06-27), objects/pack/ directory is
not created upon initialization.  It was Ok because subdirectories are
created as needed inside directories init-db creates, and back then,
packfiles were recent invention.

After the said commit, new codepaths started relying on the presense of
objects/pack/ directory in the repository.  This was exacerbated with
8b4eb6b (Do not perform cross-directory renames when creating packs,
2008-09-22) that moved the location temporary pack files are created from
objects/ directory to objects/pack/ directory, because moving temporary to
the final location was done carefully with lazy leading directory creation.

Many packfile related operations in such an old repository can fail
mysteriously because of this.

This commit introduces two helper functions to make things work better.

 - odb_mkstemp() is a specialized version of mkstemp() to refactor the
   code and teach it to create leading directories as needed;

 - odb_pack_keep() refactors the code to create a ".keep" file while
   create leading directories as needed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-25 14:39:42 -08:00
Junio C Hamano
ba19a808aa Drop double-semicolon in C
The worst offenders are "continue;;" and "break;;" in switch statements.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-10 22:26:37 -08:00
Junio C Hamano
fd8475d9fb Merge branch 'maint'
* maint:
  Clear the delta base cache during fast-import checkpoint
2009-02-10 21:30:45 -08:00
Junio C Hamano
9b27ea9518 Merge branch 'maint-1.6.0' into maint
* maint-1.6.0:
  Clear the delta base cache during fast-import checkpoint
2009-02-10 15:32:26 -08:00
Shawn O. Pearce
3d20c636af Clear the delta base cache during fast-import checkpoint
Otherwise we may reuse the same memory address for a totally
different "struct packed_git", and a previously cached object from
the prior occupant might be returned when trying to unpack an object
from the new pack.

Found-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-10 15:30:59 -08:00
Steffen Prohaska
2fb3f6db96 Add calls to git_extract_argv0_path() in programs that call git_config_*
Programs that use git_config need to find the global configuration.
When runtime prefix computation is enabled, this requires that
git_extract_argv0_path() is called early in the program's main().

This commit adds the necessary calls.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-26 00:26:05 -08:00
Junio C Hamano
4f8b8992ef Merge branch 'maint-1.6.0' into maint
* maint-1.6.0:
  fast-import: Cleanup mode setting.
  Git.pm: call Error::Simple() properly
2009-01-13 23:10:50 -08:00
Felipe Contreras
3d1d81eba2 fast-import: Cleanup mode setting.
"S_IFREG | mode" makes only sense for 0644 and 0755.

Even though doing (S_IFREG | mode) may not hurt when mode is any other
supported value, that is only true because S_IFREG mode bit happens to
be already on for S_IFLNK or S_IFGITLINK.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-13 22:57:12 -08:00
René Scharfe
c55fae43c4 fast-import.c: stricter strtoul check, silence compiler warning
Store the return value of strtoul() in order to avoid compiler
warnings on Ubuntu 8.10.

Also check errno after each call, which is the only way to notice
an overflow without making ULONG_MAX an illegal date.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-21 01:48:26 -08:00
Junio C Hamano
efe05b019c Merge branch 'maint' to sync with GIT 1.6.0.6
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-19 19:35:55 -08:00
Junio C Hamano
88fbf67b78 fast-import: make tagger information optional
Even though newer Porcelain tools always record the tagger information
when creating new tags, export/import pair should be able to faithfully
reproduce ancient tag objects that lack tagger information.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
2008-12-19 19:25:06 -08:00
Junio C Hamano
90c3302173 Merge branch 'maint'
* maint:
  fast-import: close pack before unlinking it
  pager: do not dup2 stderr if it is already redirected
  git-show: do not segfault when showing a bad tag
2008-12-15 23:06:13 -08:00
Johannes Schindelin
87c8a56e4f fast-import: close pack before unlinking it
This is sort of a companion patch to 4723ee9(Close files opened by
lock_file() before unlinking.): on Windows, you cannot delete what
is still open.

This makes test 9300-fast-import pass on Windows for me; quite a few
fast-imports leave temporary packs until the test "blank lines not
necessary after other commands" actually tests for the number of files
in .git/objects/pack/, which has a few temporary packs now.

I guess that 8b4eb6b(Do not perform cross-directory renames when
creating packs) was "responsible" for the breakage.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-15 23:04:48 -08:00
YONETANI Tomokazu
2fad5329f4 git-fast-import possible memory corruption problem
Internal "allocate in bulk, we will never free this memory anyway"
allocator used in fast-import had a logic to round up the size of the
requested memory block in a wrong place (it computed if the available
space is enough to fit the request first, and then carved a chunk of
memory by size rounded up to the alignment, which could go beyond the
actually available space).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-14 16:41:32 -08:00
Nicolas Pitre
9126f0091f fix openssl headers conflicting with custom SHA1 implementations
On ARM I have the following compilation errors:

    CC fast-import.o
In file included from cache.h:8,
                 from builtin.h:6,
                 from fast-import.c:142:
arm/sha1.h:14: error: conflicting types for 'SHA_CTX'
/usr/include/openssl/sha.h:105: error: previous declaration of 'SHA_CTX' was here
arm/sha1.h:16: error: conflicting types for 'SHA1_Init'
/usr/include/openssl/sha.h:115: error: previous declaration of 'SHA1_Init' was here
arm/sha1.h:17: error: conflicting types for 'SHA1_Update'
/usr/include/openssl/sha.h:116: error: previous declaration of 'SHA1_Update' was here
arm/sha1.h:18: error: conflicting types for 'SHA1_Final'
/usr/include/openssl/sha.h:117: error: previous declaration of 'SHA1_Final' was here
make: *** [fast-import.o] Error 1

This is because openssl header files are always included in
git-compat-util.h since commit 684ec6c63c whenever NO_OPENSSL is not
set, which somehow brings in <openssl/sha1.h> clashing with the custom
ARM version.  Compilation of git is probably broken on PPC too for the
same reason.

Turns out that the only file requiring openssl/ssl.h and openssl/err.h
is imap-send.c.  But only moving those problematic includes there
doesn't solve the issue as it also includes cache.h which brings in the
conflicting local SHA1 header file.

As suggested by Jeff King, the best solution is to rename our references
to SHA1 functions and structure to something git specific, and define those
according to the implementation used.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-02 18:06:56 -07:00
Junio C Hamano
c4275591fb Merge branch 'maint'
* maint:
  builtin-prune.c: prune temporary packs in <object_dir>/pack directory
  Do not perform cross-directory renames when creating packs
2008-09-23 02:05:35 -07:00
Petr Baudis
8b4eb6b6cd Do not perform cross-directory renames when creating packs
A comment on top of create_tmpfile() describes caveats ('can have
problems on various systems (FAT, NFS, Coda)') that should apply
in this situation as well.  This in the end did not end up solving
any of my personal problems, but it might be a useful cleanup patch
nevertheless.

Signed-off-by: Petr Baudis <pasky@suse.cz>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-22 12:19:14 -07:00
Junio C Hamano
53b543ab82 Merge branch 'np/maint-safer-pack'
* np/maint-safer-pack:
  fixup_pack_header_footer(): use nicely aligned buffer sizes
  index-pack: use fixup_pack_header_footer()'s validation mode
  pack-objects: use fixup_pack_header_footer()'s validation mode
  improve reliability of fixup_pack_header_footer()
  pack-objects: improve returned information from write_one()
2008-09-02 17:46:48 -07:00
David Soria Parra
85e7283069 cast pid_t's to uintmax_t to improve portability
Some systems (like e.g. OpenSolaris) define pid_t as long,
therefore all our sprintf that use %i/%d cause a compiler warning
beacuse of the implicit long->int cast. To make sure that
we fit the limits, we display pids as PRIuMAX and cast them explicitly
to uintmax_t.

Signed-off-by: David Soria Parra <dsp@php.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-31 16:56:22 -07:00
Nicolas Pitre
abeb40e5aa improve reliability of fixup_pack_header_footer()
Currently, this function has the potential to read corrupted pack data
from disk and give it a valid SHA1 checksum.  Let's add the ability to
validate SHA1 checksum of existing data along the way, including before
and after any arbitrary point in the pack.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-29 21:51:27 -07:00
Alexander Gavrilov
03db4525d3 Support gitlinks in fast-import.
Currently fast-import/export cannot be used for
repositories with submodules. This patch extends
the relevant programs to make them correctly
process gitlinks.

Links can be represented by two forms of the
Modify command:

M 160000 SHA1 some/path

which sets the link target explicitly, or

M 160000 :mark some/path

where the mark refers to a commit. The latter
form can be used by importing tools to build
all submodules simultaneously in one physical
repository, and then simply fetch them apart.

Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-19 11:25:51 -07:00
Stephan Beyer
1b1dd23f2d Make usage strings dash-less
When you misuse a git command, you are shown the usage string.
But this is currently shown in the dashed form.  So if you just
copy what you see, it will not work, when the dashed form
is no longer supported.

This patch makes git commands show the dash-less version.

For shell scripts that do not specify OPTIONS_SPEC, git-sh-setup.sh
generates a dash-less usage string now.

Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-13 14:12:48 -07:00
Linus Torvalds
4c81b03e30 Make pack creation always fsync() the result
This means that we can depend on packs always being stable on disk,
simplifying a lot of the object serialization worries.  And unlike loose
objects, serializing pack creation IO isn't going to be a performance
killer.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-31 14:46:57 -07:00
Junio C Hamano
9bd81e4249 Merge branch 'js/config-cb'
* js/config-cb:
  Provide git_config with a callback-data parameter

Conflicts:

	builtin-add.c
	builtin-cat-file.c
2008-05-25 14:25:02 -07:00
Miklos Vajna
b30317819d git-fast-import: rename cmd_*() functions to parse_*()
There is a cmd_merge() function in fast-import that will conflict with
builtin-merge's cmd_merge() function. To keep it consistent, rename all
cmd_*() function to parse_*()

Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-16 12:40:09 -07:00
Johannes Schindelin
ef90d6d420 Provide git_config with a callback-data parameter
git_config() only had a function parameter, but no callback data
parameter.  This assumes that all callback functions only modify
global variables.

With this patch, every callback gets a void * parameter, and it is hoped
that this will help the libification effort.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-14 12:34:44 -07:00
Eyvind Bernhardsen
198724ad4e fast-import: Allow "reset" to delete a new branch without error
Creating a branch in fast-import and then resetting it without making
any further commits to it currently causes an error message at the
end of the import.

This error is triggered by cvs2svn's git backend, which uses a
temporary fixup branch when it creates tags, because the fixup branch
is reset after each tag.

This patch prevents the error, allowing "reset" to be used to delete
temporary branches.

Signed-off-by: Eyvind Bernhardsen <eyvind-git@orakel.ntnu.no>
Acked-by: Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-16 14:24:32 -07:00
Junio C Hamano
ad416ed433 Merge branch 'maint' to sync with 1.5.4.4
* maint:
  GIT 1.5.4.4
  ident.c: reword error message when the user name cannot be determined
  Fix dcommit, rebase when rewriteRoot is in use
  Really make the LF after reset in fast-import optional
2008-03-08 20:07:57 -08:00
Adeodato Simó
655e8515f2 Really make the LF after reset in fast-import optional
cmd_from() ends with a call to read_next_command(), which is needed
when using cmd_from() from commands where from is not the last element.

With reset, however, "from" is the last command, after which the flow
returns to the main loop, which calls read_next_command() again.

Because of this, always set unread_command_buf in cmd_reset_branch(),
even if cmd_from() was successful.

Add a test case for this in t9300-fast-import.sh.

Signed-off-by: Adeodato Simó <dato@net.com.org.es>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-08 10:46:10 -08:00
Jean-Luc Herren
733ee2b7a9 fast-import: exit with proper message if not a git dir
git fast-import expects to be run from an existing (possibly
empty) repository.  It was dying with a suboptimal message if that
wasn't the case.

Signed-off-by: Jean-Luc Herren <jlh@gmx.ch>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
2008-03-02 16:07:41 -08:00
Shawn O. Pearce
118805b920 Finish current packfile during fast-import crash handler
If fast-import is in the middle of crashing due to a protocol error
or something like that then it can be very useful to have the mark
table and all objects up until that point be available for a new
import to resume from.

Currently we just close the active packfile, unkeep all of our
newly created packfiles (so they can be deleted), and dump the
marks table to a temporary file.

We don't attempt to update the refs/tags that the process has in
memory as much of that data can be found in the crash report and I'm
not sure it would be the right thing to do under every type of crash.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-16 00:47:07 -08:00
Shawn O. Pearce
3b08e5b8c9 Include the fast-import marks table in crash reports
If fast-import was not run with --export-marks but we are crashing
the frontend application developer may still benefit from having
that information available to them.  We now include the marks table
as part of the crash report if --export-marks was not supplied on
the command line.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-16 00:47:07 -08:00
Shawn O. Pearce
fbc63ea694 Include annotated tags in fast-import crash reports
If annotated tags were created they exist in a different namespace
within the fast-import process' internal memory tables so we did
not export them in the inactive branch table.  Now they are written
out after the branches, in the order that they were defined by the
frontend process.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-16 00:47:07 -08:00
Shawn O. Pearce
e8b32e0610 fast-import: check return value from unpack_entry()
If the tree object we have asked for is deltafied in the packfile and
the delta did not apply correctly or was not able to be decompressed
from the packfile then we can get back NULL instead of the tree data.
This is (part of) the reason why read_sha1_file() can return NULL, so
we need to also handle it the same way.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-15 20:11:51 -08:00
Shawn O. Pearce
7422bac441 Document the hairy gfi_unpack_entry part of fast-import
Junio pointed out this part of fast-import wasn't very clear on
initial read, and it took some time for someone who was new to
fast-import's "dirty little tricks" to understand how this was
even working.  So a little bit of commentary in the proper place
may help future readers.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-21 01:04:12 -08:00
Shawn O. Pearce
bb23fdfa6c Teach fast-import to honor pack.compression and pack.depth
We now use the configured pack.compression and pack.depth values
within fast-import, as like builtin-pack-objects fast-import is
generating a packfile for consumption by the Git tools.

We use the same behavior as builtin-pack-objects does for these
options, allowing core.compression to supply the default value
for pack.compression.

The default setting for pack.depth within fast-import is still 10
as users will generally repack fast-import generated packfiles by
`repack -f`.  A large delta depth within the fast-import packfile
can significantly slow down such a later repack.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-21 01:04:10 -08:00
Jim Meyering
5a7b1b571e fast-import: Don't use a maybe-clobbered errno value
Without this change, each diagnostic could use an errno value
clobbered by the close or unlink in rollback_lock_file.

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-18 13:19:37 -08:00
Shawn O. Pearce
c9ced051c3 Fix random fast-import errors when compiled with NO_MMAP
fast-import was relying on the fact that on most systems mmap() and
write() are synchronized by the filesystem's buffer cache.  We were
relying on the ability to mmap() 20 bytes beyond the current end
of the file, then later fill in those bytes with a future write()
call, then read them through the previously obtained mmap() address.

This isn't always true with some implementations of NFS, but it is
especially not true with our NO_MMAP=YesPlease build time option used
on some platforms.  If fast-import was built with NO_MMAP=YesPlease
we used the malloc()+pread() emulation and the subsequent write()
call does not update the trailing 20 bytes of a previously obtained
"mmap()" (aka malloc'd) address.

Under NO_MMAP that behavior causes unpack_entry() in sha1_file.c to
be unable to read an object header (or data) that has been unlucky
enough to be written to the packfile at a location such that it
is in the trailing 20 bytes of a window previously opened on that
same packfile.

This bug has gone unnoticed for a very long time as it is highly data
dependent.  Not only does the object have to be placed at the right
position, but it also needs to be positioned behind some other object
that has been accessed due to a branch cache invalidation.  In other
words the stars had to align just right, and if you did run into
this bug you probably should also have purchased a lottery ticket.

Fortunately the workaround is a lot easier than the bug explanation.

Before we allow unpack_entry() to read data from a pack window
that has also (possibly) been modified through write() we force
all existing windows on that packfile to be closed.  By closing
the windows we ensure that any new access via the emulated mmap()
will reread the packfile, updating to the current file content.

This comes at a slight performance degredation as we cannot reuse
previously cached windows when we update the packfile.  But it
is a fairly minor difference as the window closes happen at only
two points:

 - When the packfile is finalized and its .idx is generated:

   At this stage we are getting ready to update the refs and any
   data access into the packfile is going to be random, and is
   going after only the branch tips (to ensure they are valid).
   Our existing windows (if any) are not likely to be positioned
   at useful locations to access those final tip commits so we
   probably were closing them before anyway.

 - When the branch cache missed and we need to reload:

   At this point fast-import is getting change commands for the next
   commit and it needs to go re-read a tree object it previously
   had written out to the packfile.  What windows we had (if any)
   are not likely to cover the tree in question so we probably were
   closing them before anyway.

We do try to avoid unnecessarily closing windows in the second case
by checking to see if the packfile size has increased since the
last time we called unpack_entry() on that packfile.  If the size
has not changed then we have not written additional data, and any
existing window is still vaild.  This nicely handles the cases where
fast-import is going through a branch cache reload and needs to read
many trees at once.  During such an event we are not likely to be
updating the packfile so we do not cycle the windows between reads.

With this change in place t9301-fast-export.sh (which was broken
by c3b0dec509) finally works again.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-17 22:39:20 -08:00