644 Commits

Author SHA1 Message Date
Junio C Hamano
99ebd06c18 Merge branch 'np/pack'
* np/pack: (27 commits)
  document --index-version for index-pack and pack-objects
  pack-objects: remove obsolete comments
  pack-objects: better check_object() performances
  add get_size_from_delta()
  pack-objects: make in_pack_header_size a variable of its own
  pack-objects: get rid of create_final_object_list()
  pack-objects: get rid of reuse_cached_pack
  pack-objects: clean up list sorting
  pack-objects: rework check_delta_limit usage
  pack-objects: equal objects in size should delta against newer objects
  pack-objects: optimize preferred base handling a bit
  clean up add_object_entry()
  tests for various pack index features
  use test-genrandom in tests instead of /dev/urandom
  simple random data generator for tests
  validate reused pack data with CRC when possible
  allow forcing index v2 and 64-bit offset threshold
  pack-redundant.c: learn about index v2
  show-index.c: learn about index v2
  sha1_file.c: learn about index version 2
  ...
2007-04-21 17:20:50 -07:00
Jim Meyering
61d6ed139f sscanf/strtoul: parse integers robustly
* builtin-grep.c (strtoul_ui): Move function definition from here, to...
* git-compat-util.h (strtoul_ui): ...here, with an added "base" parameter.
* builtin-grep.c (cmd_grep): Update use of strtoul_ui to include base, "10".
* builtin-update-index.c (read_index_info): Diagnose an invalid mode integer
that is out of range or merely larger than INT_MAX.
(cmd_update_index): Use strtoul_ui, not sscanf.
* convert-objects.c (write_subdirectory): Likewise.

Signed-off-by: Jim Meyering <jim@meyering.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-14 19:47:20 -07:00
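The strtoul_ui() helper named in this entry can be sketched roughly as follows. The entry only gives the function name and the added "base" parameter; the body and the exact set of rejected inputs below are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Sketch of a strict unsigned-int parser in the spirit of strtoul_ui().
 * Returns 0 and stores the value on success, -1 on any parse problem.
 */
static int strtoul_ui(const char *s, int base, unsigned int *result)
{
	unsigned long ul;
	char *end;

	assert(s != NULL && result != NULL);
	errno = 0;
	ul = strtoul(s, &end, base);
	/*
	 * Reject range errors, trailing garbage, input with no digits,
	 * and values that do not fit in an unsigned int.
	 */
	if (errno || *end || end == s || (unsigned int)ul != ul)
		return -1;
	*result = (unsigned int)ul;
	return 0;
}
```

Unlike sscanf("%u"), this form reports "12abc" and an empty string as errors instead of silently accepting a partial parse.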
Jim Meyering
6aead43db3 sscanf/strtoul: parse integers robustly
* builtin-grep.c (strtoul_ui): Move function definition from here, to...
* git-compat-util.h (strtoul_ui): ...here, with an added "base" parameter.
* builtin-grep.c (cmd_grep): Update use of strtoul_ui to include base, "10".
* builtin-update-index.c (read_index_info): Diagnose an invalid mode integer
that is out of range or merely larger than INT_MAX.
(cmd_update_index): Use strtoul_ui, not sscanf.
* convert-objects.c (write_subdirectory): Likewise.

Signed-off-by: Jim Meyering <jim@meyering.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-11 19:13:55 -07:00
Nicolas Pitre
8723f21626 make overflow test on delta base offset work regardless of variable size
This patch introduces the MSB() macro to obtain the desired number of
most significant bits from a given variable independently of the variable
type.

It is then used to better implement the overflow test on the OBJ_OFS_DELTA
base offset variable with the property of always working correctly
regardless of the type/size of that variable.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-10 12:48:14 -07:00
Theodore Ts'o
46efd2d93c Rename warn() to warning() to fix symbol conflicts on BSD and Mac OS
This fixes a problem reported by Randal Schwartz:

>I finally tracked down all the (albeit inconsequential) errors I was getting
>on both OpenBSD and OSX.  It's the warn() function in usage.c.  There's
>warn(3) in BSD-style distros.  It'd take a "great rename" to change it, but if
>someone with better C skills than I have could do that, my linker and I would
>appreciate it.

It was annoying to me, too, when I was doing some mergetool testing on
Mac OS X, so here's a fix.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: "Randal L. Schwartz" <merlyn@stonehenge.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-31 01:11:11 -07:00
Shawn O. Pearce
dc49cd769b Cast 64 bit off_t to 32 bit size_t
Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4.
This implies that we are able to access and work on files whose
maximum length is around 2^63-1 bytes, but we can only malloc or
mmap somewhat less than 2^32-1 bytes of memory.

On such a system an implicit conversion of off_t to size_t can cause
the size_t to wrap, resulting in unexpected and exciting behavior.
Right now we are working around all gcc warnings generated by the
-Wshorten-64-to-32 option by passing the off_t through xsize_t().

In the future we should make xsize_t on such problematic platforms
detect the wrapping and die if such a file is accessed.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 11:15:26 -08:00
Shawn O. Pearce
c4001d92be Use off_t when we really mean a file offset.
Not all platforms have declared 'unsigned long' to be a 64 bit value,
but we want to support a 64 bit packfile (or close enough anyway)
in the near future as some projects are getting large enough that
their packed size exceeds 4 GiB.

By using off_t, the POSIX type that is declared to mean an offset
within a file, we support whatever maximum file size the underlying
operating system will handle.  For most modern systems this is up
around 2^60 or higher.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 11:06:25 -08:00
Junio C Hamano
253e772ede Merge branch 'maint'
* maint:
  Unset NO_C99_FORMAT on Cygwin.
  Fix a "pointer type mismatch" warning.
  Fix some "comparison is always true/false" warnings.
  Fix an "implicit function definition" warning.
  Fix a "label defined but unreferenced" warning.
  Document the config variable format.suffix
  git-merge: fail correctly when we cannot fast forward.
  builtin-archive: use RUN_SETUP
  Fix git-gc usage note
2007-03-03 19:47:46 -08:00
Ramsay Jones
41b200179d Fix an "implicit function definition" warning.
The function at issue being initgroups() from the <grp.h> header
file. On Cygwin, setting _XOPEN_SOURCE suppresses the definition
of initgroups(), which causes the warning while compiling daemon.c.

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-03 18:55:04 -08:00
Martin Waitz
b97e911643 Support for large files on 32bit systems.
Glibc uses the same size for int and off_t by default.
In order to support large pack sizes (>2GB) we force Glibc to a 64bit off_t.

Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-20 22:45:09 -08:00
Junio C Hamano
cff0302c14 Add prefixcmp()
We have too many strncmp(a, b, strlen(b)).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-20 22:03:14 -08:00
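The pattern being wrapped is small enough to show in full. This is a sketch based only on the strncmp(a, b, strlen(b)) idiom quoted in the entry; the parameter names are illustrative.

```c
#include <assert.h>
#include <string.h>

/* Compare the start of str against prefix; 0 means str begins with prefix. */
static int prefixcmp(const char *str, const char *prefix)
{
	assert(str != NULL && prefix != NULL);
	return strncmp(str, prefix, strlen(prefix));
}
```

A call such as prefixcmp(refname, "refs/") then reads as a prefix test, and the prefix literal no longer has to be repeated inside strlen().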
Jason Riedy
bc6b4f52fc Add a compat/strtoumax.c for Solaris 8.
Solaris 8 was pre-c99, and they weren't willing to commit to
the strtoumax definition according to /usr/include/inttypes.h.

This adds NO_STRTOUMAX and NO_STRTOULL for ancient systems.
If NO_STRTOUMAX is defined, the routine in compat/strtoumax.c
will be used instead.  That routine passes its arguments to
strtoull unless NO_STRTOULL is defined.  If NO_STRTOULL, then
the routine uses strtoul (unsigned long).

Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu>
Acked-by: Shawn O Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-19 18:20:30 -08:00
Junio C Hamano
5faaf24634 Make sure packedgitwindowsize is multiple of (pagesize * 2)
The next patch depends on this.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-14 13:20:41 -08:00
Jason Riedy
007e2ba659 Use inttypes.h rather than stdint.h.
Older Solaris machines lack stdint.h but have inttypes.h.
The standard has inttypes.h including stdint.h, so at worst
this pollutes the namespace a bit.

Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-26 00:03:23 -08:00
Simon 'corecode' Schubert
bb79103194 Use fixed-size integers for the on-disk pack structure.
Plain integer types without a fixed size can vary between platforms.  Even
though all common platforms use 32-bit ints, there is no guarantee that
this won't change at some point.  Furthermore, specifying an integer type
with explicit size makes the definition of structures more obvious.

Signed-off-by: Simon 'corecode' Schubert <corecode@fs.ei.tum.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-18 14:11:50 -08:00
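As an illustration of the point about explicit sizes, an on-disk header declared with fixed-width types might look like this. The struct and field names are assumptions chosen for the example, not taken from the entry.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative on-disk header: every field has an explicit width. */
struct pack_header_sketch {
	uint32_t hdr_signature;
	uint32_t hdr_version;
	uint32_t hdr_entries;
};

/* The layout is 12 bytes on any platform that provides uint32_t. */
static_assert(sizeof(struct pack_header_sketch) == 12,
	      "header must be exactly three 32-bit words");
```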
Jason Riedy
fb9522062c Set _ALL_SOURCE for AIX, but avoid its struct list.
AIX 5.3 seems to need _ALL_SOURCE for struct addrinfo, but that
introduces a struct list in grp.h.

Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-15 22:22:24 -08:00
Linus Torvalds
9130ac1e19 Better error messages for corrupt databases
This fixes another problem that Andy's case showed: git-fsck-objects
reports nonsensical results for corrupt objects.

There were actually two independent and confusing problems:

 - when we had a zero-sized file and used map_sha1_file, mmap() would
   return EINVAL, and git-fsck-objects would report that as an insane and
   confusing error. I don't know when this was introduced, it might have
   been there forever.

 - when "parse_object()" returned NULL, fsck would say "object not found",
   which can be very confusing, since obviously the object might "exist",
   it's just unparseable because it's totally corrupt.

So this just makes "xmmap()" return NULL for a zero-sized object (which is
a valid pointer value, exactly the same way "malloc()" can return NULL for
a zero-sized allocation). That fixes the first problem (but we could have
fixed it in the caller too - I don't personally much care whichever way it
goes, but maybe somebody should check that the NO_MMAP case does
something sane in this case too?).

And the second problem is solved by just making the error message slightly
clearer - the failure to parse an object may be because it's missing or
corrupt, not necessarily because it's not "found".

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-11 14:44:17 -08:00
Stefan-W. Hahn
6900679c2f Replacing the system call pread() with lseek()/xread()/lseek() sequence.
On Cygwin, with cygwin.dll before 1.5.22, the system call pread() is buggy.
This patch introduces NO_PREAD. If NO_PREAD is set, git uses a sequence of
lseek()/xread()/lseek() to emulate pread().

Signed-off-by: Stefan-W. Hahn <stefan.hahn@s-hahn.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-09 16:40:40 -08:00
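The lseek()/read/lseek() sequence described in this entry can be sketched as below. The names pread_emulated() and pread_demo() are hypothetical, and the sketch uses plain read() where the entry mentions xread(), so it does not retry on EINTR.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Emulate pread(): read at `offset`, then put the file position back
 * where the caller left it.
 */
static ssize_t pread_emulated(int fd, void *buf, size_t count, off_t offset)
{
	off_t saved;
	ssize_t nread;

	assert(buf != NULL);
	saved = lseek(fd, 0, SEEK_CUR);
	if (saved == (off_t)-1)
		return -1;
	if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
		return -1;
	nread = read(fd, buf, count);
	if (lseek(fd, saved, SEEK_SET) == (off_t)-1)
		return -1;
	return nread;
}

/* Usage sketch: returns 0 if a positioned read leaves the offset alone. */
static int pread_demo(void)
{
	char buf[6] = {0};
	FILE *fp = tmpfile();
	int fd, ok;

	if (!fp)
		return -1;
	fd = fileno(fp);
	ok = write(fd, "hello world", 11) == 11 &&
	     lseek(fd, 2, SEEK_SET) == 2 &&
	     pread_emulated(fd, buf, 5, 6) == 5 &&
	     memcmp(buf, "world", 5) == 0 &&
	     lseek(fd, 0, SEEK_CUR) == 2;
	fclose(fp);
	return ok ? 0 : -1;
}
```

Unlike a real pread(), this emulation is not atomic: another thread sharing the descriptor could observe the temporary seek.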
Junio C Hamano
ecaebf4af1 Spell default packedgitlimit slightly differently
This is shorter and easier to read, and also makes sure the
constant expression does not overflow integer range.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-07 00:11:11 -08:00
Shawn O. Pearce
22bac0ea52 Increase packedGit{Limit,WindowSize} on 64 bit systems.
If we have a 64 bit address space we can easily afford to commit
a larger amount of virtual address space to pack file access.
So on these platforms we should increase the default settings of
core.packedGit{Limit,WindowSize} to something that will better
handle very large projects.

Thanks to Andy Whitcroft for pointing out that we can safely
increase these defaults on such systems.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-06 10:34:56 -08:00
Shawn O. Pearce
c4712e4553 Replace mmap with xmmap, better handling MAP_FAILED.
In some cases we did not even bother to check the return value of
mmap() and just assume it worked.  This is bad, because if we are
out of virtual address space the kernel returned MAP_FAILED and we
would attempt to dereference that address, segfaulting without any
real error output to the user.

We are replacing all calls to mmap() with xmmap() and moving all
MAP_FAILED checking into that single location.  If a mmap call
fails we try to release enough least-recently-used pack windows
to possibly succeed, then retry the mmap() attempt.  If we cannot
mmap even after releasing pack memory then we die() as none of our
callers have any reasonable recovery strategy for a failed mmap.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:45 -08:00
Shawn O. Pearce
97bfeb34df Release pack windows before reporting out of memory.
If we are about to fail because this process has run out of memory we
should first try to automatically control our appetite for address
space by releasing enough least-recently-used pack windows to gain
back enough memory such that we might actually be able to meet the
current allocation request.

This should help users who have fairly large repositories but are
working on systems with relatively small virtual address space.
Many times we see reports on the mailing list of these users running
out of memory during various Git operations.  Dynamically decreasing
the amount of pack memory used when the demand for heap memory is
increasing is an intelligent solution to this problem.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:45 -08:00
Shawn O. Pearce
8c82534d89 Default core.packedGitWindowSize to 1 MiB if NO_MMAP.
If the compiler has asked us to disable use of mmap() on their
platform then we are forced to use git_mmap and its emulation via
pread.  In this case large (e.g. 32 MiB) windows for pack access
are simply too big as a command will wind up reading a lot more
data than it will ever need, significantly increasing response time.

To prevent a high latency when NO_MMAP has been selected we now
use a default of 1 MiB for core.packedGitWindowSize.  Credit goes
to Linus and Junio for recommending this more reasonable setting.

[jc: upcased the name of the symbolic constant, and made another
 hardcoded constant into a symbolic constant while at it. ]

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:45 -08:00
Shawn O. Pearce
d6779124b9 Rename gitfakemmap to git_mmap.
This minor cleanup was suggested by Johannes Schindelin.

The mmap is still fake in the sense that we don't support PROT_WRITE
or MAP_SHARED with external modification at all, but that hasn't
stopped us from using mmap() throughout the Git code.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-24 00:16:56 -08:00
Junio C Hamano
95ca1c6cb7 Really fix headers for __FreeBSD__
The symbol to detect FreeBSD is __FreeBSD__, not __FreeBSD.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-22 22:46:11 -08:00
Shawn O. Pearce
fa39b6b5b1 Introduce a global level warn() function.
Like the existing error() function the new warn() function can be
used to describe a situation that probably should not be occurring,
but which the user (and Git) can continue to work around without
running into too many problems.

An example situation is a bad commit SHA1 found in a reflog.
Attempting to read this record out of the reflog isn't really an
error as we have skipped over it in the past.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-21 22:59:34 -08:00
Junio C Hamano
a01c9c28a5 _XOPEN_SOURCE problem also exists on FreeBSD
Suggested by Rocco Rutte, Marco Roeland and others.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-21 22:56:26 -08:00
Terje Sten Bjerkseth
c902c9a608 Fix system header problems on Mac OS X
For Mac OS X 10.4, _XOPEN_SOURCE defines _POSIX_C_SOURCE which
hides many symbols from the program.

Breakage noticed and initial analysis provided by Randal
L. Schwartz.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-20 17:59:22 -08:00
Junio C Hamano
85023577a8 simplify inclusion of system header files.
This is a mechanical clean-up of the way *.c files include
system header files.

 (1) sources under compat/, platform sha-1 implementations, and
     xdelta code are exempt from the following rules;

 (2) the first #include must be "git-compat-util.h" or one of
     our own header files that includes it first (e.g. config.h,
     builtin.h, pkt-line.h);

 (3) system headers that are included in "git-compat-util.h"
     need not be included in individual C source files.

 (4) "git-compat-util.h" does not have to include subsystem
     specific header files (e.g. expat.h).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-20 09:51:35 -08:00
Junio C Hamano
d0c2449f78 Define fallback PATH_MAX on systems that do not define one in <limits.h>
Notably on GNU/Hurd, as reported by Gerrit Pape.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-15 22:47:21 -07:00
Shawn Pearce
9befac470b Replace uses of strdup with xstrdup.
Like xmalloc and xrealloc, xstrdup dies with a useful message if
the native strdup() implementation returns NULL rather than a
valid pointer.

I just tried to use xstrdup in new code and found it to be missing.
However I expected it to be present as xmalloc and xrealloc are
already commonly used throughout the code.

[jc: removed the part that deals with last_XXX, which I am
 finding more and more dubious these days.]

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-02 03:24:37 -07:00
Jonas Fonseca
095c424d08 Use PATH_MAX instead of MAXPATHLEN
According to sys/param.h it's a "BSD name" for values defined in
<limits.h>. Besides, PATH_MAX seems to be more commonly used.

Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-26 17:52:58 -07:00
Rene Scharfe
5bb1cda5f7 drop length argument of has_extension
As Fredrik points out the current interface of has_extension() is
potentially confusing.  Its parameters include both a nul-terminated
string and a length-limited string.

This patch drops the length argument, requiring two nul-terminated
strings; all callsites are updated.  I checked that all of them indeed
provide nul-terminated strings.  Filenames need to be nul-terminated
anyway if they are to be passed to open() etc.  The performance penalty
of the additional strlen() is negligible compared to the system calls
which inevitably surround has_extension() calls.

Additionally, change has_extension() to use size_t inside instead of
int, as that is the exact type strlen() returns and memcmp() expects.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-11 16:06:34 -07:00
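With the length argument dropped, has_extension() can be sketched as below. This follows the description in the two has_extension entries (two nul-terminated strings, size_t lengths, an underrun check); the exact body is an assumption.

```c
#include <assert.h>
#include <string.h>

/* Nonzero if filename is strictly longer than ext and ends with it. */
static int has_extension(const char *filename, const char *ext)
{
	size_t len, extlen;

	assert(filename != NULL && ext != NULL);
	len = strlen(filename);
	extlen = strlen(ext);
	/* len > extlen is the underrun check: never index before filename. */
	return len > extlen && !memcmp(filename + len - extlen, ext, extlen);
}
```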
Rene Scharfe
83a2b841d6 Add has_extension()
The little helper has_extension() documents through its name what we are
trying to do and makes sure we don't forget the underrun check.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-10 14:13:53 -07:00
Junio C Hamano
aa5481c1af debugging: XMALLOC_POISON
Compile with -DXMALLOC_POISON=1 to catch errors from using uninitialized
memory returned by xmalloc.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-08 12:24:13 -07:00
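One way the poisoning could work is sketched below. The 0xA5 fill byte, the error handling, and the function name are assumptions. XMALLOC_POISON is defined inside the sketch only so that the poisoned path is visible; normally it would come from -DXMALLOC_POISON=1.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#ifndef XMALLOC_POISON
#define XMALLOC_POISON 1	/* normally supplied via -DXMALLOC_POISON=1 */
#endif

static void *xmalloc_sketch(size_t size)
{
	void *ret = malloc(size);

	/* Some platforms return NULL for a zero-byte request; ask for 1. */
	if (!ret && !size)
		ret = malloc(1);
	if (!ret) {
		fprintf(stderr, "fatal: Out of memory, malloc failed\n");
		exit(128);
	}
#if XMALLOC_POISON
	/* Fill with a recognizable pattern so uninitialized reads stand out. */
	memset(ret, 0xA5, size);
#endif
	return ret;
}
```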
Peter Eriksen
817151e61a Rename safe_strncpy() to strlcpy().
This cleans up the use of safe_strncpy() even more.  Since it has the
same semantics as strlcpy() use this name instead.  Also move the
definition from inside path.c to its own file compat/strlcpy.c, and use
it conditionally at compile time, since some platforms already have
strlcpy().  It's included in the same way as compat/setenv.c.

Signed-off-by: Peter Eriksen <s022018@student.dtu.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-24 23:16:25 -07:00
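For readers unfamiliar with strlcpy() semantics, a fallback along these lines would fit the description. The name strlcpy_sketch() is used so it cannot clash with a platform strlcpy(); the body is an illustration, not the contents of compat/strlcpy.c.

```c
#include <assert.h>
#include <string.h>

/*
 * Copy src into dest, writing at most size bytes including the
 * terminating nul.  Returns strlen(src), so a result >= size tells
 * the caller that truncation happened.
 */
static size_t strlcpy_sketch(char *dest, const char *src, size_t size)
{
	size_t ret;

	assert(src != NULL);
	ret = strlen(src);
	if (size) {
		size_t len = (ret >= size) ? size - 1 : ret;
		memcpy(dest, src, len);
		dest[len] = '\0';
	}
	return ret;
}
```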
Petr Baudis
39a3f5ea7c Customizable error handlers
This patch makes the usage(), die() and error() handlers customizable.
Nothing in the git code itself uses that but many other libgit users
(like Git.pm) will.

This is implemented using the mutator functions primarily because you
cannot directly modify global variables of libgit from a program that
dlopen()ed it, apparently. But having functions for that is a better API
anyway.

Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-24 00:12:52 -07:00
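The mutator approach described in this entry might be sketched like this for error(). All names ending in _sketch, as well as capture_error() and last_error, are hypothetical; only the idea of replacing a handler through a setter function comes from the entry.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

typedef void (*error_fn)(const char *fmt, va_list params);

/* Default handler: print the message to stderr. */
static void error_builtin(const char *fmt, va_list params)
{
	fputs("error: ", stderr);
	vfprintf(stderr, fmt, params);
	fputc('\n', stderr);
}

static error_fn error_routine = error_builtin;

/* Mutator: callers swap the handler here, not by touching the global. */
static void set_error_routine_sketch(error_fn routine)
{
	assert(routine != NULL);
	error_routine = routine;
}

static int error_sketch(const char *fmt, ...)
{
	va_list params;

	va_start(params, fmt);
	error_routine(fmt, params);
	va_end(params);
	return -1;
}

/* Example replacement handler: keep the last message in a buffer. */
static char last_error[256];

static void capture_error(const char *fmt, va_list params)
{
	vsnprintf(last_error, sizeof(last_error), fmt, params);
}
```

A library user such as a language binding would call set_error_routine_sketch(capture_error) once and then turn captured messages into its own exceptions.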
Junio C Hamano
b4f2a6ac92 Use #define ARRAY_SIZE(x) (sizeof(x)/sizeof(x[0]))
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-03-09 11:58:05 -08:00
Jason Riedy
731043fd4d Add compat/unsetenv.c .
Implement a (slow) unsetenv() for older systems.

Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-01-25 15:10:39 -08:00
Junio C Hamano
8f1d2e6f49 [PATCH] Compilation: zero-length array declaration.
ISO C99 (and GCC 3.x or later) lets you write a flexible array
at the end of a structure, like this:

	struct frotz {
		int xyzzy;
		char nitfol[]; /* more */
	};

GCC 2.95 and 2.96 let you do this with "char nitfol[0]";
unfortunately this is not allowed by ISO C90.

This patch declares such a construct like this:

	struct frotz {
		int xyzzy;
		char nitfol[FLEX_ARRAY]; /* more */
	};

and git-compat-util.h defines FLEX_ARRAY to 0 for gcc 2.95 and
empty for others.

If you are using a C90 C compiler, you should be able
to override this with CFLAGS=-DFLEX_ARRAY=1 from the
command line of "make".

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-01-07 10:51:06 -08:00
Junio C Hamano
4e7a2eccc2 ?alloc: do not return NULL when asked for zero bytes
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-29 01:33:40 -08:00
Junio C Hamano
1c15afb934 xread/xwrite: do not worry about EINTR at calling sites.
We had errno==EINTR check after read(2)/write(2) sprinkled all
over the places, always doing continue.  Consolidate them into
xread()/xwrite() wrapper routines.

Credit for the suggestion goes to HPA -- bugs are mine.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-19 18:28:16 -08:00
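The consolidation described in this entry amounts to a loop like the one below. This sketch retries only on EINTR, as the entry states; the name xread_sketch() is used to mark it as an illustration.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* read(2) that restarts itself when interrupted by a signal. */
static ssize_t xread_sketch(int fd, void *buf, size_t len)
{
	ssize_t nr;

	assert(buf != NULL);
	for (;;) {
		nr = read(fd, buf, len);
		if (nr < 0 && errno == EINTR)
			continue;	/* interrupted before any data: retry */
		return nr;
	}
}
```

Call sites then handle only real errors and short reads, instead of each repeating the errno == EINTR check.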
Martin Atukunda
252fef7149 define MAXPATHLEN for hosts that don't support it
[jc: Martin says syllable (www.syllable.org) wants this.]

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-14 14:26:44 -08:00
Junio C Hamano
4050c0df8e Clean up compatibility definitions.
This attempts to clean up the way various compatibility
functions are defined and used.

 - A new header file, git-compat-util.h, is introduced.  This
   looks at various NO_XXX and does necessary function name
   replacements, equivalent of -Dstrcasestr=gitstrcasestr in the
   Makefile.

 - Those function name replacements are removed from the Makefile.

 - Common features such as usage(), die(), xmalloc() are moved
   from cache.h to git-compat-util.h; cache.h includes
   git-compat-util.h itself.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-05 15:50:29 -08:00
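The NO_XXX handling described in this entry can be illustrated with strcasestr. In the sketch below, NO_STRCASESTR is defined by hand to simulate a platform without the function, and the body of gitstrcasestr() is a simple stand-in written for this example rather than the project's compat code.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Normally set by the build system on platforms lacking strcasestr(). */
#define NO_STRCASESTR 1

#ifdef NO_STRCASESTR
/* Header-side equivalent of passing -Dstrcasestr=gitstrcasestr. */
#define strcasestr gitstrcasestr
char *gitstrcasestr(const char *haystack, const char *needle);
#endif

/* Fallback: case-insensitive substring search. */
char *gitstrcasestr(const char *haystack, const char *needle)
{
	size_t hlen = strlen(haystack);
	size_t nlen = strlen(needle);
	size_t i, j;

	if (nlen > hlen)
		return NULL;
	for (i = 0; i <= hlen - nlen; i++) {
		for (j = 0; j < nlen; j++) {
			if (tolower((unsigned char)haystack[i + j]) !=
			    tolower((unsigned char)needle[j]))
				break;
		}
		if (j == nlen)
			return (char *)haystack + i;
	}
	return NULL;
}
```

Code that includes the header keeps calling strcasestr() by its usual name; the preprocessor redirects the call to the fallback only where NO_STRCASESTR is set.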