git-commit-vandalism

Author	SHA1	Message	Date
Joey Hess	cbacbf4e55	sha1_file: avoid bogus "file exists" error message This avoids the following misleading error message: error: unable to create temporary sha1 filename ./objects/15: File exists mkstemp can fail for many reasons, one of which, ENOENT, can occur if the directory for the temp file doesn't exist. create_tmpfile tried to handle this case by always trying to mkdir the directory, even if it already existed. This caused errno to be clobbered, so one cannot tell why mkstemp really failed, and it truncated the buffer to just the directory name, resulting in the strange error message shown above. Note that in both occasions that I've seen this failure, it has not been due to a missing directory, or bad permissions, but some other, unknown mkstemp failure mode that did not occur when I ran git again. This code could perhaps be made more robust by retrying mkstemp, in case it was a transient failure. Signed-off-by: Joey Hess <joey@kitenet.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-23 19:44:19 -08:00
Alex Riesen	f755bb996b	Fix handle leak in sha1_file/unpack_objects if there were damaged object data In the case of bad packed object CRC, unuse_pack wasn't called after check_pack_crc which calls use_pack. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Acked-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-23 19:31:05 -08:00
Junio C Hamano	47a792539a	Merge branch 'jk/commit-v-strip' * jk/commit-v-strip: status: show "-v" diff even for initial commit wt-status: refactor initial commit printing define empty tree sha1 as a macro	2008-11-16 00:48:59 -08:00
Junio C Hamano	7b51b77dbc	Merge branch 'np/pack-safer' * np/pack-safer: t5303: fix printf format string for portability t5303: work around printf breakage in dash pack-objects: don't leak pack window reference when splitting packs extend test coverage for latest pack corruption resilience improvements pack-objects: allow "fixing" a corrupted pack without a full repack make find_pack_revindex() aware of the nasty world make check_object() resilient to pack corruptions make packed_object_info() resilient to pack corruptions make unpack_object_header() non fatal better validation on delta base object offsets close another possibility for propagating pack corruption	2008-11-12 22:26:35 -08:00
Junio C Hamano	ecbbfb15a4	Merge branch 'bc/maint-keep-pack' * bc/maint-keep-pack: t7700: test that 'repack -a' packs alternate packed objects pack-objects: extend --local to mean ignore non-local loose objects too sha1_file.c: split has_loose_object() into local and non-local counterparts t7700: demonstrate mishandling of loose objects in an alternate ODB builtin-gc.c: use new pack_keep bitfield to detect .keep file existence repack: do not fall back to incremental repacking with [-a\|-A] repack: don't repack local objects in packs with .keep file pack-objects: new option --honor-pack-keep packed_git: convert pack_local flag into a bitfield and add pack_keep t7700: demonstrate mishandling of objects in packs with a .keep file	2008-11-12 22:00:43 -08:00
Jeff King	14d9c57896	define empty tree sha1 as a macro This can potentially be used in a few places, so let's make it available to all parts of the code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-12 12:52:21 -08:00
Brandon Casey	0f4dc14ac4	sha1_file.c: split has_loose_object() into local and non-local counterparts Signed-off-by: Brandon Casey <drafnel@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-12 10:29:22 -08:00
Brandon Casey	8d25931d6f	packed_git: convert pack_local flag into a bitfield and add pack_keep pack_keep will be set when a pack file has an associated .keep file. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-12 10:28:08 -08:00
Nicolas Pitre	08698b1e32	make find_pack_revindex() aware of the nasty world It currently calls die() whenever given offset is not found thinking that such thing should never happen. But this offset may come from a corrupted pack whych _could_ happen and not be found. Callers should deal with this possibility gracefully instead. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-02 15:22:35 -08:00
Nicolas Pitre	3d77d8774f	make packed_object_info() resilient to pack corruptions In the same spirit as commit `8eca0b47ff`, let's try to survive a pack corruption by making packed_object_info() able to fall back to alternate packs or loose objects. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-02 15:22:35 -08:00
Nicolas Pitre	09ded04b7e	make unpack_object_header() non fatal It is possible to have pack corruption in the object header. Currently unpack_object_header() simply die() on them instead of letting the caller deal with that gracefully. So let's have unpack_object_header() return an error instead, and find a better name for unpack_object_header_gently() in that context. All callers of unpack_object_header() are ready for it. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-02 15:22:34 -08:00
Nicolas Pitre	d8f325563d	better validation on delta base object offsets In one case, it was possible to have a bad offset equal to 0 effectively pointing a delta onto itself and crashing git after too many recursions. In the other cases, a negative offset could result due to off_t being signed. Catch those. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-02 15:22:34 -08:00
Nicolas Pitre	0e8189e270	close another possibility for propagating pack corruption Abstract -------- With index v2 we have a per object CRC to allow quick and safe reuse of pack data when repacking. This, however, doesn't currently prevent a stealth corruption from being propagated into a new pack when _not_ reusing pack data as demonstrated by the modification to t5302 included here. The Context ----------- The Git database is all checksummed with SHA1 hashes. Any kind of corruption can be confirmed by verifying this per object hash against corresponding data. However this can be costly to perform systematically and therefore this check is often not performed at run time when accessing the object database. First, the loose object format is entirely compressed with zlib which already provide a CRC verification of its own when inflating data. Any disk corruption would be caught already in this case. Then, packed objects are also compressed with zlib but only for their actual payload. The object headers and delta base references are not deflated for obvious performance reasons, however this leave them vulnerable to potentially undetected disk corruptions. Object types are often validated against the expected type when they're requested, and deflated size must always match the size recorded in the object header, so those cases are pretty much covered as well. Where corruptions could go unnoticed is in the delta base reference. Of course, in the OBJ_REF_DELTA case, the odds for a SHA1 reference to get corrupted so it actually matches the SHA1 of another object with the same size (the delta header stores the expected size of the base object to apply against) are virtually zero. In the OBJ_OFS_DELTA case, the reference is a pack offset which would have to match the start boundary of a different base object but still with the same size, and although this is relatively much more "probable" than in the OBJ_REF_DELTA case, the probability is also about zero in absolute terms. Still, the possibility exists as demonstrated in t5302 and is certainly greater than a SHA1 collision, especially in the OBJ_OFS_DELTA case which is now the default when repacking. Again, repacking by reusing existing pack data is OK since the per object CRC provided by index v2 guards against any such corruptions. What t5302 failed to test is a full repack in such case. The Solution ------------ As unlikely as this kind of stealth corruption can be in practice, it certainly isn't acceptable to propagate it into a freshly created pack. But, because this is so unlikely, we don't want to pay the run time cost associated with extra validation checks all the time either. Furthermore, consequences of such corruption in anything but repacking should be rather visible, and even if it could be quite unpleasant, it still has far less severe consequences than actively creating bad packs. So the best compromize is to check packed object CRC when unpacking objects, and only during the compression/writing phase of a repack, and only when not streaming the result. The cost of this is minimal (less than 1% CPU time), and visible only with a full repack. Someone with a stats background could provide an objective evaluation of this, but I suspect that it's bad RAM that has more potential for data corruptions at this point, even in those cases where this extra check is not performed. Still, it is best to prevent a known hole for corruption when recreating object data into a new pack. What about the streamed pack case? Well, any client receiving a pack must always consider that pack as untrusty and perform full validation anyway, hence no such stealth corruption could be propagated to remote repositoryes already. It is therefore worthless doing local validation in that case. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-02 15:22:15 -08:00
Junio C Hamano	581000a419	Merge branch 'jc/maint-co-track' into maint * jc/maint-co-track: Enhance hold_lock_file_for_{update,append}() API demonstrate breakage of detached checkout with symbolic link HEAD Fix "checkout --track -b newbranch" on detached HEAD	2008-11-02 13:36:14 -08:00
Junio C Hamano	a157400c97	Merge branch 'jc/maint-co-track' * jc/maint-co-track: Enhance hold_lock_file_for_{update,append}() API demonstrate breakage of detached checkout with symbolic link HEAD Fix "checkout --track -b newbranch" on detached HEAD Conflicts: builtin-commit.c	2008-10-21 17:58:11 -07:00
Junio C Hamano	acd3b9eca8	Enhance hold_lock_file_for_{update,append}() API This changes the "die_on_error" boolean parameter to a mere "flags", and changes the existing callers of hold_lock_file_for_update/append() functions to pass LOCK_DIE_ON_ERROR. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-10-19 12:35:37 -07:00
Junio C Hamano	58e0fa5416	Merge branch 'maint' * maint: Hopefully the final draft release notes update before 1.6.0.3 diff(1): clarify what "T"ypechange status means contrib: update packinfo.pl to not use dashed commands force_object_loose: Fix memory leak tests: shell negation portability fix	2008-10-18 08:26:44 -07:00
Björn Steinbrink	1fb23e6550	force_object_loose: Fix memory leak read_packed_sha1 expectes its caller to free the buffer it returns, which force_object_loose didn't do. This leak is eventually triggered by "git gc", when it is manually invoked or there are too many packs around, making gc totally unusable when there are lots of unreachable objects. Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> Acked-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-10-18 06:19:06 -07:00
Brandon Casey	f285a2d7ed	Replace calls to strbuf_init(&foo, 0) with STRBUF_INIT initializer Many call sites use strbuf_init(&foo, 0) to initialize local strbuf variable "foo" which has not been accessed since its declaration. These can be replaced with a static initialization using the STRBUF_INIT macro which is just as readable, saves a function call, and takes up fewer lines. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2008-10-12 12:36:19 -07:00
Miklos Vajna	749bc58c5e	Cleanup in sha1_file.c::cache_or_unpack_entry() This patch just removes an unnecessary goto which makes the code easier to read and shorter. Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2008-10-09 08:55:42 -07:00
Nicolas Pitre	9126f0091f	fix openssl headers conflicting with custom SHA1 implementations On ARM I have the following compilation errors: CC fast-import.o In file included from cache.h:8, from builtin.h:6, from fast-import.c:142: arm/sha1.h:14: error: conflicting types for 'SHA_CTX' /usr/include/openssl/sha.h:105: error: previous declaration of 'SHA_CTX' was here arm/sha1.h:16: error: conflicting types for 'SHA1_Init' /usr/include/openssl/sha.h:115: error: previous declaration of 'SHA1_Init' was here arm/sha1.h:17: error: conflicting types for 'SHA1_Update' /usr/include/openssl/sha.h:116: error: previous declaration of 'SHA1_Update' was here arm/sha1.h:18: error: conflicting types for 'SHA1_Final' /usr/include/openssl/sha.h:117: error: previous declaration of 'SHA1_Final' was here make: *** [fast-import.o] Error 1 This is because openssl header files are always included in git-compat-util.h since commit `684ec6c63c` whenever NO_OPENSSL is not set, which somehow brings in <openssl/sha1.h> clashing with the custom ARM version. Compilation of git is probably broken on PPC too for the same reason. Turns out that the only file requiring openssl/ssl.h and openssl/err.h is imap-send.c. But only moving those problematic includes there doesn't solve the issue as it also includes cache.h which brings in the conflicting local SHA1 header file. As suggested by Jeff King, the best solution is to rename our references to SHA1 functions and structure to something git specific, and define those according to the implementation used. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2008-10-02 18:06:56 -07:00
Shawn O. Pearce	1ad6d46235	Merge branch 'jc/alternate-push' * jc/alternate-push: push: receiver end advertises refs from alternate repositories push: prepare sender to receive extended ref information from the receiver receive-pack: make it a builtin is_directory(): a generic helper function	2008-09-25 09:39:24 -07:00
Shawn O. Pearce	58245a5e40	Merge branch 'jc/safe-c-l-d' * jc/safe-c-l-d: safe_create_leading_directories(): make it about "leading" directories	2008-09-25 08:50:01 -07:00
Junio C Hamano	3791f77c28	Merge branch 'maint' * maint: sha1_file: link() returns -1 on failure, not errno Make git archive respect core.autocrlf when creating zip format archives Add new test to demonstrate git archive core.autocrlf inconsistency gitweb: avoid warnings for commits without body Clarified gitattributes documentation regarding custom hunk header. git-svn: fix handling of even funkier branch names git-svn: Always create a new RA when calling do_switch for svn:// git-svn: factor out svnserve test code for later use diff/diff-files: do not use --cc too aggressively	2008-09-18 20:30:12 -07:00
Thomas Rast	e32c0a9c38	sha1_file: link() returns -1 on failure, not errno `5723fe7` (Avoid cross-directory renames and linking on object creation, 2008-06-14) changed the call to use link() directly instead of through a custom wrapper, but forgot that it returns 0 or -1, not 0 or errno. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-09-18 19:51:13 -07:00
Junio C Hamano	d79796bcf0	push: receiver end advertises refs from alternate repositories Earlier, when pushing into a repository that borrows from alternate object stores, we followed the longstanding design decision not to trust refs in the alternate repository that houses the object store we are borrowing from. If your public repository is borrowing from Linus's public repository, you pushed into it long time ago, and now when you try to push your updated history that is in sync with more recent history from Linus, you will end up sending not just your own development, but also the changes you acquired through Linus's tree, even though the objects needed for the latter already exists at the receiving end. This is because the receiving end does not advertise that the objects only reachable from the borrowed repository (i.e. Linus's) are already available there. This solves the issue by making the receiving end advertise refs from borrowed repositories. They are not sent with their true names but with a phoney name ".have" to make sure that the old senders will safely ignore them (otherwise, the old senders will misbehave, trying to push matching refs, and mirror push that deletes refs that only exist at the receiving end). Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-09-09 09:27:46 -07:00
Junio C Hamano	90b4a71c49	is_directory(): a generic helper function A simple "grep -e stat --and -e S_ISDIR" revealed there are many open-coded implementations of this function. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-09-09 09:27:45 -07:00
Junio C Hamano	5f0bdf50c2	safe_create_leading_directories(): make it about "leading" directories We used to allow callers to pass "foo/bar/" to make sure both "foo" and "foo/bar" exist and have good permissions, but this interface is too error prone. If a caller mistakenly passes a path with trailing slashes (perhaps it forgot to verify the user input) even when it wants to later mkdir "bar" itself, it will find that it cannot mkdir "bar". If such a caller does not bother to check the error for EEXIST, it may even errorneously die(). Because we have no existing callers to use that obscure feature, this patch removes it to avoid confusion. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-09-03 22:35:32 -07:00
Junio C Hamano	5a1e8707a6	Merge branch 'np/verify-pack' * np/verify-pack: discard revindex data when pack list changes	2008-08-27 16:39:46 -07:00
Nicolas Pitre	4b480c6716	discard revindex data when pack list changes This is needed to fix verify-pack -v with multiple pack arguments. Also, in theory, revindex data (if any) must be discarded whenever reprepare_packed_git() is called. In practice this is hard to trigger though. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-08-22 22:00:22 -07:00
Junio C Hamano	d8eec50468	Merge branch 'dp/hash-literally' * dp/hash-literally: add --no-filters option to git hash-object add --path option to git hash-object use parse_options() in git hash-object correct usage help string for git-hash-object correct argument checking test for git hash-object teach index_fd to work with pipes	2008-08-19 21:43:25 -07:00
Steven Grimm	ddd63e64e4	Optimize sha1_object_info for loose objects, not concurrent repacks When dealing with a repository with lots of loose objects, sha1_object_info would rescan the packs directory every time an unpacked object was referenced before finally giving up and looking for the loose object. This caused a lot of extra unnecessary system calls during git pack-objects; the code was rereading the entire pack directory once for each loose object file. This patch looks for a loose object before falling back to rescanning the pack directory, rather than the other way around. Signed-off-by: Steven Grimm <koreth@midwinter.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-08-05 21:21:20 -07:00
Dmitry Potapov	43df4f86e0	teach index_fd to work with pipes index_fd can now work with file descriptors that are not normal files but any readable file. If the given file descriptor is a regular file then mmap() is used; for other files, strbuf_read is used. The path parameter, which has been used as hint for filters, can be NULL now to indicate that the file should be hashed literally without any filter. The index_pipe function is removed as redundant. Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-08-03 13:14:35 -07:00
Nicolas Pitre	ac9391093f	restore legacy behavior for read_sha1_file() Since commit `8eca0b47ff`, it is possible for read_sha1_file() to return NULL even with existing objects when they are corrupted. Previously a corrupted object would have terminated the program immediately, effectively making read_sha1_file() return NULL only when specified object is not found. Let's restore this behavior for all users of read_sha1_file() and provide a separate function with the ability to not terminate when bad objects are encountered. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-07-14 23:35:32 -07:00
Junio C Hamano	948e7471e0	Merge branch 'sp/maint-pack-memuse' * sp/maint-pack-memuse: Correct pack memory leak causing git gc to try to exceed ulimit Conflicts: sha1_file.c	2008-07-09 14:46:46 -07:00
Shawn O. Pearce	eac12e2d4d	Correct pack memory leak causing git gc to try to exceed ulimit When recursing to unpack a delta base we must unuse_pack() so that the pack window for the current object does not remain pinned in memory while the delta base is itself being unpacked and materialized for our use. On a long delta chain of 50 objects we may need to access 6 different windows from a very large (>3G) pack file in order to obtain all of the delta base content. If the process ulimit permits us to map/allocate only 1.5G we must release windows during this recursion to ensure we stay within the ulimit and transition memory from pack cache to standard malloc, or other mmap needs. Inserting an unuse_pack() call prior to the recursion allows us to avoid pinning the current window, making it available for garbage collection if memory runs low. This has been broken since at least before 1.5.1-rc1, and very likely earlier than that. Its fixed now. :) Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-07-09 14:45:42 -07:00
Ramsay Jones	6e1c23442a	Fix some warnings (on cygwin) to allow -Werror When printing valuds of type uint32_t, we should use PRIu32, and should not assume that it is unsigned int. On 32-bit platforms, it could be defined as unsigned long. The same caution applies to ntohl(). Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-07-05 17:26:29 -07:00
Junio C Hamano	bb1ab2db08	Merge branch 'j6t/mingw' * j6t/mingw: (38 commits) compat/pread.c: Add a forward declaration to fix a warning Windows: Fix ntohl() related warnings about printf formatting Windows: TMP and TEMP environment variables specify a temporary directory. Windows: Make 'git help -a' work. Windows: Work around an oddity when a pipe with no reader is written to. Windows: Make the pager work. When installing, be prepared that template_dir may be relative. Windows: Use a relative default template_dir and ETC_GITCONFIG Windows: Compute the fallback for exec_path from the program invocation. Turn builtin_exec_path into a function. Windows: Use a customized struct stat that also has the st_blocks member. Windows: Add a custom implementation for utime(). Windows: Add a new lstat and fstat implementation based on Win32 API. Windows: Implement a custom spawnve(). Windows: Implement wrappers for gethostbyname(), socket(), and connect(). Windows: Work around incompatible sort and find. Windows: Implement asynchronous functions as threads. Windows: Disambiguate DOS style paths from SSH URLs. Windows: A rudimentary poll() emulation. Windows: Implement start_command(). ...	2008-07-02 21:57:52 -07:00
Junio C Hamano	abf7e0df17	Merge branch 'lt/config-fsync' * lt/config-fsync: Add config option to enable 'fsync()' of object files Split up default "i18n" and "branch" config parsing into helper routines Split up default "user" config parsing into helper routine Split up default "core" config parsing into helper routine	2008-06-25 13:19:49 -07:00
Jeff King	2beebd22f4	clone: create intermediate directories of destination repo The shell version used to use "mkdir -p" to create the repo path, but the C version just calls "mkdir". Let's replicate the old behavior. We have to create the git and worktree leading dirs separately; while most of the time, the worktree dir contains the git dir (as .git), the user can override this using GIT_WORK_TREE. We can reuse safe_create_leading_directories, but we need to make a copy of our const buffer to do so. Since merge-recursive uses the same pattern, we can factor this out into a global function. This has two other cleanup advantages for merge-recursive: 1. mkdir_p wasn't a very good name. "mkdir -p foo/bar" actually creates bar, but this function just creates the leading directories. 2. mkdir_p took a mode argument, but it was completely ignored. Acked-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-25 11:44:15 -07:00
Nicolas Pitre	99093238bb	optimize verify-pack a bit Using find_pack_entry_one() to get object offsets is rather suboptimal when nth_packed_object_offset() can be used directly. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-24 23:58:57 -07:00
Jeff King	8e21d63b02	clone: create intermediate directories of destination repo The shell version used to use "mkdir -p" to create the repo path, but the C version just calls "mkdir". Let's replicate the old behavior. We have to create the git and worktree leading dirs separately; while most of the time, the worktree dir contains the git dir (as .git), the user can override this using GIT_WORK_TREE. We can reuse safe_create_leading_directories, but we need to make a copy of our const buffer to do so. Since merge-recursive uses the same pattern, we can factor this out into a global function. This has two other cleanup advantages for merge-recursive: 1. mkdir_p wasn't a very good name. "mkdir -p foo/bar" actually creates bar, but this function just creates the leading directories. 2. mkdir_p took a mode argument, but it was completely ignored. Acked-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-24 23:23:21 -07:00
Nicolas Pitre	27d69a465d	refactor pack structure allocation New pack structures are currently allocated in 2 different places and all members have to be initialized explicitly. This is prone to errors leading to segmentation faults as found by Teemu Likonen. Let's have a common place where this structure is allocated, and have all members explicitly initialized to zero. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-24 17:03:44 -07:00
Nicolas Pitre	8eca0b47ff	implement some resilience against pack corruptions We should be able to fall back to loose objects or alternative packs when a pack becomes corrupted. This is especially true when an object exists in one pack only as a delta but its base object is corrupted. Currently there is no way to retrieve the former object even if the later is available in another pack or loose. This patch allows for a delta to be resolved (with a performance cost) using a base object from a source other than the pack where that delta is located. Same thing for non-delta objects: rather than failing outright, a search is made in other packs or used loose when the currently active pack has it but corrupted. Of course git will become extremely noisy with error messages when that happens. However, if the operation succeeds nevertheless, a simple 'git repack -a -f -d' will "fix" the corrupted repository given that all corrupted objects have a good duplicate somewhere in the object store, possibly manually copied from another source. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-23 21:29:33 -07:00
Patrick Higgins	6ff6af62ec	Workaround for AIX mkstemp() The AIX mkstemp will modify it's template parameter to an empty string if the call fails. This caused a subsequent mkdir to fail. Signed-off-by: Patrick Higgins <patrick.higgins@cexp.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-23 16:13:38 -07:00
Johannes Sixt	8385abfda5	Windows: Handle absolute paths in safe_create_leading_directories(). In this function we must be careful to handle drive-local paths else there is a danger that it runs into an infinite loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>	2008-06-23 13:30:27 +02:00
Johannes Sixt	80ba074f41	Windows: Use the Windows style PATH separator ';'. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>	2008-06-22 11:32:45 +02:00
Linus Torvalds	aafe9fbaf4	Add config option to enable 'fsync()' of object files As explained in the documentation[] this is totally useless on filesystems that do ordered/journalled data writes, but it can be a useful safety feature on filesystems like HFS+ that only journal the metadata, not the actual file contents. It defaults to off, although we could presumably in theory some day auto-enable it on a per-filesystem basis. [] Yes, I updated the docs for the thing. Hell really _has_ frozen over, and the four horsemen are probably just beyond the horizon. EVERYBODY PANIC! Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-18 16:50:35 -07:00
Junio C Hamano	79c6dca413	sha1_file.c: simplify parse_pack_index() It was implemented as a thin wrapper around an otherwise unused helper function parse_pack_index_file(). The code becomes simpler and easier to read by consolidating the two. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-16 22:19:00 -07:00
Junio C Hamano	3bfaf01857	create_tempfile: make sure that leading directories can be accessible by peers In a shared repository, we should make sure adjust_shared_perm() is called after creating the initial fan-out directories under objects/ directory. Earlier an logico called the function only when mkdir() failed; we should do so when mkdir() succeeded. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-16 22:02:12 -07:00
Linus Torvalds	1421c5f274	write_loose_object: don't bother trying to read an old object Before even calling this, all callers have done a "has_sha1_file(sha1)" or "has_loose_object(sha1)" check, so there is no point in doing a second check. If something races with us on object creation, we handle that in the final link() that moves it to the right place. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-16 21:46:47 -07:00
Linus Torvalds	c529d75a75	Simplify and rename find_sha1_file() Now that we've made the loose SHA1 file reading more careful and streamlined, we only use the old find_sha1_file() function for checking whether a loose object file exists at all. As such, the whole 'return stat information' part of it was just pointless (nobody cares any more), and the naming of the function is not really all that relevant either. So simplify it to not do a 'stat()', but just an existence check (which is what the callers want), and rename it to 'has_loose_object()' which matches the use. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-14 14:39:22 -07:00
Linus Torvalds	44d1c19ee8	Make loose object file reading more careful We used to do 'stat()+open()+mmap()+close()' to read the loose object file data, which does work fine, but has a couple of problems: - it unnecessarily walks the filename twice (at 'stat()' time and then again to open it) - NFS generally has open-close consistency guarantees, which means that the initial 'stat()' was technically done outside of the normal consistency rules. So change it to do 'open()+fstat()+mmap()+close()' instead, which avoids both these issues. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-14 14:39:22 -07:00
Linus Torvalds	5723fe7e3c	Avoid cross-directory renames and linking on object creation Instead of creating new temporary objects in the top-level git object directory, create them in the same directory they will finally end up in anyway. This avoids making the final atomic "rename to stable name" operation be a cross-directory event, which makes it a lot easier for various filesystems. Several filesystems do things like change the inode number when moving files across directories (or refuse to do it entirely). In particular, it can also cause problems for NFS implementations that change the filehandle of a file when it moves to a different directory, like the old user-space NFS server did, and like the Linux knfsd still does if you don't export your filesystems with 'no_subtree_check' or if you export a filesystem that doesn't have stable inode numbers across renames). This change also obviously implies creating the object fan-out subdirectory at tempfile creation time, rather than at the final move_temp_to_file() time. Which actually accounts for most of the size of the patch. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-14 14:39:22 -07:00
Junio C Hamano	6483925999	sha1_file.c: dead code removal write_sha1_from_fd() and write_sha1_to_fd() were dead code nobody called, neither the latter's helper repack_object() was. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-13 23:00:51 -07:00
Linus Torvalds	e9039dd351	Consolidate SHA1 object file close This consolidates the common operations for closing the new temporary file that we have written, before we move it into place with the final name. There's some common code there (make it read-only and check for errors on close), but more importantly, this also gives a single place to add an fsync_or_die() call if we want to add a safe mode. This was triggered due to Denis Bueno apparently twice being able to corrupt his git repository on OS X due to an unlucky combination of kernel crashes and a not-very-robust filesystem. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-10 22:23:18 -07:00
Junio C Hamano	6eec46bdda	fix sha1_pack_index_name() An earlier commit `633f43e` (Remove redundant code, eliminate one static variable, 2008-05-24) had a thinko (perhaps an eyeno) that broke sha1_pack_index_name() function. One symptom of this was that the http walker is now completely broken. This should fix it. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-05-28 10:24:32 -07:00
Junio C Hamano	b84c343c88	Merge branch 'db/clone-in-c' * db/clone-in-c: Add test for cloning with "--reference" repo being a subset of source repo Add a test for another combination of --reference Test that --reference actually suppresses fetching referenced objects clone: fall back to copying if hardlinking fails builtin-clone.c: Need to closedir() in copy_or_link_directory() builtin-clone: fix initial checkout Build in clone Provide API access to init_db() Add a function to set a non-default work tree Allow for having for_each_ref() list extra refs Have a constant extern refspec for "--tags" Add a library function to add an alternate to the alternates file Add a lockfile function to append to a file Mark the list of refs to fetch as const Conflicts: cache.h t/t5700-clone-reference.sh	2008-05-25 13:41:37 -07:00
Heikki Orsila	633f43e1f7	Remove redundant code, eliminate one static variable Signed-off-by: Heikki Orsila <heikki.orsila@iki.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-05-24 22:05:06 -07:00
Nicolas Pitre	bbac73117e	add a force_object_loose() function This is meant to force the creation of a loose object even if it already exists packed. Needed for the next commit. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-05-13 22:42:33 -07:00
Daniel Barkalow	bef70b22ba	Add a library function to add an alternate to the alternates file This is in the core so that, if the alternates file has already been read, the addition can be parsed and put into effect for the current process. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-05-04 17:41:44 -07:00
Heikki Orsila	c697ad143b	Cleanup xread() loops to use read_in_full() Signed-off-by: Heikki Orsila <heikki.orsila@iki.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-05-03 22:15:25 -07:00
Junio C Hamano	628522ec14	sha1-lookup: more memory efficient search in sorted list of SHA-1 Currently, when looking for a packed object from the pack idx, a simple binary search is used. A conventional binary search loop looks like this: unsigned lo, hi; do { unsigned mi = (lo + hi) / 2; int cmp = "entry pointed at by mi" minus "target"; if (!cmp) return mi; "mi is the wanted one" if (cmp > 0) hi = mi; "mi is larger than target" else lo = mi+1; "mi is smaller than target" } while (lo < hi); "did not find what we wanted" The invariants are: - When entering the loop, 'lo' points at a slot that is never above the target (it could be at the target), 'hi' points at a slot that is guaranteed to be above the target (it can never be at the target). - We find a point 'mi' between 'lo' and 'hi' ('mi' could be the same as 'lo', but never can be as high as 'hi'), and check if 'mi' hits the target. There are three cases: - if it is a hit, we have found what we are looking for; - if it is strictly higher than the target, we set it to 'hi', and repeat the search. - if it is strictly lower than the target, we update 'lo' to one slot after it, because we allow 'lo' to be at the target and 'mi' is known to be below the target. If the loop exits, there is no matching entry. When choosing 'mi', we do not have to take the "middle" but anywhere in between 'lo' and 'hi', as long as lo <= mi < hi is satisfied. When we somehow know that the distance between the target and 'lo' is much shorter than the target and 'hi', we could pick 'mi' that is much closer to 'lo' than (hi+lo)/2, which a conventional binary search would pick. This patch takes advantage of the fact that the SHA-1 is a good hash function, and as long as there are enough entries in the table, we can expect uniform distribution. An entry that begins with for example "deadbeef..." is much likely to appear much later than in the midway of a reasonably populated table. In fact, it can be expected to be near 87% (222/256) from the top of the table. This is a work-in-progress and has switches to allow easier experiments and debugging. Exporting GIT_USE_LOOKUP environment variable enables this code. On my admittedly memory starved machine, with a partial KDE repository (3.0G pack with 95M idx): $ GIT_USE_LOOKUP=t git log -800 --stat HEAD >/dev/null 3.93user 0.16system 0:04.09elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+55588minor)pagefaults 0swaps Without the patch, the numbers are: $ git log -800 --stat HEAD >/dev/null 4.00user 0.15system 0:04.17elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+60258minor)pagefaults 0swaps In the same repository: $ GIT_USE_LOOKUP=t git log -2000 HEAD >/dev/null 0.12user 0.00system 0:00.12elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+4241minor)pagefaults 0swaps Without the patch, the numbers are: $ git log -2000 HEAD >/dev/null 0.05user 0.01system 0:00.07elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+8506minor)pagefaults 0swaps There isn't much time difference, but the number of minor faults seems to show that we are touching much smaller number of pages, which is expected. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-04-09 01:23:52 -07:00
Nicolas Pitre	70f5d5d31c	fix unimplemented packed_object_info_detail() features Since commit `eb32d236df`, there was a TODO comment in packed_object_info_detail() about the SHA1 of base object to OBJ_OFS_DELTA objects. So here it is at last. While at it, providing the actual storage size information as well is now trivial. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-03-01 01:44:46 -08:00
Junio C Hamano	c484166374	Merge branch 'jk/empty-tree' * jk/empty-tree: add--interactive: handle initial commit better hard-code the empty tree object	2008-02-20 16:13:28 -08:00
Junio C Hamano	ee4f06c0a6	Merge branch 'mk/maint-parse-careful' * mk/maint-parse-careful: peel_onion: handle NULL check return value from parse_commit() in various functions parse_commit: don't fail, if object is NULL revision.c: handle tag->tagged == NULL reachable.c::process_tree/blob: check for NULL process_tag: handle tag->tagged == NULL check results of parse_commit in merge_bases list-objects.c::process_tree/blob: check for NULL reachable.c::add_one_tree: handle NULL from lookup_tree mark_blob/tree_uninteresting: check for NULL get_sha1_oneline: check return value of parse_object read_object_with_reference: don't read beyond the buffer	2008-02-18 20:56:01 -08:00
Martin Koegler	50974ec994	read_object_with_reference: don't read beyond the buffer Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-18 19:20:17 -08:00
Jeff King	346245a1bb	hard-code the empty tree object Now any commands may reference the empty tree object by its sha1 (`4b825dc642`). This is useful for showing some diffs, especially for initial commits. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-13 13:44:17 -08:00
Steffen Prohaska	21e5ad50fc	safecrlf: Add mechanism to warn about irreversible crlf conversions CRLF conversion bears a slight chance of corrupting data. autocrlf=true will convert CRLF to LF during commit and LF to CRLF during checkout. A file that contains a mixture of LF and CRLF before the commit cannot be recreated by git. For text files this is the right thing to do: it corrects line endings such that we have only LF line endings in the repository. But for binary files that are accidentally classified as text the conversion can corrupt data. If you recognize such corruption early you can easily fix it by setting the conversion type explicitly in .gitattributes. Right after committing you still have the original file in your work tree and this file is not yet corrupted. You can explicitly tell git that this file is binary and git will handle the file appropriately. Unfortunately, the desired effect of cleaning up text files with mixed line endings and the undesired effect of corrupting binary files cannot be distinguished. In both cases CRLFs are removed in an irreversible way. For text files this is the right thing to do because CRLFs are line endings, while for binary files converting CRLFs corrupts data. This patch adds a mechanism that can either warn the user about an irreversible conversion or can even refuse to convert. The mechanism is controlled by the variable core.safecrlf, with the following values: - false: disable safecrlf mechanism - warn: warn about irreversible conversions - true: refuse irreversible conversions The default is to warn. Users are only affected by this default if core.autocrlf is set. But the current default of git is to leave core.autocrlf unset, so users will not see warnings unless they deliberately chose to activate the autocrlf mechanism. The safecrlf mechanism's details depend on the git command. The general principles when safecrlf is active (not false) are: - we warn/error out if files in the work tree can modified in an irreversible way without giving the user a chance to backup the original file. - for read-only operations that do not modify files in the work tree we do not not print annoying warnings. There are exceptions. Even though... - "git add" itself does not touch the files in the work tree, the next checkout would, so the safety triggers; - "git apply" to update a text file with a patch does touch the files in the work tree, but the operation is about text files and CRLF conversion is about fixing the line ending inconsistencies, so the safety does not trigger; - "git diff" itself does not touch the files in the work tree, it is often run to inspect the changes you intend to next "git add". To catch potential problems early, safety triggers. The concept of a safety check was originally proposed in a similar way by Linus Torvalds. Thanks to Dimitry Potapov for insisting on getting the naked LF/autocrlf=true case right. Signed-off-by: Steffen Prohaska <prohaska@zib.de>	2008-02-06 13:07:28 -08:00
Shawn O. Pearce	c9ced051c3	Fix random fast-import errors when compiled with NO_MMAP fast-import was relying on the fact that on most systems mmap() and write() are synchronized by the filesystem's buffer cache. We were relying on the ability to mmap() 20 bytes beyond the current end of the file, then later fill in those bytes with a future write() call, then read them through the previously obtained mmap() address. This isn't always true with some implementations of NFS, but it is especially not true with our NO_MMAP=YesPlease build time option used on some platforms. If fast-import was built with NO_MMAP=YesPlease we used the malloc()+pread() emulation and the subsequent write() call does not update the trailing 20 bytes of a previously obtained "mmap()" (aka malloc'd) address. Under NO_MMAP that behavior causes unpack_entry() in sha1_file.c to be unable to read an object header (or data) that has been unlucky enough to be written to the packfile at a location such that it is in the trailing 20 bytes of a window previously opened on that same packfile. This bug has gone unnoticed for a very long time as it is highly data dependent. Not only does the object have to be placed at the right position, but it also needs to be positioned behind some other object that has been accessed due to a branch cache invalidation. In other words the stars had to align just right, and if you did run into this bug you probably should also have purchased a lottery ticket. Fortunately the workaround is a lot easier than the bug explanation. Before we allow unpack_entry() to read data from a pack window that has also (possibly) been modified through write() we force all existing windows on that packfile to be closed. By closing the windows we ensure that any new access via the emulated mmap() will reread the packfile, updating to the current file content. This comes at a slight performance degredation as we cannot reuse previously cached windows when we update the packfile. But it is a fairly minor difference as the window closes happen at only two points: - When the packfile is finalized and its .idx is generated: At this stage we are getting ready to update the refs and any data access into the packfile is going to be random, and is going after only the branch tips (to ensure they are valid). Our existing windows (if any) are not likely to be positioned at useful locations to access those final tip commits so we probably were closing them before anyway. - When the branch cache missed and we need to reload: At this point fast-import is getting change commands for the next commit and it needs to go re-read a tree object it previously had written out to the packfile. What windows we had (if any) are not likely to cover the tree in question so we probably were closing them before anyway. We do try to avoid unnecessarily closing windows in the second case by checking to see if the packfile size has increased since the last time we called unpack_entry() on that packfile. If the size has not changed then we have not written additional data, and any existing window is still vaild. This nicely handles the cases where fast-import is going through a branch cache reload and needs to read many trees at once. During such an event we are not likely to be updating the packfile so we do not cycle the windows between reads. With this change in place t9301-fast-export.sh (which was broken by `c3b0dec509`) finally works again. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-01-17 22:39:20 -08:00
Jim Meyering	790296fd88	Fix grammar nits in documentation and in code comments. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-01-03 09:15:17 -08:00
Steffen Prohaska	9e42d6a1c5	sha1_file.c: Fix size_t related printf format warnings The old way of fixing warnings did not succeed on MinGW. MinGW does not support C99 printf format strings for size_t [1]. But gcc on MinGW issues warnings if C99 printf format is not used. Hence, the old stragegy to avoid warnings fails. [1] http://www.mingw.org/MinGWiki/index.php/C99 This commits passes arguments of type size_t through a tiny helper functions that casts to the type expected by the format string. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-11-28 16:03:38 -08:00
Johannes Sixt	85dadc3894	Use is_absolute_path() in sha1_file.c. There are some places that test for an absolute path. Use the helper function to ease porting. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-11-14 15:18:39 -08:00
Junio C Hamano	e2b7eaf0ca	Merge branch 'maint' * maint: RelNotes-1.5.3.5: describe recent fixes merge-recursive.c: mrtree in merge() is not used before set sha1_file.c: avoid gcc signed overflow warnings Fix a small memory leak in builtin-add honor the http.sslVerify option in shell scripts	2007-10-29 12:53:54 -07:00
Junio C Hamano	7109c889f1	sha1_file.c: avoid gcc signed overflow warnings With the recent gcc, we get: sha1_file.c: In check_packed_git_: sha1_file.c:527: warning: assuming signed overflow does not occur when assuming that (X + c) < X is always false sha1_file.c:527: warning: assuming signed overflow does not occur when assuming that (X + c) < X is always false for a piece of code that tries to make sure that off_t is large enough to hold more than 2^32 offset. The test tried to make sure these do not wrap-around: /* make sure we can deal with large pack offsets */ off_t x = 0x7fffffffUL, y = 0xffffffffUL; if (x > (x + 1) \|\| y > (y + 1)) { but gcc assumes it can do whatever optimization it wants for a signed overflow (undefined behaviour) and warns about this construct. Follow Linus's suggestion to check sizeof(off_t) instead to work around the problem. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-29 11:56:57 -07:00
Junio C Hamano	66d4035e10	Merge branch 'ph/strbuf' * ph/strbuf: (44 commits) Make read_patch_file work on a strbuf. strbuf_read_file enhancement, and use it. strbuf change: be sure ->buf is never ever NULL. double free in builtin-update-index.c Clean up stripspace a bit, use strbuf even more. Add strbuf_read_file(). rerere: Fix use of an empty strbuf.buf Small cache_tree_write refactor. Make builtin-rerere use of strbuf nicer and more efficient. Add strbuf_cmp. strbuf_setlen(): do not barf on setting length of an empty buffer to 0 sq_quote_argv and add_to_string rework with strbuf's. Full rework of quote_c_style and write_name_quoted. Rework unquote_c_style to work on a strbuf. strbuf API additions and enhancements. nfv?asprintf are broken without va_copy, workaround them. Fix the expansion pattern of the pseudo-static path buffer. builtin-for-each-ref.c::copy_name() - do not overstep the buffer. builtin-apply.c: fix a tiny leak introduced during xmemdupz() conversion. Use xmemdupz() in many places. ...	2007-10-03 03:06:02 -07:00
Pierre Habouzit	b315c5c081	strbuf change: be sure ->buf is never ever NULL. For that purpose, the ->buf is always initialized with a char * buf living in the strbuf module. It is made a char * so that we can sloppily accept things that perform: sb->buf[0] = '\0', and because you can't pass "" as an initializer for ->buf without making gcc unhappy for very good reasons. strbuf_init/_detach/_grow have been fixed to trust ->alloc and not ->buf anymore. as a consequence strbuf_detach is _mandatory_ to detach a buffer, copying ->buf isn't an option anymore, if ->buf is going to escape from the scope, and eventually be free'd. API changes: * strbuf_setlen now always works, so just make strbuf_reset a convenience macro. * strbuf_detatch takes a size_t* optional argument (meaning it can be NULL) to copy the buffer's len, as it was needed for this refactor to make the code more readable, and working like the callers. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-09-29 02:13:33 -07:00
Pierre Habouzit	182af8343c	Use xmemdupz() in many places. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-09-18 17:42:17 -07:00
Junio C Hamano	000dfd3f6e	Export matches_pack_name() and fix its return value The function sounds boolean; make it behave as one, not "0 for success, non-zero for failure". Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-09-17 12:25:26 -07:00
Pierre Habouzit	ba3ed09728	Now that cache.h needs strbuf.h, remove useless includes. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-09-16 17:30:03 -07:00
Pierre Habouzit	5ecd293d14	Rewrite convert_to_{git,working_tree} to use strbuf's. * Now, those functions take an "out" strbuf argument, where they store their result if any. In that case, it also returns 1, else it returns 0. * those functions support "in place" editing, in the sense that it's OK to call them this way: convert_to_git(path, sb->buf, sb->len, sb); When doable, conversions are done in place for real, else the strbuf content is just replaced with the new one, transparentely for the caller. If you want to create a new filter working this way, being the accumulation of filter1, filter2, ... filtern, then your meta_filter would be: int meta_filter(..., const char src, size_t len, struct strbuf sb) { int ret = 0; ret \|= filter1(...., src, len, sb); if (ret) { src = sb->buf; len = sb->len; } ret \|= filter2(...., src, len, sb); if (ret) { src = sb->buf; len = sb->len; } .... return ret \| filtern(..., src, len, sb); } That's why subfilters the convert_to_* functions called were also rewritten to work this way. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-09-16 17:30:03 -07:00
Pierre Habouzit	fd17f5b5f7	Replace all read_fd use with strbuf_read, and get rid of it. This brings builtin-stripspace, builtin-tag and mktag to use strbufs. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-09-10 12:50:58 -07:00
Shawn O. Pearce	9064d87b06	Don't segfault if we failed to inflate a packed delta Under some types of packfile corruption the zlib stream holding the data for a delta within a packfile may fail to inflate, due to say a CRC failure within the compressed data itself. When this occurs the unpack_compressed_entry function will return NULL as a signal to the caller that the data is not available. Unfortunately we then tried to use that NULL as though it referenced a memory location where a delta was stored and tried to apply it to the delta base. Loading a byte from the NULL address typically causes a SIGSEGV. cate on #git noticed this failure in `git fsck --full` where the call to verify_pack() first noticed that the packfile was corrupt by finding that the packfile's SHA-1 did not match the raw data of the file. After finding this fsck went ahead and tried to verify every object within the packfile, even though the packfile was already known to be bad. If we are going to shovel bad data at the delta unpacking code, we better handle it correctly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-08-25 08:33:47 -07:00
Luiz Fernando N. Capitulino	eef427a09c	Avoid ambiguous error message if pack.idx header is wrong Print the index version when an error occurs so the user knows what type of header (and size) we thought the index should have had. Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-08-14 22:20:13 -07:00
Carlos Rica	c4fba0a358	Rename read_pipe() with read_fd() and make its buffer nul-terminated. The new name is closer to the purpose of the function. A NUL-terminated buffer makes things easier when callers need that. Since the function returns only the memory written with data, almost always allocating more space than needed because final size is unknown, an extra NUL terminating the buffer is harmless. It is not included in the returned size, so the function remains working as before. Also, now the function allows the buffer passed to be NULL at first, and alloc_nr is now used for growing the buffer, instead size=*2. Signed-off-by: Carlos Rica <jasampler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-07-18 17:30:03 -07:00
Junio C Hamano	e2b1accc59	Merge branch 'maint' * maint: Document -<n> for git-format-patch glossary: add 'reflog' diff --no-index: fix --name-status with added files Don't smash stack when $GIT_ALTERNATE_OBJECT_DIRECTORIES is too long	2007-07-03 22:56:59 -07:00
Jim Meyering	9cb18f56fd	Don't smash stack when $GIT_ALTERNATE_OBJECT_DIRECTORIES is too long There is no restriction on the length of the name returned by get_object_directory, other than the fact that it must be a stat'able git object directory. That means its name may have length up to PATH_MAX-1 (i.e., often 4095) not counting the trailing NUL. Combine that with the assumption that the concatenation of that name and suffixes like "/info/alternates" and "/pack/---long-name---.idx" will fit in a buffer of length PATH_MAX, and you see the problem. Here's a fix: sha1_file.c (prepare_packed_git_one): Lengthen "path" buffer so we are guaranteed to be able to append "/pack/" without checking. Skip any directory entry that is too long to be appended. (read_info_alternates): Protect against a similar buffer overrun. Before this change, using the following admittedly contrived environment setting would cause many git commands to clobber their stack and segfault on a system with PATH_MAX == 4096: t=$(perl -e '$s=".git/objects";$n=(4096-6-length($s))/2;print "./"x$n . $s') export GIT_ALTERNATE_OBJECT_DIRECTORIES=$t touch g ./git-update-index --add g If you run the above commands, you'll soon notice that many git commands now segfault, so you'll want to do this: unset GIT_ALTERNATE_OBJECT_DIRECTORIES Signed-off-by: Jim Meyering <jim@meyering.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-07-03 12:25:29 -07:00
Junio C Hamano	68fb465049	Merge branch 'maint' * maint: config: Change output of --get-regexp for valueless keys config: Complete documentation of --get-regexp cleanup merge-base test script Fix zero-object version-2 packs Ignore submodule commits when fetching over dumb protocols	2007-06-26 18:45:29 -07:00
Linus Torvalds	1164f1e48d	Fix zero-object version-2 packs A pack-file can get created without any objects in it (to transfer "no data" - which can happen if you use a reference git repo, for example, or just otherwise just end up transferring only branch head information and already have all the objects themselves). And while we probably should never create an index for such a pack, if we do (and we do), the index file size sanity checking was incorrect. This fixes it. Reported-and-tested-by: Jocke Tjernlund <tjernlund@tjernlund.se> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-06-26 18:02:15 -07:00
Junio C Hamano	4175e9e3a8	More static There still are quite a few symbols that ought to be static. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-06-13 02:02:10 -07:00
Junio C Hamano	b79d18c92d	-Wold-style-definition fix Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-06-13 02:02:10 -07:00
Junio C Hamano	a6080a0a44	War on whitespace This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-06-07 00:04:01 -07:00
Junio C Hamano	17c2929aa2	Merge branch 'sp/pack' * sp/pack: Style nit - don't put space after function names Ensure the pack index is opened before access Simplify index access condition in count-objects, pack-redundant Test for recent rev-parse $abbrev_sha1 regression rev-parse: Identify short sha1 sums correctly. Attempt to delay prepare_alt_odb during get_sha1 Micro-optimize prepare_alt_odb Lazily open pack index files on demand	2007-06-02 12:18:51 -07:00
Junio C Hamano	bd724be4be	Merge branch 'maint' * maint: git-config: Improve documentation of git-config file handling git-config: Various small fixes to asciidoc documentation decode_85(): fix missing return. fix signed range problems with hex conversions	2007-05-31 00:15:14 -07:00
Junio C Hamano	8e29f903eb	Merge branch 'maint-1.5.1' into maint * maint-1.5.1: git-config: Improve documentation of git-config file handling git-config: Various small fixes to asciidoc documentation decode_85(): fix missing return. fix signed range problems with hex conversions	2007-05-31 00:09:26 -07:00
Nicolas Pitre	f7c22cc68c	always start looking up objects in the last used pack first Jon Smirl said: \| Once an object reference hits a pack file it is very likely that \| following references will hit the same pack file. So first place to \| look for an object is the same place the previous object was found. This is indeed a good heuristic so here it is. The search always start with the pack where the last object lookup succeeded. If the wanted object is not available there then the search continues with the normal pack ordering. To test this I split the Linux repository into 66 packs and performed a "time git-rev-list --objects --all > /dev/null". Best results are as follows: Pack Sort w/o this patch w/ this patch ------------------------------------------------------------- recent objects last 26.4s 20.9s recent objects first 24.9s 18.4s This shows that the pack order based on object age has some influence, but that the last-used-pack heuristic is even more significant in reducing object lookup. Signed-off-by: Nicolas Pitre <nico@cam.org> --- Note: the --max-pack-size to git-repack currently produces packs with old objects after those containing recent objects. The pack sort based on filesystem timestamp is therefore backward for those. This needs to be fixed of course, but at least it made me think about this variable for the test. Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-05-30 23:35:07 -07:00
Linus Torvalds	192a6be2a7	fix signed range problems with hex conversions Make hexval_table[] "const". Also make sure that the accessor function hexval() does not access the table with out-of-range values by declaring its parameter "unsigned char", instead of "unsigned int". With this, gcc can just generate: movzbl (%rdi), %eax movsbl hexval_table(%rax),%edx movzbl 1(%rdi), %eax movsbl hexval_table(%rax),%eax sall $4, %edx orl %eax, %edx for the code to generate a byte from two hex characters. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-05-30 15:01:37 -07:00
Shawn O. Pearce	bc8e478a28	Style nit - don't put space after function names Our style is to not put a space after a function name. I did here, and Junio applied the patch with the incorrect formatting. So I'm cleaning up after myself since I noticed it upon review. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-05-29 23:31:19 -07:00
Shawn O. Pearce	7dc24aa5a6	Micro-optimize prepare_alt_odb Calling getenv() is not that expensive, but its also not free, and its certainly not cheaper than testing to see if alt_odb_tail is not null. Because we are calling prepare_alt_odb() from within find_sha1_file every time we cannot find an object file locally we want to skip out of prepare_alt_odb() as early as possible once we have initialized our alternate list. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-05-26 20:28:08 -07:00
Shawn O. Pearce	d079837eee	Lazily open pack index files on demand In some repository configurations the user may have many packfiles, but all of the recent commits/trees/tags/blobs are likely to be in the most recent packfile (the one with the newest mtime). It is therefore common to be able to complete an entire operation by accessing only one packfile, even if there are 25 packfiles available to the repository. Rather than opening and mmaping the corresponding .idx file for every pack found, we now only open and map the .idx when we suspect there might be an object of interest in there. Of course we cannot known in advance which packfile contains an object, so we still need to scan the entire packed_git list to locate anything. But odds are users want to access objects in the most recently created packfiles first, and that may be all they ever need for the current operation. Junio observed in `b867092f` that placing recent packfiles before older ones can slightly improve access times for recent objects, without degrading it for historical object access. This change improves upon Junio's observations by trying even harder to avoid the .idx files that we won't need. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-05-26 20:28:08 -07:00

1 2 3 4 5 ...

422 Commits