git-commit-vandalism

Author	SHA1	Message	Date
Junio C Hamano	3ba7a06552	A loose object is not corrupt if it cannot be read due to EMFILE "git fsck" bails out with a claim that a loose object that cannot be read but exists on the filesystem to be corrupt, which is wrong when read_object() failed due to e.g. EMFILE. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-11-03 09:24:57 -07:00
Junio C Hamano	b6c4ceccb3	read_sha1_file(): report correct name of packfile with a corrupt object Clarify the error reporting logic by moving the normal codepath (i.e. we read the object we wanted to read correctly) up and return early. The logic to report the name of the packfile with a corrupt object, introduced by `e8b15e6` (sha1_file: Show the the type and path to corrupt objects, 2010-06-10), was totally bogus. The function that knows which bad object came from what packfile is has_packed_and_bad(); make it report which packfile the problem was found. "Corrupt" is already an adjective, e.g. an object is "corrupt"; we do not have to say "corrupted object". Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-11-03 09:24:47 -07:00
Ævar Arnfjörð Bjarmason	e8b15e6156	sha1_file: Show the the type and path to corrupt objects Change the error message that's displayed when we encounter corrupt objects to be more specific. We now print the type (loose or packed) of corrupted objects, along with the full path to the file in question. Before: $ git cat-file blob 909ef997367880aaf2133bafa1f1a71aa28e09df fatal: object 909ef997367880aaf2133bafa1f1a71aa28e09df is corrupted After: $ git cat-file blob 909ef997367880aaf2133bafa1f1a71aa28e09df fatal: loose object 909ef997367880aaf2133bafa1f1a71aa28e09df (stored in .git/objects/90/9ef997367880aaf2133bafa1f1a71aa28e09df) is corrupted Knowing the path helps to quickly analyze what's wrong: $ file .git/objects/90/9ef997367880aaf2133bafa1f1a71aa28e09df .git/objects/90/9ef997367880aaf2133bafa1f1a71aa28e09df: empty Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-07-14 15:35:12 -07:00
Junio C Hamano	e391fdfc69	Merge branch 'jk/maint-sha1-file-name-fix' * jk/maint-sha1-file-name-fix: remove over-eager caching in sha1_file_name	2010-06-13 11:22:00 -07:00
Jeff King	560fb6a183	remove over-eager caching in sha1_file_name This function takes a sha1 and produces a loose object filename. It caches the location of the object directory so that it can fill the sha1 information directly without allocating a new buffer (and in its original incarnation, without calling getenv(), though these days we cache that with the code in environment.c). This cached base directory can become stale, however, if in a single process git changes the location of the object directory (e.g., by running setup_work_tree, which will chdir to the new worktree). In most cases this isn't a problem, because we tend to set up the git repository location and do any chdir()s before actually looking up any objects, so the first lookup will cache the correct location. In the case of reset --hard, however, we do something like: 1. look up the commit object 2. notice we are doing --hard, run setup_work_tree 3. look up the tree object to reset Step (3) fails because our cache object directory value is bogus. This patch simply removes the caching. We use a static buffer instead of allocating one each time (the original version treated the malloc'd buffer as a static, so there is no change in calling semantics). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-05-25 09:21:28 -07:00
Junio C Hamano	035bf8d7c4	Merge branch 'sp/maint-dumb-http-pack-reidx' * sp/maint-dumb-http-pack-reidx: http.c::new_http_pack_request: do away with the temp variable filename http-fetch: Use temporary files for pack-*.idx until verified http-fetch: Use index-pack rather than verify-pack to check packs Allow parse_pack_index on temporary files Extract verify_pack_index for reuse from verify_pack Introduce close_pack_index to permit replacement http.c: Remove unnecessary strdup of sha1_to_hex result http.c: Don't store destination name in request structures http.c: Drop useless != NULL test in finish_http_pack_request http.c: Tiny refactoring of finish_http_pack_request t5550-http-fetch: Use subshell for repository operations http.c: Remove bad free of static block	2010-05-21 04:02:19 -07:00
Junio C Hamano	636e87d705	Merge branch 'maint' * maint: Documentation/gitdiffcore: fix order in pickaxe description Documentation: fix minor inconsistency Documentation: rebase -i ignores options passed to "git am" hash_object: correction for zero length file	2010-05-18 22:39:56 -07:00
Dmitry Potapov	08bda2085c	hash_object: correction for zero length file The check whether size is zero was done after if size <= SMALL_FILE_SIZE, as result, zero size case was never triggered. Instead zero length file was treated as any other small file. This did not caused any problem, but if we have a special case for size equal to zero, it is better to make it work and avoid redundant malloc(). Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-05-18 21:46:36 -07:00
Shawn O. Pearce	7b64469a36	Allow parse_pack_index on temporary files The easiest way to verify a pack index is to open it through the standard parse_pack_index function, permitting the header check to happen when the file is mapped. However, the dumb HTTP client needs to verify a pack index before its moved into its proper file name within the objects/pack directory, to prevent a corrupt index from being made available. So permit the caller to specify the exact path of the index file. For now we're still using the final destination name within the sole call site in http.c, but eventually we will start to parse the temporary path instead. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-04-19 17:56:17 -07:00
Shawn O. Pearce	fa5fc15d6e	Introduce close_pack_index to permit replacement By closing the pack index, a caller can later overwrite the index with an updated index file, possibly after converting from v1 to the v2 format. Because p->index_data is NULL after close, on the next access the index will be opened again and the other members will be updated with new data. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-04-19 17:56:08 -07:00
Jeff King	40d52ff77b	make commit_tree a library function Until now, this has been part of the commit-tree builtin. However, it is already used by other builtins (like commit, merge, and notes), and it would be useful to access it from library code. The check_valid helper has to come along, too, but is given a more library-ish name of "assert_sha1_type". Otherwise, the code is unchanged. There are still a few rough edges for a library function, like printing the utf8 warning to stderr, but we can address those if and when they come up as inappropriate. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-04-01 23:53:54 -07:00
Jeff King	c00e657df2	fix const-correctness of write_sha1_file These should take const buffers as input data, but zlib's next_in pointer is not const-correct. Let's fix it at the zlib level, though, so the cast happens in one obvious place. This should be safe, as a similar cast is used in zlib's example code for a const array. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-04-01 23:49:03 -07:00
Junio C Hamano	493e433277	Merge branch 'mm/mkstemps-mode-for-packfiles' into maint * mm/mkstemps-mode-for-packfiles: Use git_mkstemp_mode instead of plain mkstemp to create object files git_mkstemps_mode: don't set errno to EINVAL on exit. Use git_mkstemp_mode and xmkstemp_mode in odb_mkstemp, not chmod later. git_mkstemp_mode, xmkstemp_mode: variants of gitmkstemps with mode argument. Move gitmkstemps to path.c Add a testcase for ACL with restrictive umask.	2010-03-08 00:36:00 -08:00
Junio C Hamano	c2b456b895	Merge branch 'nd/root-git' * nd/root-git: Add test for using Git at root of file system Support working directory located at root Move offset_1st_component() to path.c init-db, rev-parse --git-dir: do not append redundant slash make_absolute_path(): Do not append redundant slash Conflicts: setup.c sha1_file.c	2010-03-07 12:47:15 -08:00
Junio C Hamano	87912fd617	Merge branch 'mm/mkstemps-mode-for-packfiles' * mm/mkstemps-mode-for-packfiles: Use git_mkstemp_mode instead of plain mkstemp to create object files git_mkstemps_mode: don't set errno to EINVAL on exit. Use git_mkstemp_mode and xmkstemp_mode in odb_mkstemp, not chmod later. git_mkstemp_mode, xmkstemp_mode: variants of gitmkstemps with mode argument. Move gitmkstemps to path.c Add a testcase for ACL with restrictive umask.	2010-03-07 12:47:14 -08:00
Junio C Hamano	780fc9a0a6	Merge branch 'dp/read-not-mmap-small-loose-object' into maint * dp/read-not-mmap-small-loose-object: hash-object: don't use mmap() for small files	2010-03-04 22:26:17 -08:00
Junio C Hamano	34c014d13e	Merge branch 'np/compress-loose-object-memsave' * np/compress-loose-object-memsave: sha1_file: be paranoid when creating loose objects sha1_file: don't malloc the whole compressed result when writing out objects	2010-03-02 12:44:09 -08:00
Matthieu Moy	5256b00631	Use git_mkstemp_mode instead of plain mkstemp to create object files We used to unnecessarily give the read permission to group and others, regardless of the umask, which isn't serious because the objects are still protected by their containing directory, but isn't necessary either. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-02-22 15:24:46 -08:00
Nicolas Pitre	748af44c63	sha1_file: be paranoid when creating loose objects We don't want the data being deflated and stored into loose objects to be different from what we expect. While the deflated data is protected by a CRC which is good enough for safe data retrieval operations, we still want to be doubly sure that the source data used at object creation time is still what we expected once that data has been deflated and its CRC32 computed. The most plausible data corruption may occur if the source file is modified while Git is deflating and writing it out in a loose object. Or Git itself could have a bug causing memory corruption. Or even bad RAM could cause trouble. So it is best to make sure everything is coherent and checksum protected from beginning to end. To do so we compute the SHA1 of the data being deflated _after_ the deflate operation has consumed that data, and make sure it matches with the expected SHA1. This way we can rely on the CRC32 checked by the inflate operation to provide a good indication that the data is still coherent with its SHA1 hash. One pathological case we ignore is when the data is modified before (or during) deflate call, but changed back before it is hashed. There is some overhead of course. Using 'git add' on a set of large files: Before: real 0m25.210s user 0m23.783s sys 0m1.408s After: real 0m26.537s user 0m25.175s sys 0m1.358s The overhead is around 5% for full data coherency guarantee. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-02-21 22:33:25 -08:00
Dmitry Potapov	ea68b0ce9f	hash-object: don't use mmap() for small files Using read() instead of mmap() can be 39% speed up for 1Kb files and is 1% speed up 1Mb files. For larger files, it is better to use mmap(), because the difference between is not significant, and when there is not enough memory, mmap() performs much better, because it avoids swapping. Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-02-21 11:39:10 -08:00
Nicolas Pitre	9892bebafe	sha1_file: don't malloc the whole compressed result when writing out objects There is no real advantage to malloc the whole output buffer and deflate the data in a single pass when writing loose objects. That is like only 1% faster while using more memory, especially with large files where memory usage is far more. It is best to deflate and write the data out in small chunks reusing the same memory instead. For example, using 'git add' on a few large files averaging 40 MB ... Before: 21.45user 1.10system 0:22.57elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+828040outputs (0major+142640minor)pagefaults 0swaps After: 21.50user 1.25system 0:22.76elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+828040outputs (0major+104408minor)pagefaults 0swaps While the runtime stayed relatively the same, the number of minor page faults went down significantly. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-02-21 11:36:23 -08:00
Nguyễn Thái Ngọc Duy	4bb43de259	Move offset_1st_component() to path.c The implementation is also lightly modified to use is_dir_sep() instead of hardcoding '/'. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-02-16 08:54:34 -08:00
Junio C Hamano	a0075d9e6a	Merge branch 'il/maint-xmallocz' * il/maint-xmallocz: Fix integer overflow in unpack_compressed_entry() Fix integer overflow in unpack_sha1_rest() Fix integer overflow in patch_delta() Add xmallocz()	2010-01-27 14:56:38 -08:00
Ilari Liusvaara	4ab07e4d10	Fix integer overflow in unpack_compressed_entry() Signed-off-by: Ilari Liusvaara <ilari.liusvaara@elisanet.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-01-26 13:00:16 -08:00
Ilari Liusvaara	3aee68aa68	Fix integer overflow in unpack_sha1_rest() [jc: later NUL termination by the caller becomes unnecessary] Signed-off-by: Ilari Liusvaara <ilari.liusvaara@elisanet.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-01-26 13:00:10 -08:00
Linus Torvalds	a5031214c4	slim down "git show-index" As the documentation says, this is primarily for debugging, and in the longer term we should rename it to test-show-index or something. In the meantime, just avoid xmalloc (which slurps in the rest of git), and separating out the trivial hex functions into "hex.o". This results in [torvalds@nehalem git]$ size git-show-index text data bss dec hex filename 222818 2276 112688 337782 52776 git-show-index (before) 5696 624 1264 7584 1da0 git-show-index (after) which is a whole lot better. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-01-21 20:03:45 -08:00
Junio C Hamano	356521ab22	sha1_file.c: remove unused function has_pack_file() is not used anywhere. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-01-12 01:06:09 -08:00
Junio C Hamano	39eea7bdd9	Fix incorrect error check while reading deflated pack data The loop in get_size_from_delta() feeds a deflated delta data from the pack stream _until_ we get inflated result of 20 bytes[] or we reach the end of stream. Side note. This magic number 20 does not have anything to do with the size of the hash we use, but comes from `1a3b55c` (reduce delta head inflated size, 2006-10-18). The loop reads like this: do { in = use_pack(); stream.next_in = in; st = git_inflate(&stream, Z_FINISH); curpos += stream.next_in - in; } while ((st == Z_OK \|\| st == Z_BUF_ERROR) && stream.total_out < sizeof(delta_head)); This git_inflate() can return: - Z_STREAM_END, if use_pack() fed it enough input and the delta itself was smaller than 20 bytes; - Z_OK, when some progress has been made; - Z_BUF_ERROR, if no progress is possible, because we either ran out of input (due to corrupt pack), or we ran out of output before we saw the end of the stream. The fix `b3118bd` (sha1_file: Fix infinite loop when pack is corrupted, 2009-10-14) attempted was against a corruption that appears to be a valid stream that produces a result larger than the output buffer, but we are not even trying to read the stream to the end in this loop. If avail_out becomes zero, total_out will be the same as sizeof(delta_head) so the loop will terminate without the "fix". There is no fix from `b3118bd` needed for this loop, in other words. The loop in unpack_compressed_entry() is quite a different story. It feeds a deflated stream (either delta or base) and allows the stream to produce output up to what we expect but no more. do { in = use_pack(); stream.next_in = in; st = git_inflate(&stream, Z_FINISH); curpos += stream.next_in - in; } while (st == Z_OK \|\| st == Z_BUF_ERROR) This _does_ risk falling into an endless interation, as we can exhaust avail_out if the length we expect is smaller than what the stream wants to produce (due to pack corruption). In such a case, avail_out will become zero and inflate() will return Z_BUF_ERROR, while avail_in may (or may not) be zero. But this is not a right fix: do { in = use_pack(); stream.next_in = in; st = git_inflate(&stream, Z_FINISH); + if (st == Z_BUF_ERROR && (stream.avail_in \|\| !stream.avail_out) + break; / wants more input??? */ curpos += stream.next_in - in; } while (st == Z_OK \|\| st == Z_BUF_ERROR) as Z_BUF_ERROR from inflate() may be telling us that avail_in has also run out before reading the end of stream marker. In such a case, both avail_in and avail_out would be zero, and the loop should iterate to allow the end of stream marker to be seen by inflate from the input stream. The right fix for this loop is likely to be to increment the initial avail_out by one (we allocate one extra byte to terminate it with NUL anyway, so there is no risk to overrun the buffer), and break out if we see that avail_out has become zero, in order to detect that the stream wants to produce more than what we expect. After the loop, we have a check that exactly tests this condition: if ((st != Z_STREAM_END) \|\| stream.total_out != size) { free(buffer); return NULL; } So here is a patch (without my previous botched attempts) to fix this issue. The first hunk reverts the corresponding hunk from `b3118bd`, and the second hunk is the same fix proposed earlier. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-10-21 23:19:47 -07:00
Shawn O. Pearce	b3118bdc91	sha1_file: Fix infinite loop when pack is corrupted Some types of corruption to a pack may confuse the deflate stream which stores an object. In Andy's reported case a 36 byte region of the pack was overwritten, leading to what appeared to be a valid deflate stream that was trying to produce a result larger than our allocated output buffer could accept. Z_BUF_ERROR is returned from inflate() if either the input buffer needs more input bytes, or the output buffer has run out of space. Previously we only considered the former case, as it meant we needed to move the stream's input buffer to the next window in the pack. We now abort the loop if inflate() returns Z_BUF_ERROR without consuming the entire input buffer it was given, or has filled the entire output buffer but has not yet returned Z_STREAM_END. Either state is a clear indicator that this loop is not working as expected, and should not continue. This problem cannot occur with loose objects as we open the entire loose object as a single buffer and treat Z_BUF_ERROR as an error. Reported-by: Andy Isaacson <adi@hexapodia.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Acked-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-10-14 13:39:37 -07:00
Junio C Hamano	f00ecbe42b	Merge branch 'cc/replace' * cc/replace: t6050: check pushing something based on a replaced commit Documentation: add documentation for "git replace" Add git-replace to .gitignore builtin-replace: use "usage_msg_opt" to give better error messages parse-options: add new function "usage_msg_opt" builtin-replace: teach "git replace" to actually replace Add new "git replace" command environment: add global variable to disable replacement mktag: call "check_sha1_signature" with the replacement sha1 replace_object: add a test case object: call "check_sha1_signature" with the replacement sha1 sha1_file: add a "read_sha1_file_repl" function replace_object: add mechanism to replace objects found in "refs/replace/" refs: add a "for_each_replace_ref" function	2009-08-21 18:47:53 -07:00
Pierre Habouzit	f630cfda88	refactor: use bitsizeof() instead of 8 * sizeof() Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-07-22 21:57:41 -07:00
Junio C Hamano	dd787c19c4	Merge branch 'tr/die_errno' * tr/die_errno: Use die_errno() instead of die() when checking syscalls Convert existing die(..., strerror(errno)) to die_errno() die_errno(): double % in strerror() output just in case Introduce die_errno() that appends strerror(errno) to die()	2009-07-06 09:39:46 -07:00
Thomas Rast	d824cbba02	Convert existing die(..., strerror(errno)) to die_errno() Change calls to die(..., strerror(errno)) to use the new die_errno(). In the process, also make slight style adjustments: at least state _something_ about the function that failed (instead of just printing the pathname), and put paths in single quotes. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-06-27 11:14:53 -07:00
Linus Torvalds	48fb7deb5b	Fix big left-shifts of unsigned char Shifting 'unsigned char' or 'unsigned short' left can result in sign extension errors, since the C integer promotion rules means that the unsigned char/short will get implicitly promoted to a signed 'int' due to the shift (or due to other operations). This normally doesn't matter, but if you shift things up sufficiently, it will now set the sign bit in 'int', and a subsequent cast to a bigger type (eg 'long' or 'unsigned long') will now sign-extend the value despite the original expression being unsigned. One example of this would be something like unsigned long size; unsigned char c; size += c << 24; where despite all the variables being unsigned, 'c << 24' ends up being a signed entity, and will get sign-extended when then doing the addition in an 'unsigned long' type. Since git uses 'unsigned char' pointers extensively, we actually have this bug in a couple of places. I may have missed some, but this is the result of looking at git grep '[^0-9 ][ ]<<[ ][a-z]' -- '.c' '.h' git grep '<<[ ]24' which catches at least the common byte cases (shifting variables by a variable amount, and shifting by 24 bits). I also grepped for just 'unsigned char' variables in general, and converted the ones that most obviously ended up getting implicitly cast immediately anyway (eg hash_name(), encode_85()). In addition to just avoiding 'unsigned char', this patch also tries to use a common idiom for the delta header size thing. We had three different variations on it: "& 0x7fUL" in one place (getting the sign extension right), and "& ~0x80" and "& 0x7f" in two other places (not getting it right). Apart from making them all just avoid using "unsigned char" at all, I also unified them to then use a simple "& 0x7f". I considered making a sparse extension which warns about doing implicit casts from unsigned types to signed types, but it gets rather complex very quickly, so this is just a hack. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-06-18 09:22:46 -07:00
Christian Couder	f5552aee39	sha1_file: add a "read_sha1_file_repl" function This new function will replace "read_sha1_file". This latter function becoming just a stub to call the former will a NULL "replacement" argument. This new function is needed because sometimes we need to use the replacement sha1. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-05-31 17:02:59 -07:00
Christian Couder	6809557029	replace_object: add mechanism to replace objects found in "refs/replace/" The code implementing this mechanism has been copied more-or-less from the commit graft code. This mechanism is used in "read_sha1_file". sha1 passed to this function that match a ref name in "refs/replace/" are replaced by the sha1 that has been read in the ref. We "die" if the replacement recursion depth is too high or if we can't read the replacement object. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-05-31 17:02:59 -07:00
Junio C Hamano	2c5942dbae	Merge branch 'ar/unlink-err' into maint * ar/unlink-err: print unlink(2) errno in copy_or_link_directory replace direct calls to unlink(2) with unlink_or_warn Introduce an unlink(2) wrapper which gives warning if unlink failed	2009-05-25 19:01:50 -07:00
Junio C Hamano	065b0702f7	Merge branch 'maint' * maint: grep: fix word-regexp colouring completion: use git rev-parse to detect bare repos Cope better with a _lot_ of packs for-each-ref: fix segfault in copy_email	2009-05-20 18:59:09 -07:00
Johannes Schindelin	fd73ccf279	Cope better with a _lot_ of packs You might end up with a situation where you have tons of pack files, e.g. when using hg2git. In this situation, all kinds of operations may end up with a "too many files open" error. Let's recover gracefully from that. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Looks-right-to-me-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-05-20 18:23:06 -07:00
Junio C Hamano	36587681b4	Merge branch 'ar/unlink-err' * ar/unlink-err: print unlink(2) errno in copy_or_link_directory replace direct calls to unlink(2) with unlink_or_warn Introduce an unlink(2) wrapper which gives warning if unlink failed	2009-05-18 09:01:06 -07:00
Felipe Contreras	4b25d091ba	Fix a bunch of pointer declarations (codestyle) Essentially; s/type* /type */ as per the coding guidelines. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-05-01 15:17:31 -07:00
Alex Riesen	691f1a28bf	replace direct calls to unlink(2) with unlink_or_warn This helps to notice when something's going wrong, especially on systems which lock open files. I used the following criteria when selecting the code for replacement: - it was already printing a warning for the unlink failures - it is in a function which already printing something or is called from such a function - it is in a static function, returning void and the function is only called from a builtin main function (cmd_) - it is in a function which handles emergency exit (signal handlers) - it is in a function which is obvously cleaning up the lockfiles Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-04-29 18:37:41 -07:00
Johannes Schindelin	348df16679	Rename core.unreliableHardlinks to core.createObject "Unreliable hardlinks" is a misleading description for what is happening. So rename it to something less misleading. Suggested by Linus Torvalds. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-04-29 16:50:07 -07:00
Johannes Schindelin	be66a6c43d	Add an option not to use link(src, dest) && unlink(src) when that is unreliable It seems that accessing NTFS partitions with ufsd (at least on my EeePC) has an unnerving bug: if you link() a file and unlink() it right away, the target of the link() will have the correct size, but consist of NULs. It seems as if the calls are simply not serialized correctly, as single-stepping through the function move_temp_to_file() works flawlessly. As ufsd is "Commertial software" (sic!), I cannot fix it, and have to work around it in Git. At the same time, it seems that this fixes msysGit issues 222 and 229 to assume that Windows cannot handle link() && unlink(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-04-25 09:49:21 -07:00
Junio C Hamano	03a39a9184	Merge branch 'jc/shared-literally' * jc/shared-literally: t1301: loosen test for forced modes set_shared_perm(): sometimes we know what the final mode bits should look like move_temp_to_file(): do not forget to chmod() in "Coda hack" codepath Move chmod(foo, 0444) into move_temp_to_file() "core.sharedrepository = 0mode" should set, not loosen	2009-04-06 00:42:52 -07:00
Junio C Hamano	3c91bf6805	Merge branch 'jc/maint-1.6.0-keep-pack' * jc/maint-1.6.0-keep-pack: pack-objects: don't loosen objects available in alternate or kept packs t7700: demonstrate repack flaw which may loosen objects unnecessarily Remove --kept-pack-only option and associated infrastructure pack-objects: only repack or loosen objects residing in "local" packs git-repack.sh: don't use --kept-pack-only option to pack-objects t7700-repack: add two new tests demonstrating repacking flaws Conflicts: t/t7700-repack.sh	2009-04-01 22:34:19 -07:00
Junio C Hamano	17e61b8288	set_shared_perm(): sometimes we know what the final mode bits should look like adjust_shared_perm() first obtains the mode bits from lstat(2), expecting to find what the result of applying user's umask is, and then tweaks it as necessary. When the file to be adjusted is created with mkstemp(3), however, the mode thusly obtained does not have anything to do with user's umask, and we would need to start from 0444 in such a case and there is no point running lstat(2) for such a path. This introduces a new API set_shared_perm() to bypass the lstat(2) and instead force setting the mode bits to the desired value directly. adjust_shared_perm() becomes a thin wrapper to the function. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-03-28 08:02:15 -07:00
Junio C Hamano	3be1f18e1b	move_temp_to_file(): do not forget to chmod() in "Coda hack" codepath Now move_temp_to_file() is responsible for doing everything that is necessary to turn a tempfile in $GIT_DIR into its final form, it must make sure "Coda hack" codepath correctly makes the file read-only. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-03-28 08:01:21 -07:00
Johan Herland	fb8b193670	Move chmod(foo, 0444) into move_temp_to_file() When writing out a loose object or a pack (index), move_temp_to_file() is called to finalize the resulting file. These files (loose files and packs) should all have permission mode 0444 (modulo adjust_shared_perm()). Therefore, instead of doing chmod(foo, 0444) explicitly from each callsite (or even forgetting to chmod() at all), do the chmod() call from within move_temp_to_file(). Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-03-27 22:10:58 -07:00
Junio C Hamano	5a688fe470	"core.sharedrepository = 0mode" should set, not loosen This fixes the behaviour of octal notation to how it is defined in the documentation, while keeping the traditional "loosen only" semantics intact for "group" and "everybody". Three main points of this patch are: - For an explicit octal notation, the internal shared_repository variable is set to a negative value, so that we can tell "group" (which is to "OR" in 0660) and 0660 (which is to "SET" to 0660); - git-init did not set shared_repository variable early enough to affect the initial creation of many files, notably copied templates and the configuration. We set it very early when a command-line option specifies a custom value. - Many codepaths create files inside $GIT_DIR by various ways that all involve mkstemp(), and then call move_temp_to_file() to rename it to its final destination. We can add adjust_shared_perm() call here; for the traditional "loosen-only", this would be a no-op for many codepaths because the mode is already loose enough, but with the new behaviour it makes a difference. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-03-27 21:51:04 -07:00
Junio C Hamano	89fbda2425	Merge branch 'maint' * maint: Increase the size of the die/warning buffer to avoid truncation close_sha1_file(): make it easier to diagnose errors avoid possible overflow in delta size filtering computation	2009-03-24 19:45:57 -07:00
Junio C Hamano	b0de555410	Merge branch 'maint-1.6.1' into maint * maint-1.6.1: close_sha1_file(): make it easier to diagnose errors avoid possible overflow in delta size filtering computation	2009-03-24 15:31:21 -07:00
Junio C Hamano	2a5643da73	Merge branch 'maint-1.6.0' into maint-1.6.1 * maint-1.6.0: close_sha1_file(): make it easier to diagnose errors avoid possible overflow in delta size filtering computation	2009-03-24 15:31:15 -07:00
Linus Torvalds	e8bd78c3fc	close_sha1_file(): make it easier to diagnose errors A bug report with "unable to write sha1 file" made us realize that we do not have enough information to guess why close() is failing. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-03-24 14:39:20 -07:00
Brandon Casey	4d6acb7041	Remove --kept-pack-only option and associated infrastructure This option to pack-objects/rev-list was created to improve the -A and -a options of repack. It was found to be lacking in that it did not provide the ability to differentiate between local and non-local kept packs, and found to be unnecessary since objects residing in local kept packs can be filtered out by the --honor-pack-keep option. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-03-20 13:32:33 -07:00
Junio C Hamano	aec813062b	Merge branch 'jc/maint-1.6.0-keep-pack' * jc/maint-1.6.0-keep-pack: is_kept_pack(): final clean-up Simplify is_kept_pack() Consolidate ignore_packed logic more has_sha1_kept_pack(): take "struct rev_info" has_sha1_pack(): refactor "pretend these packs do not exist" interface git-repack: resist stray environment variable	2009-03-11 13:49:56 -07:00
Junio C Hamano	69e020ae00	is_kept_pack(): final clean-up Now is_kept_pack() is just a member lookup into a structure, we can write it as such. Also rewrite the sole caller of has_sha1_kept_pack() to switch on the criteria the callee uses (namely, revs->kept_pack_only) between calling has_sha1_kept_pack() and has_sha1_pack(), so that these two callees do not have to take a pointer to struct rev_info as an argument. This removes the header file dependency issue temporarily introduced by the earlier commit, so we revert changes associated to that as well. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-02-28 01:06:06 -08:00
Junio C Hamano	03a9683d22	Simplify is_kept_pack() This removes --unpacked=<packfile> parameter from the revision parser, and rewrites its use in git-repack to pass a single --kept-pack-only option instead. The new --kept-pack-only option means just that. When this option is given, is_kept_pack() that used to say "not on the --unpacked=<packfile> list" now says "the packfile has corresponding .keep file". Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-02-28 01:06:06 -08:00
Junio C Hamano	386cb77210	Consolidate ignore_packed logic more This refactors three loops that check if a given packfile is on the ignore_packed list into a function is_kept_pack(). The function returns false for a pack on the list, and true for a pack not on the list, because this list is solely used by "git repack" to pass list of packfiles that do not have corresponding .keep files, i.e. a packfile not on the list is "kept". Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-02-28 01:06:06 -08:00
Junio C Hamano	b8431b033f	has_sha1_kept_pack(): take "struct rev_info" Its "ignore_packed" parameter always comes from struct rev_info. This patch makes the function take a pointer to the surrounding structure, so that the refactoring in the next patch becomes easier to review. There is an unfortunate header file dependency and the easiest workaround is to temporarily move the function declaration from cache.h to revision.h; this will be moved back to cache.h once the function loses this "ignore_packed" parameter altogether in the later part of the series. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-02-28 01:06:06 -08:00
Junio C Hamano	cd673c1f17	has_sha1_pack(): refactor "pretend these packs do not exist" interface Most of the callers of this function except only one pass NULL to its last parameter, ignore_packed. Introduce has_sha1_kept_pack() function that has the function signature and the semantics of this function, and convert the sole caller that does not pass NULL to call this new function. All other callers and has_sha1_pack() lose the ignore_packed parameter. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-02-28 01:06:06 -08:00
Felipe Contreras	a9d98a148d	sha1_file.c: fix typo it's != its Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-02-25 00:49:54 -08:00
Junio C Hamano	30aa4fb15f	Merge branch 'maint' * maint: Prepare for 1.6.1.4. Make repack less likely to corrupt repository fast-export: ensure we traverse commits in topological order Clear the delta base cache if a pack is rebuilt Conflicts: RelNotes	2009-02-11 18:47:30 -08:00
Junio C Hamano	7a134dbbc9	Merge branch 'maint-1.6.0' into maint * maint-1.6.0: Make repack less likely to corrupt repository fast-export: ensure we traverse commits in topological order Clear the delta base cache if a pack is rebuilt	2009-02-11 18:32:37 -08:00
Shawn O. Pearce	fa3a0c94dc	Clear the delta base cache if a pack is rebuilt There is some risk that re-opening a regenerated pack file with different offsets could leave stale entries within the delta base cache that could be matched up against other objects using the same "struct packed_git" and pack offset. Throwing away the entire delta base cache in this case is safer, as we don't have to worry about a recycled "struct packed_git" matching to the wrong base object, resulting in delta apply errors while unpacking an object. Suggested-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-02-11 10:25:24 -08:00
Junio C Hamano	fd8475d9fb	Merge branch 'maint' * maint: Clear the delta base cache during fast-import checkpoint	2009-02-10 21:30:45 -08:00
Junio C Hamano	9b27ea9518	Merge branch 'maint-1.6.0' into maint * maint-1.6.0: Clear the delta base cache during fast-import checkpoint	2009-02-10 15:32:26 -08:00
Shawn O. Pearce	3d20c636af	Clear the delta base cache during fast-import checkpoint Otherwise we may reuse the same memory address for a totally different "struct packed_git", and a previously cached object from the prior occupant might be returned when trying to unpack an object from the new pack. Found-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-02-10 15:30:59 -08:00
Junio C Hamano	141b6b83d7	Merge branch 'lt/maint-wrap-zlib' into maint * lt/maint-wrap-zlib: Wrap inflate and other zlib routines for better error reporting Conflicts: http-push.c http-walker.c sha1_file.c	2009-02-05 18:01:00 -08:00
Junio C Hamano	8c95d3c31b	Sync with 1.6.1.2 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-01-29 00:32:52 -08:00
Junio C Hamano	8561b522d7	Merge branch 'maint-1.6.0' into maint * maint-1.6.0: avoid 31-bit truncation in write_loose_object	2009-01-28 23:41:28 -08:00
Jeff King	915308b187	avoid 31-bit truncation in write_loose_object The size of the content we are adding may be larger than 2.1G (i.e., "git add gigantic-file"). Most of the code-path to do so uses size_t or unsigned long to record the size, but write_loose_object uses a signed int. On platforms where "int" is 32-bits (which includes x86_64 Linux platforms), we end up passing malloc a negative size. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-01-28 23:40:53 -08:00
Junio C Hamano	36dd939393	Merge branch 'lt/maint-wrap-zlib' * lt/maint-wrap-zlib: Wrap inflate and other zlib routines for better error reporting Conflicts: http-push.c http-walker.c sha1_file.c	2009-01-21 16:55:17 -08:00
Christian Couder	c2c5b27051	sha1_file: make "read_object" static This function is only used from "sha1_file.c". And as we want to add a "replace_object" hook in "read_sha1_file", we must not let people bypass the hook using something other than "read_sha1_file". Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-01-13 00:14:55 -08:00
Linus Torvalds	39c68542fc	Wrap inflate and other zlib routines for better error reporting R. Tyler Ballance reported a mysterious transient repository corruption; after much digging, it turns out that we were not catching and reporting memory allocation errors from some calls we make to zlib. This one _just_ wraps things; it doesn't do the "retry on low memory error" part, at least not yet. It is an independent issue from the reporting. Some of the errors are expected and passed back to the caller, but we die when zlib reports it failed to allocate memory for now. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-01-11 02:13:06 -08:00
Linus Torvalds	b760d3aa74	Make 'index_path()' use 'strbuf_readlink()' This makes us able to properly index symlinks even on filesystems where st_size doesn't match the true size of the link. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-12-17 13:36:34 -08:00
Junio C Hamano	de0db42278	Merge branch 'maint' * maint: fsck: reduce stack footprint make sure packs to be replaced are closed beforehand	2008-12-11 00:36:31 -08:00
Nicolas Pitre	c74faea19e	make sure packs to be replaced are closed beforehand Especially on Windows where an opened file cannot be replaced, make sure pack-objects always close packs it is about to replace. Even on non Windows systems, this could save potential bad results if ever objects were to be read from the new pack file using offset from the old index. This should fix t5303 on Windows. Signed-off-by: Nicolas Pitre <nico@cam.org> Tested-by: Johannes Sixt <j6t@kdbg.org> (MinGW) Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-12-10 17:56:05 -08:00
Junio C Hamano	0fd9d7e66d	Merge branch 'bc/maint-keep-pack' into maint * bc/maint-keep-pack: repack: only unpack-unreachable if we are deleting redundant packs t7700: test that 'repack -a' packs alternate packed objects pack-objects: extend --local to mean ignore non-local loose objects too sha1_file.c: split has_loose_object() into local and non-local counterparts t7700: demonstrate mishandling of loose objects in an alternate ODB builtin-gc.c: use new pack_keep bitfield to detect .keep file existence repack: do not fall back to incremental repacking with [-a\|-A] repack: don't repack local objects in packs with .keep file pack-objects: new option --honor-pack-keep packed_git: convert pack_local flag into a bitfield and add pack_keep t7700: demonstrate mishandling of objects in packs with a .keep file	2008-12-02 23:00:04 -08:00
Junio C Hamano	455d0f5c23	Merge branch 'maint' * maint: sha1_file.c: resolve confusion EACCES vs EPERM sha1_file: avoid bogus "file exists" error message git checkout: don't warn about unborn branch if -f is already passed bash: offer refs instead of filenames for 'git revert' bash: remove dashed command leftovers git-p4: fix keyword-expansion regex fast-export: use an unsorted string list for extra_refs Add new testcase to show fast-export does not always exports all tags	2008-11-27 19:23:51 -08:00
Sam Vilain	35243577ab	sha1_file.c: resolve confusion EACCES vs EPERM An earlier commit `916d081` (Nicer error messages in case saving an object to db goes wrong, 2006-11-09) confused EACCES with EPERM, the latter of which is an unlikely error from mkstemp(). Signed-off-by: Sam Vilain <sam@vilain.net>	2008-11-27 19:11:21 -08:00
Joey Hess	65117abc04	sha1_file: avoid bogus "file exists" error message This avoids the following misleading error message: error: unable to create temporary sha1 filename ./objects/15: File exists mkstemp can fail for many reasons, one of which, ENOENT, can occur if the directory for the temp file doesn't exist. create_tmpfile tried to handle this case by always trying to mkdir the directory, even if it already existed. This caused errno to be clobbered, so one cannot tell why mkstemp really failed, and it truncated the buffer to just the directory name, resulting in the strange error message shown above. Note that in both occasions that I've seen this failure, it has not been due to a missing directory, or bad permissions, but some other, unknown mkstemp failure mode that did not occur when I ran git again. This code could perhaps be made more robust by retrying mkstemp, in case it was a transient failure. Signed-off-by: Joey Hess <joey@kitenet.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-27 18:48:53 -08:00
Joey Hess	cbacbf4e55	sha1_file: avoid bogus "file exists" error message This avoids the following misleading error message: error: unable to create temporary sha1 filename ./objects/15: File exists mkstemp can fail for many reasons, one of which, ENOENT, can occur if the directory for the temp file doesn't exist. create_tmpfile tried to handle this case by always trying to mkdir the directory, even if it already existed. This caused errno to be clobbered, so one cannot tell why mkstemp really failed, and it truncated the buffer to just the directory name, resulting in the strange error message shown above. Note that in both occasions that I've seen this failure, it has not been due to a missing directory, or bad permissions, but some other, unknown mkstemp failure mode that did not occur when I ran git again. This code could perhaps be made more robust by retrying mkstemp, in case it was a transient failure. Signed-off-by: Joey Hess <joey@kitenet.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-23 19:44:19 -08:00
Alex Riesen	f755bb996b	Fix handle leak in sha1_file/unpack_objects if there were damaged object data In the case of bad packed object CRC, unuse_pack wasn't called after check_pack_crc which calls use_pack. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Acked-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-23 19:31:05 -08:00
Junio C Hamano	47a792539a	Merge branch 'jk/commit-v-strip' * jk/commit-v-strip: status: show "-v" diff even for initial commit wt-status: refactor initial commit printing define empty tree sha1 as a macro	2008-11-16 00:48:59 -08:00
Junio C Hamano	7b51b77dbc	Merge branch 'np/pack-safer' * np/pack-safer: t5303: fix printf format string for portability t5303: work around printf breakage in dash pack-objects: don't leak pack window reference when splitting packs extend test coverage for latest pack corruption resilience improvements pack-objects: allow "fixing" a corrupted pack without a full repack make find_pack_revindex() aware of the nasty world make check_object() resilient to pack corruptions make packed_object_info() resilient to pack corruptions make unpack_object_header() non fatal better validation on delta base object offsets close another possibility for propagating pack corruption	2008-11-12 22:26:35 -08:00
Junio C Hamano	ecbbfb15a4	Merge branch 'bc/maint-keep-pack' * bc/maint-keep-pack: t7700: test that 'repack -a' packs alternate packed objects pack-objects: extend --local to mean ignore non-local loose objects too sha1_file.c: split has_loose_object() into local and non-local counterparts t7700: demonstrate mishandling of loose objects in an alternate ODB builtin-gc.c: use new pack_keep bitfield to detect .keep file existence repack: do not fall back to incremental repacking with [-a\|-A] repack: don't repack local objects in packs with .keep file pack-objects: new option --honor-pack-keep packed_git: convert pack_local flag into a bitfield and add pack_keep t7700: demonstrate mishandling of objects in packs with a .keep file	2008-11-12 22:00:43 -08:00
Jeff King	14d9c57896	define empty tree sha1 as a macro This can potentially be used in a few places, so let's make it available to all parts of the code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-12 12:52:21 -08:00
Brandon Casey	0f4dc14ac4	sha1_file.c: split has_loose_object() into local and non-local counterparts Signed-off-by: Brandon Casey <drafnel@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-12 10:29:22 -08:00
Brandon Casey	8d25931d6f	packed_git: convert pack_local flag into a bitfield and add pack_keep pack_keep will be set when a pack file has an associated .keep file. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-12 10:28:08 -08:00
Nicolas Pitre	08698b1e32	make find_pack_revindex() aware of the nasty world It currently calls die() whenever given offset is not found thinking that such thing should never happen. But this offset may come from a corrupted pack whych _could_ happen and not be found. Callers should deal with this possibility gracefully instead. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-02 15:22:35 -08:00
Nicolas Pitre	3d77d8774f	make packed_object_info() resilient to pack corruptions In the same spirit as commit `8eca0b47ff`, let's try to survive a pack corruption by making packed_object_info() able to fall back to alternate packs or loose objects. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-02 15:22:35 -08:00
Nicolas Pitre	09ded04b7e	make unpack_object_header() non fatal It is possible to have pack corruption in the object header. Currently unpack_object_header() simply die() on them instead of letting the caller deal with that gracefully. So let's have unpack_object_header() return an error instead, and find a better name for unpack_object_header_gently() in that context. All callers of unpack_object_header() are ready for it. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-02 15:22:34 -08:00
Nicolas Pitre	d8f325563d	better validation on delta base object offsets In one case, it was possible to have a bad offset equal to 0 effectively pointing a delta onto itself and crashing git after too many recursions. In the other cases, a negative offset could result due to off_t being signed. Catch those. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-02 15:22:34 -08:00
Nicolas Pitre	0e8189e270	close another possibility for propagating pack corruption Abstract -------- With index v2 we have a per object CRC to allow quick and safe reuse of pack data when repacking. This, however, doesn't currently prevent a stealth corruption from being propagated into a new pack when _not_ reusing pack data as demonstrated by the modification to t5302 included here. The Context ----------- The Git database is all checksummed with SHA1 hashes. Any kind of corruption can be confirmed by verifying this per object hash against corresponding data. However this can be costly to perform systematically and therefore this check is often not performed at run time when accessing the object database. First, the loose object format is entirely compressed with zlib which already provide a CRC verification of its own when inflating data. Any disk corruption would be caught already in this case. Then, packed objects are also compressed with zlib but only for their actual payload. The object headers and delta base references are not deflated for obvious performance reasons, however this leave them vulnerable to potentially undetected disk corruptions. Object types are often validated against the expected type when they're requested, and deflated size must always match the size recorded in the object header, so those cases are pretty much covered as well. Where corruptions could go unnoticed is in the delta base reference. Of course, in the OBJ_REF_DELTA case, the odds for a SHA1 reference to get corrupted so it actually matches the SHA1 of another object with the same size (the delta header stores the expected size of the base object to apply against) are virtually zero. In the OBJ_OFS_DELTA case, the reference is a pack offset which would have to match the start boundary of a different base object but still with the same size, and although this is relatively much more "probable" than in the OBJ_REF_DELTA case, the probability is also about zero in absolute terms. Still, the possibility exists as demonstrated in t5302 and is certainly greater than a SHA1 collision, especially in the OBJ_OFS_DELTA case which is now the default when repacking. Again, repacking by reusing existing pack data is OK since the per object CRC provided by index v2 guards against any such corruptions. What t5302 failed to test is a full repack in such case. The Solution ------------ As unlikely as this kind of stealth corruption can be in practice, it certainly isn't acceptable to propagate it into a freshly created pack. But, because this is so unlikely, we don't want to pay the run time cost associated with extra validation checks all the time either. Furthermore, consequences of such corruption in anything but repacking should be rather visible, and even if it could be quite unpleasant, it still has far less severe consequences than actively creating bad packs. So the best compromize is to check packed object CRC when unpacking objects, and only during the compression/writing phase of a repack, and only when not streaming the result. The cost of this is minimal (less than 1% CPU time), and visible only with a full repack. Someone with a stats background could provide an objective evaluation of this, but I suspect that it's bad RAM that has more potential for data corruptions at this point, even in those cases where this extra check is not performed. Still, it is best to prevent a known hole for corruption when recreating object data into a new pack. What about the streamed pack case? Well, any client receiving a pack must always consider that pack as untrusty and perform full validation anyway, hence no such stealth corruption could be propagated to remote repositoryes already. It is therefore worthless doing local validation in that case. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-02 15:22:15 -08:00
Junio C Hamano	581000a419	Merge branch 'jc/maint-co-track' into maint * jc/maint-co-track: Enhance hold_lock_file_for_{update,append}() API demonstrate breakage of detached checkout with symbolic link HEAD Fix "checkout --track -b newbranch" on detached HEAD	2008-11-02 13:36:14 -08:00
Junio C Hamano	a157400c97	Merge branch 'jc/maint-co-track' * jc/maint-co-track: Enhance hold_lock_file_for_{update,append}() API demonstrate breakage of detached checkout with symbolic link HEAD Fix "checkout --track -b newbranch" on detached HEAD Conflicts: builtin-commit.c	2008-10-21 17:58:11 -07:00
Junio C Hamano	acd3b9eca8	Enhance hold_lock_file_for_{update,append}() API This changes the "die_on_error" boolean parameter to a mere "flags", and changes the existing callers of hold_lock_file_for_update/append() functions to pass LOCK_DIE_ON_ERROR. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-10-19 12:35:37 -07:00
Junio C Hamano	58e0fa5416	Merge branch 'maint' * maint: Hopefully the final draft release notes update before 1.6.0.3 diff(1): clarify what "T"ypechange status means contrib: update packinfo.pl to not use dashed commands force_object_loose: Fix memory leak tests: shell negation portability fix	2008-10-18 08:26:44 -07:00
Björn Steinbrink	1fb23e6550	force_object_loose: Fix memory leak read_packed_sha1 expectes its caller to free the buffer it returns, which force_object_loose didn't do. This leak is eventually triggered by "git gc", when it is manually invoked or there are too many packs around, making gc totally unusable when there are lots of unreachable objects. Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> Acked-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-10-18 06:19:06 -07:00
Brandon Casey	f285a2d7ed	Replace calls to strbuf_init(&foo, 0) with STRBUF_INIT initializer Many call sites use strbuf_init(&foo, 0) to initialize local strbuf variable "foo" which has not been accessed since its declaration. These can be replaced with a static initialization using the STRBUF_INIT macro which is just as readable, saves a function call, and takes up fewer lines. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2008-10-12 12:36:19 -07:00
Miklos Vajna	749bc58c5e	Cleanup in sha1_file.c::cache_or_unpack_entry() This patch just removes an unnecessary goto which makes the code easier to read and shorter. Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2008-10-09 08:55:42 -07:00
Nicolas Pitre	9126f0091f	fix openssl headers conflicting with custom SHA1 implementations On ARM I have the following compilation errors: CC fast-import.o In file included from cache.h:8, from builtin.h:6, from fast-import.c:142: arm/sha1.h:14: error: conflicting types for 'SHA_CTX' /usr/include/openssl/sha.h:105: error: previous declaration of 'SHA_CTX' was here arm/sha1.h:16: error: conflicting types for 'SHA1_Init' /usr/include/openssl/sha.h:115: error: previous declaration of 'SHA1_Init' was here arm/sha1.h:17: error: conflicting types for 'SHA1_Update' /usr/include/openssl/sha.h:116: error: previous declaration of 'SHA1_Update' was here arm/sha1.h:18: error: conflicting types for 'SHA1_Final' /usr/include/openssl/sha.h:117: error: previous declaration of 'SHA1_Final' was here make: *** [fast-import.o] Error 1 This is because openssl header files are always included in git-compat-util.h since commit `684ec6c63c` whenever NO_OPENSSL is not set, which somehow brings in <openssl/sha1.h> clashing with the custom ARM version. Compilation of git is probably broken on PPC too for the same reason. Turns out that the only file requiring openssl/ssl.h and openssl/err.h is imap-send.c. But only moving those problematic includes there doesn't solve the issue as it also includes cache.h which brings in the conflicting local SHA1 header file. As suggested by Jeff King, the best solution is to rename our references to SHA1 functions and structure to something git specific, and define those according to the implementation used. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2008-10-02 18:06:56 -07:00
Shawn O. Pearce	1ad6d46235	Merge branch 'jc/alternate-push' * jc/alternate-push: push: receiver end advertises refs from alternate repositories push: prepare sender to receive extended ref information from the receiver receive-pack: make it a builtin is_directory(): a generic helper function	2008-09-25 09:39:24 -07:00
Shawn O. Pearce	58245a5e40	Merge branch 'jc/safe-c-l-d' * jc/safe-c-l-d: safe_create_leading_directories(): make it about "leading" directories	2008-09-25 08:50:01 -07:00
Junio C Hamano	3791f77c28	Merge branch 'maint' * maint: sha1_file: link() returns -1 on failure, not errno Make git archive respect core.autocrlf when creating zip format archives Add new test to demonstrate git archive core.autocrlf inconsistency gitweb: avoid warnings for commits without body Clarified gitattributes documentation regarding custom hunk header. git-svn: fix handling of even funkier branch names git-svn: Always create a new RA when calling do_switch for svn:// git-svn: factor out svnserve test code for later use diff/diff-files: do not use --cc too aggressively	2008-09-18 20:30:12 -07:00
Thomas Rast	e32c0a9c38	sha1_file: link() returns -1 on failure, not errno `5723fe7` (Avoid cross-directory renames and linking on object creation, 2008-06-14) changed the call to use link() directly instead of through a custom wrapper, but forgot that it returns 0 or -1, not 0 or errno. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-09-18 19:51:13 -07:00
Junio C Hamano	d79796bcf0	push: receiver end advertises refs from alternate repositories Earlier, when pushing into a repository that borrows from alternate object stores, we followed the longstanding design decision not to trust refs in the alternate repository that houses the object store we are borrowing from. If your public repository is borrowing from Linus's public repository, you pushed into it long time ago, and now when you try to push your updated history that is in sync with more recent history from Linus, you will end up sending not just your own development, but also the changes you acquired through Linus's tree, even though the objects needed for the latter already exists at the receiving end. This is because the receiving end does not advertise that the objects only reachable from the borrowed repository (i.e. Linus's) are already available there. This solves the issue by making the receiving end advertise refs from borrowed repositories. They are not sent with their true names but with a phoney name ".have" to make sure that the old senders will safely ignore them (otherwise, the old senders will misbehave, trying to push matching refs, and mirror push that deletes refs that only exist at the receiving end). Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-09-09 09:27:46 -07:00
Junio C Hamano	90b4a71c49	is_directory(): a generic helper function A simple "grep -e stat --and -e S_ISDIR" revealed there are many open-coded implementations of this function. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-09-09 09:27:45 -07:00
Junio C Hamano	5f0bdf50c2	safe_create_leading_directories(): make it about "leading" directories We used to allow callers to pass "foo/bar/" to make sure both "foo" and "foo/bar" exist and have good permissions, but this interface is too error prone. If a caller mistakenly passes a path with trailing slashes (perhaps it forgot to verify the user input) even when it wants to later mkdir "bar" itself, it will find that it cannot mkdir "bar". If such a caller does not bother to check the error for EEXIST, it may even errorneously die(). Because we have no existing callers to use that obscure feature, this patch removes it to avoid confusion. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-09-03 22:35:32 -07:00
Junio C Hamano	5a1e8707a6	Merge branch 'np/verify-pack' * np/verify-pack: discard revindex data when pack list changes	2008-08-27 16:39:46 -07:00
Nicolas Pitre	4b480c6716	discard revindex data when pack list changes This is needed to fix verify-pack -v with multiple pack arguments. Also, in theory, revindex data (if any) must be discarded whenever reprepare_packed_git() is called. In practice this is hard to trigger though. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-08-22 22:00:22 -07:00
Junio C Hamano	d8eec50468	Merge branch 'dp/hash-literally' * dp/hash-literally: add --no-filters option to git hash-object add --path option to git hash-object use parse_options() in git hash-object correct usage help string for git-hash-object correct argument checking test for git hash-object teach index_fd to work with pipes	2008-08-19 21:43:25 -07:00
Steven Grimm	ddd63e64e4	Optimize sha1_object_info for loose objects, not concurrent repacks When dealing with a repository with lots of loose objects, sha1_object_info would rescan the packs directory every time an unpacked object was referenced before finally giving up and looking for the loose object. This caused a lot of extra unnecessary system calls during git pack-objects; the code was rereading the entire pack directory once for each loose object file. This patch looks for a loose object before falling back to rescanning the pack directory, rather than the other way around. Signed-off-by: Steven Grimm <koreth@midwinter.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-08-05 21:21:20 -07:00
Dmitry Potapov	43df4f86e0	teach index_fd to work with pipes index_fd can now work with file descriptors that are not normal files but any readable file. If the given file descriptor is a regular file then mmap() is used; for other files, strbuf_read is used. The path parameter, which has been used as hint for filters, can be NULL now to indicate that the file should be hashed literally without any filter. The index_pipe function is removed as redundant. Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-08-03 13:14:35 -07:00
Nicolas Pitre	ac9391093f	restore legacy behavior for read_sha1_file() Since commit `8eca0b47ff`, it is possible for read_sha1_file() to return NULL even with existing objects when they are corrupted. Previously a corrupted object would have terminated the program immediately, effectively making read_sha1_file() return NULL only when specified object is not found. Let's restore this behavior for all users of read_sha1_file() and provide a separate function with the ability to not terminate when bad objects are encountered. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-07-14 23:35:32 -07:00
Junio C Hamano	948e7471e0	Merge branch 'sp/maint-pack-memuse' * sp/maint-pack-memuse: Correct pack memory leak causing git gc to try to exceed ulimit Conflicts: sha1_file.c	2008-07-09 14:46:46 -07:00
Shawn O. Pearce	eac12e2d4d	Correct pack memory leak causing git gc to try to exceed ulimit When recursing to unpack a delta base we must unuse_pack() so that the pack window for the current object does not remain pinned in memory while the delta base is itself being unpacked and materialized for our use. On a long delta chain of 50 objects we may need to access 6 different windows from a very large (>3G) pack file in order to obtain all of the delta base content. If the process ulimit permits us to map/allocate only 1.5G we must release windows during this recursion to ensure we stay within the ulimit and transition memory from pack cache to standard malloc, or other mmap needs. Inserting an unuse_pack() call prior to the recursion allows us to avoid pinning the current window, making it available for garbage collection if memory runs low. This has been broken since at least before 1.5.1-rc1, and very likely earlier than that. Its fixed now. :) Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-07-09 14:45:42 -07:00
Ramsay Jones	6e1c23442a	Fix some warnings (on cygwin) to allow -Werror When printing valuds of type uint32_t, we should use PRIu32, and should not assume that it is unsigned int. On 32-bit platforms, it could be defined as unsigned long. The same caution applies to ntohl(). Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-07-05 17:26:29 -07:00
Junio C Hamano	bb1ab2db08	Merge branch 'j6t/mingw' * j6t/mingw: (38 commits) compat/pread.c: Add a forward declaration to fix a warning Windows: Fix ntohl() related warnings about printf formatting Windows: TMP and TEMP environment variables specify a temporary directory. Windows: Make 'git help -a' work. Windows: Work around an oddity when a pipe with no reader is written to. Windows: Make the pager work. When installing, be prepared that template_dir may be relative. Windows: Use a relative default template_dir and ETC_GITCONFIG Windows: Compute the fallback for exec_path from the program invocation. Turn builtin_exec_path into a function. Windows: Use a customized struct stat that also has the st_blocks member. Windows: Add a custom implementation for utime(). Windows: Add a new lstat and fstat implementation based on Win32 API. Windows: Implement a custom spawnve(). Windows: Implement wrappers for gethostbyname(), socket(), and connect(). Windows: Work around incompatible sort and find. Windows: Implement asynchronous functions as threads. Windows: Disambiguate DOS style paths from SSH URLs. Windows: A rudimentary poll() emulation. Windows: Implement start_command(). ...	2008-07-02 21:57:52 -07:00
Junio C Hamano	abf7e0df17	Merge branch 'lt/config-fsync' * lt/config-fsync: Add config option to enable 'fsync()' of object files Split up default "i18n" and "branch" config parsing into helper routines Split up default "user" config parsing into helper routine Split up default "core" config parsing into helper routine	2008-06-25 13:19:49 -07:00
Jeff King	2beebd22f4	clone: create intermediate directories of destination repo The shell version used to use "mkdir -p" to create the repo path, but the C version just calls "mkdir". Let's replicate the old behavior. We have to create the git and worktree leading dirs separately; while most of the time, the worktree dir contains the git dir (as .git), the user can override this using GIT_WORK_TREE. We can reuse safe_create_leading_directories, but we need to make a copy of our const buffer to do so. Since merge-recursive uses the same pattern, we can factor this out into a global function. This has two other cleanup advantages for merge-recursive: 1. mkdir_p wasn't a very good name. "mkdir -p foo/bar" actually creates bar, but this function just creates the leading directories. 2. mkdir_p took a mode argument, but it was completely ignored. Acked-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-25 11:44:15 -07:00
Nicolas Pitre	99093238bb	optimize verify-pack a bit Using find_pack_entry_one() to get object offsets is rather suboptimal when nth_packed_object_offset() can be used directly. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-24 23:58:57 -07:00
Jeff King	8e21d63b02	clone: create intermediate directories of destination repo The shell version used to use "mkdir -p" to create the repo path, but the C version just calls "mkdir". Let's replicate the old behavior. We have to create the git and worktree leading dirs separately; while most of the time, the worktree dir contains the git dir (as .git), the user can override this using GIT_WORK_TREE. We can reuse safe_create_leading_directories, but we need to make a copy of our const buffer to do so. Since merge-recursive uses the same pattern, we can factor this out into a global function. This has two other cleanup advantages for merge-recursive: 1. mkdir_p wasn't a very good name. "mkdir -p foo/bar" actually creates bar, but this function just creates the leading directories. 2. mkdir_p took a mode argument, but it was completely ignored. Acked-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-24 23:23:21 -07:00
Nicolas Pitre	27d69a465d	refactor pack structure allocation New pack structures are currently allocated in 2 different places and all members have to be initialized explicitly. This is prone to errors leading to segmentation faults as found by Teemu Likonen. Let's have a common place where this structure is allocated, and have all members explicitly initialized to zero. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-24 17:03:44 -07:00
Nicolas Pitre	8eca0b47ff	implement some resilience against pack corruptions We should be able to fall back to loose objects or alternative packs when a pack becomes corrupted. This is especially true when an object exists in one pack only as a delta but its base object is corrupted. Currently there is no way to retrieve the former object even if the later is available in another pack or loose. This patch allows for a delta to be resolved (with a performance cost) using a base object from a source other than the pack where that delta is located. Same thing for non-delta objects: rather than failing outright, a search is made in other packs or used loose when the currently active pack has it but corrupted. Of course git will become extremely noisy with error messages when that happens. However, if the operation succeeds nevertheless, a simple 'git repack -a -f -d' will "fix" the corrupted repository given that all corrupted objects have a good duplicate somewhere in the object store, possibly manually copied from another source. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-23 21:29:33 -07:00
Patrick Higgins	6ff6af62ec	Workaround for AIX mkstemp() The AIX mkstemp will modify it's template parameter to an empty string if the call fails. This caused a subsequent mkdir to fail. Signed-off-by: Patrick Higgins <patrick.higgins@cexp.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-23 16:13:38 -07:00
Johannes Sixt	8385abfda5	Windows: Handle absolute paths in safe_create_leading_directories(). In this function we must be careful to handle drive-local paths else there is a danger that it runs into an infinite loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>	2008-06-23 13:30:27 +02:00
Johannes Sixt	80ba074f41	Windows: Use the Windows style PATH separator ';'. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>	2008-06-22 11:32:45 +02:00
Linus Torvalds	aafe9fbaf4	Add config option to enable 'fsync()' of object files As explained in the documentation[] this is totally useless on filesystems that do ordered/journalled data writes, but it can be a useful safety feature on filesystems like HFS+ that only journal the metadata, not the actual file contents. It defaults to off, although we could presumably in theory some day auto-enable it on a per-filesystem basis. [] Yes, I updated the docs for the thing. Hell really _has_ frozen over, and the four horsemen are probably just beyond the horizon. EVERYBODY PANIC! Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-18 16:50:35 -07:00
Junio C Hamano	79c6dca413	sha1_file.c: simplify parse_pack_index() It was implemented as a thin wrapper around an otherwise unused helper function parse_pack_index_file(). The code becomes simpler and easier to read by consolidating the two. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-16 22:19:00 -07:00
Junio C Hamano	3bfaf01857	create_tempfile: make sure that leading directories can be accessible by peers In a shared repository, we should make sure adjust_shared_perm() is called after creating the initial fan-out directories under objects/ directory. Earlier an logico called the function only when mkdir() failed; we should do so when mkdir() succeeded. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-16 22:02:12 -07:00
Linus Torvalds	1421c5f274	write_loose_object: don't bother trying to read an old object Before even calling this, all callers have done a "has_sha1_file(sha1)" or "has_loose_object(sha1)" check, so there is no point in doing a second check. If something races with us on object creation, we handle that in the final link() that moves it to the right place. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-16 21:46:47 -07:00
Linus Torvalds	c529d75a75	Simplify and rename find_sha1_file() Now that we've made the loose SHA1 file reading more careful and streamlined, we only use the old find_sha1_file() function for checking whether a loose object file exists at all. As such, the whole 'return stat information' part of it was just pointless (nobody cares any more), and the naming of the function is not really all that relevant either. So simplify it to not do a 'stat()', but just an existence check (which is what the callers want), and rename it to 'has_loose_object()' which matches the use. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-14 14:39:22 -07:00
Linus Torvalds	44d1c19ee8	Make loose object file reading more careful We used to do 'stat()+open()+mmap()+close()' to read the loose object file data, which does work fine, but has a couple of problems: - it unnecessarily walks the filename twice (at 'stat()' time and then again to open it) - NFS generally has open-close consistency guarantees, which means that the initial 'stat()' was technically done outside of the normal consistency rules. So change it to do 'open()+fstat()+mmap()+close()' instead, which avoids both these issues. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-14 14:39:22 -07:00
Linus Torvalds	5723fe7e3c	Avoid cross-directory renames and linking on object creation Instead of creating new temporary objects in the top-level git object directory, create them in the same directory they will finally end up in anyway. This avoids making the final atomic "rename to stable name" operation be a cross-directory event, which makes it a lot easier for various filesystems. Several filesystems do things like change the inode number when moving files across directories (or refuse to do it entirely). In particular, it can also cause problems for NFS implementations that change the filehandle of a file when it moves to a different directory, like the old user-space NFS server did, and like the Linux knfsd still does if you don't export your filesystems with 'no_subtree_check' or if you export a filesystem that doesn't have stable inode numbers across renames). This change also obviously implies creating the object fan-out subdirectory at tempfile creation time, rather than at the final move_temp_to_file() time. Which actually accounts for most of the size of the patch. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-14 14:39:22 -07:00
Junio C Hamano	6483925999	sha1_file.c: dead code removal write_sha1_from_fd() and write_sha1_to_fd() were dead code nobody called, neither the latter's helper repack_object() was. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-13 23:00:51 -07:00
Linus Torvalds	e9039dd351	Consolidate SHA1 object file close This consolidates the common operations for closing the new temporary file that we have written, before we move it into place with the final name. There's some common code there (make it read-only and check for errors on close), but more importantly, this also gives a single place to add an fsync_or_die() call if we want to add a safe mode. This was triggered due to Denis Bueno apparently twice being able to corrupt his git repository on OS X due to an unlucky combination of kernel crashes and a not-very-robust filesystem. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-06-10 22:23:18 -07:00
Junio C Hamano	6eec46bdda	fix sha1_pack_index_name() An earlier commit `633f43e` (Remove redundant code, eliminate one static variable, 2008-05-24) had a thinko (perhaps an eyeno) that broke sha1_pack_index_name() function. One symptom of this was that the http walker is now completely broken. This should fix it. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-05-28 10:24:32 -07:00
Junio C Hamano	b84c343c88	Merge branch 'db/clone-in-c' * db/clone-in-c: Add test for cloning with "--reference" repo being a subset of source repo Add a test for another combination of --reference Test that --reference actually suppresses fetching referenced objects clone: fall back to copying if hardlinking fails builtin-clone.c: Need to closedir() in copy_or_link_directory() builtin-clone: fix initial checkout Build in clone Provide API access to init_db() Add a function to set a non-default work tree Allow for having for_each_ref() list extra refs Have a constant extern refspec for "--tags" Add a library function to add an alternate to the alternates file Add a lockfile function to append to a file Mark the list of refs to fetch as const Conflicts: cache.h t/t5700-clone-reference.sh	2008-05-25 13:41:37 -07:00
Heikki Orsila	633f43e1f7	Remove redundant code, eliminate one static variable Signed-off-by: Heikki Orsila <heikki.orsila@iki.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-05-24 22:05:06 -07:00
Nicolas Pitre	bbac73117e	add a force_object_loose() function This is meant to force the creation of a loose object even if it already exists packed. Needed for the next commit. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-05-13 22:42:33 -07:00
Daniel Barkalow	bef70b22ba	Add a library function to add an alternate to the alternates file This is in the core so that, if the alternates file has already been read, the addition can be parsed and put into effect for the current process. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-05-04 17:41:44 -07:00
Heikki Orsila	c697ad143b	Cleanup xread() loops to use read_in_full() Signed-off-by: Heikki Orsila <heikki.orsila@iki.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-05-03 22:15:25 -07:00
Junio C Hamano	628522ec14	sha1-lookup: more memory efficient search in sorted list of SHA-1 Currently, when looking for a packed object from the pack idx, a simple binary search is used. A conventional binary search loop looks like this: unsigned lo, hi; do { unsigned mi = (lo + hi) / 2; int cmp = "entry pointed at by mi" minus "target"; if (!cmp) return mi; "mi is the wanted one" if (cmp > 0) hi = mi; "mi is larger than target" else lo = mi+1; "mi is smaller than target" } while (lo < hi); "did not find what we wanted" The invariants are: - When entering the loop, 'lo' points at a slot that is never above the target (it could be at the target), 'hi' points at a slot that is guaranteed to be above the target (it can never be at the target). - We find a point 'mi' between 'lo' and 'hi' ('mi' could be the same as 'lo', but never can be as high as 'hi'), and check if 'mi' hits the target. There are three cases: - if it is a hit, we have found what we are looking for; - if it is strictly higher than the target, we set it to 'hi', and repeat the search. - if it is strictly lower than the target, we update 'lo' to one slot after it, because we allow 'lo' to be at the target and 'mi' is known to be below the target. If the loop exits, there is no matching entry. When choosing 'mi', we do not have to take the "middle" but anywhere in between 'lo' and 'hi', as long as lo <= mi < hi is satisfied. When we somehow know that the distance between the target and 'lo' is much shorter than the target and 'hi', we could pick 'mi' that is much closer to 'lo' than (hi+lo)/2, which a conventional binary search would pick. This patch takes advantage of the fact that the SHA-1 is a good hash function, and as long as there are enough entries in the table, we can expect uniform distribution. An entry that begins with for example "deadbeef..." is much likely to appear much later than in the midway of a reasonably populated table. In fact, it can be expected to be near 87% (222/256) from the top of the table. This is a work-in-progress and has switches to allow easier experiments and debugging. Exporting GIT_USE_LOOKUP environment variable enables this code. On my admittedly memory starved machine, with a partial KDE repository (3.0G pack with 95M idx): $ GIT_USE_LOOKUP=t git log -800 --stat HEAD >/dev/null 3.93user 0.16system 0:04.09elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+55588minor)pagefaults 0swaps Without the patch, the numbers are: $ git log -800 --stat HEAD >/dev/null 4.00user 0.15system 0:04.17elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+60258minor)pagefaults 0swaps In the same repository: $ GIT_USE_LOOKUP=t git log -2000 HEAD >/dev/null 0.12user 0.00system 0:00.12elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+4241minor)pagefaults 0swaps Without the patch, the numbers are: $ git log -2000 HEAD >/dev/null 0.05user 0.01system 0:00.07elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+8506minor)pagefaults 0swaps There isn't much time difference, but the number of minor faults seems to show that we are touching much smaller number of pages, which is expected. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-04-09 01:23:52 -07:00
Nicolas Pitre	70f5d5d31c	fix unimplemented packed_object_info_detail() features Since commit `eb32d236df`, there was a TODO comment in packed_object_info_detail() about the SHA1 of base object to OBJ_OFS_DELTA objects. So here it is at last. While at it, providing the actual storage size information as well is now trivial. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-03-01 01:44:46 -08:00
Junio C Hamano	c484166374	Merge branch 'jk/empty-tree' * jk/empty-tree: add--interactive: handle initial commit better hard-code the empty tree object	2008-02-20 16:13:28 -08:00
Junio C Hamano	ee4f06c0a6	Merge branch 'mk/maint-parse-careful' * mk/maint-parse-careful: peel_onion: handle NULL check return value from parse_commit() in various functions parse_commit: don't fail, if object is NULL revision.c: handle tag->tagged == NULL reachable.c::process_tree/blob: check for NULL process_tag: handle tag->tagged == NULL check results of parse_commit in merge_bases list-objects.c::process_tree/blob: check for NULL reachable.c::add_one_tree: handle NULL from lookup_tree mark_blob/tree_uninteresting: check for NULL get_sha1_oneline: check return value of parse_object read_object_with_reference: don't read beyond the buffer	2008-02-18 20:56:01 -08:00
Martin Koegler	50974ec994	read_object_with_reference: don't read beyond the buffer Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-18 19:20:17 -08:00
Jeff King	346245a1bb	hard-code the empty tree object Now any commands may reference the empty tree object by its sha1 (`4b825dc642`). This is useful for showing some diffs, especially for initial commits. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-13 13:44:17 -08:00

1 2 3 4 5 ...

554 Commits