git-commit-vandalism

Author	SHA1	Message	Date
Junio C Hamano	3c85452bb0	Merge branch 'rs/ref-transaction' The API to update refs have been restructured to allow introducing a true transactional updates later. We would even allow storing refs in backends other than the traditional filesystem-based one. * rs/ref-transaction: (25 commits) ref_transaction_commit: bail out on failure to remove a ref lockfile: remove unable_to_lock_error refs.c: do not permit err == NULL remote rm/prune: print a message when writing packed-refs fails for-each-ref: skip and warn about broken ref names refs.c: allow listing and deleting badly named refs test: put tests for handling of bad ref names in one place packed-ref cache: forbid dot-components in refnames branch -d: simplify by using RESOLVE_REF_READING branch -d: avoid repeated symref resolution reflog test: test interaction with detached HEAD refs.c: change resolve_ref_unsafe reading argument to be a flags field refs.c: make write_ref_sha1 static fetch.c: change s_update_ref to use a ref transaction refs.c: ref_transaction_commit: distinguish name conflicts from other errors refs.c: pass a list of names to skip to is_refname_available refs.c: call lock_ref_sha1_basic directly from commit refs.c: refuse to lock badly named refs in lock_ref_sha1_basic rename_ref: don't ask read_ref_full where the ref came from refs.c: pass the ref log message to _create/delete/update instead of _commit ...	2014-10-21 13:28:10 -07:00
Junio C Hamano	f13f9b0eab	mergetool: rename bc3 to bc Beyond Compare version 4 works the same way as version 3, so rename the existing "bc3" adaptor to just "bc", while keeping "bc3" as a backward compatible wrapper. Noticed-by: Olivier Croquette <ocroquette@free.fr> Helped-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-21 11:25:30 -07:00
Nguyễn Thái Ngọc Duy	03e11a715b	dir.c: remove the second declaration of "stk" in prep_exclude() This "stk" shadows the first declaration at the top. There's currently no bad effect. But let's avoid it. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-21 11:22:00 -07:00
Stefan Beller	8b148bf085	.mailmap: add Stefan Bellers corporate mail address Note that despite the private address being first and primary, Google owns the copyright on this patch as any other patch I'll be sending signed off by the sbeller@google.com address. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-21 11:01:27 -07:00
Stefan Beller	dc76c7f783	transport: free leaking head in transport_print_push_status() Found by scan.coverity.com (ID: 1248110) Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-21 11:01:18 -07:00
Junio C Hamano	13da0fc092	Update draft release notes to 2.2 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-20 13:07:32 -07:00
Junio C Hamano	9d04401ffe	Merge branch 'cc/interpret-trailers' A new filter to programatically edit the tail end of the commit log messages. * cc/interpret-trailers: Documentation: add documentation for 'git interpret-trailers' trailer: add tests for commands in config file trailer: execute command from 'trailer.<name>.command' trailer: add tests for "git interpret-trailers" trailer: add interpret-trailers command trailer: put all the processing together and print trailer: parse trailers from file or stdin trailer: process command line trailer arguments trailer: read and process config information trailer: process trailers from input message and arguments trailer: add data structures and basic functions	2014-10-20 12:25:32 -07:00
Junio C Hamano	7df3b072a9	Merge branch 'rm/gitweb-start-form' * rm/gitweb-start-form: gitweb: use start_form, not startform that was removed in CGI.pm 4.04	2014-10-20 12:25:27 -07:00
Junio C Hamano	9c6be8b5ab	Merge branch 'ss/contrib-subtree-contacts' * ss/contrib-subtree-contacts: contacts: add a Makefile to generate docs and install subtree: add an install-html target	2014-10-20 12:25:16 -07:00
Junio C Hamano	b946576839	Merge branch 'jn/parse-config-slot' Code cleanup. * jn/parse-config-slot: color_parse: do not mention variable name in error message pass config slots as pointers instead of offsets	2014-10-20 12:23:48 -07:00
Junio C Hamano	b67588d018	Merge branch 'rs/receive-pack-argv-leak-fix' * rs/receive-pack-argv-leak-fix: receive-pack: plug minor memory leak in unpack()	2014-10-20 12:23:45 -07:00
Junio C Hamano	713ee7fe46	Merge branch 'ta/config-set' * ta/config-set: t1308: fix broken here document in test script	2014-10-20 12:23:43 -07:00
Junio C Hamano	f9a2fd3616	Merge branch 'jk/test-shell-trace' Test scripts were taught to notice "-x" option to show shell trace, as if the tests were run under "sh -x". * jk/test-shell-trace: test-lib.sh: support -x option for shell-tracing t5304: use helper to report failure of "test foo = bar" t5304: use test_path_is_* instead of "test -f"	2014-10-20 12:23:40 -07:00
Junio C Hamano	6459cf8cdf	Merge branch 'bc/asciidoc' Formatting nitpicks to help a (pickier) reimplementation of AsciiDoc to grok our documentation. * bc/asciidoc: Documentation: fix mismatched delimiters in git-imap-send Documentation: adjust document title underlining	2014-10-20 12:23:30 -07:00
Junio C Hamano	15c6ef7b06	Revert "archive: honor tar.umask even for pax headers" This reverts commit `10f343ea81`, whose output is no longer bit-for-bit equivalent from the older versions of Git, which the infrastructure to (pretend to) upload tarballs kernel.org uses depends on.	2014-10-20 12:04:46 -07:00
Torsten Bögershausen	ecdab41267	core.filemode may need manual action core.filemode is set automatically when a repo is created. But when a repo is exported via CIFS or cygwin is mixed with Git for Windows or Eclipse core.filemode may better be set manually to false. Update and improve the documentation Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 20:47:40 -07:00
Philip Oakley	7c45cee6bb	doc: fix 'git status --help' character quoting Correct backtick quoting for some of the modification states to give consistent web rendering. Signed-off-by: Philip Oakley <philipoakley@iee.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 20:45:16 -07:00
W. Trevor King	7544b2e2da	t1304: Set LOGNAME even if USER is unset or null Avoid: # ./t1304-default-acl.sh ok 1 - checking for a working acl setup ok 2 - Setup test repo not ok 3 - Objects creation does not break ACLs with restrictive umask # # # SHA1 for empty blob # check_perms_and_acl .git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 # not ok 4 - git gc does not break ACLs with restrictive umask # # git gc && # check_perms_and_acl .git/objects/pack/*.pack # # failed 2 among 4 test(s) 1..4 on systems where USER isn't set. It's usually set by the login process, but it isn't set when launching some Docker images. For example: $ docker run --rm debian env HOME=/ PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOSTNAME=b2dfdfe797ed 'id -u -n' has been in POSIX from Issue 2 through 2013 [1], so I don't expect compatibility issues. [1]: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/id.html Signed-off-by: W. Trevor King <wking@tremily.us> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:47:20 -07:00
Etienne Buira	0f4b6db3ba	Handle atexit list internaly for unthreaded builds Wrap atexit()s calls on unthreaded builds to handle callback list internally. This is needed because on unthreaded builds, asyncs inherits parent's atexit() list, that gets run as soon as the async exit()s (and again at the end of async's parent process). That led to remove temporary files too early. Also remove a by-atexit-callback guard against this kind of issue in clone.c, as this patch makes it redundant. Fixes test 5537 (temporary shallow file vanished before unpack-objects could open it) BTW remove an unused variable in shallow.c. Helped-by: Duy Nguyen <pclouds@gmail.com> Helped-by: Andreas Schwab <schwab@linux-m68k.org> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Etienne Buira <etienne.buira@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:38:30 -07:00
Jeff King	189a122249	drop add_object_array_with_mode This is a thin compatibility wrapper around add_pending_object_with_path. But the only caller is add_object_array, which is itself just a thin compatibility wrapper. There are no external callers, so we can just remove this middle wrapper. Noticed-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:28:30 -07:00
Ramsay Jones	d7702be1e1	revision: remove definition of unused 'add_object' function Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:27:24 -07:00
René Scharfe	a915459097	use env_array member of struct child_process Convert users of struct child_process to using the managed env_array for specifying environment variables instead of supplying an array on the stack or bringing their own argv_array. This shortens and simplifies the code and ensures automatically that the allocated memory is freed after use. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:26:34 -07:00
René Scharfe	19a583dc39	run-command: add env_array, an optional argv_array for env Similar to args, add a struct argv_array member to struct child_process that simplifies specifying the environment for children. It is freed automatically by finish_command() or if start_command() encounters an error. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:26:31 -07:00
Jeff King	2113471478	pack-objects: turn off bitmaps when we split packs If a pack.packSizeLimit is set, we may split the pack data across multiple packfiles. This means we cannot generate .bitmap files, as they require that all of the reachable objects are in the same pack. We check that condition when we are generating the list of objects to pack (and disable bitmaps if we are not packing everything), but we forgot to update it when we notice that we needed to split (which doesn't happen until the actual write phase). The resulting bitmaps are quite bogus (they mention entries that do not exist in the pack!) and can cause a fetch or push to send insufficient objects. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:08:38 -07:00
Jeff King	b1e757f363	pack-objects: double-check options before discarding objects When we are given an expiration time like --unpack-unreachable=2.weeks.ago, we avoid writing out old, unreachable loose objects entirely, under the assumption that running "prune" would simply delete them immediately anyway. However, this is only valid if we computed the same set of reachable objects as prune would. In practice, this is the case, because only git-repack uses the --unpack-unreachable option with an expiration, and it always feeds as many objects into the pack as possible. But we can double-check at runtime just to be sure. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:07:07 -07:00
Jeff King	c90f9e13ab	repack: pack objects mentioned by the index When we pack all objects, we use only the objects reachable from references and reflogs. This misses any objects which are reachable from the index, but not yet referenced. By itself this isn't a big deal; the objects can remain loose until they are actually used in a commit. However, it does create a problem when we drop packed but unreachable objects. We try to optimize out the writing of objects that we will immediately prune, which means we must follow the same rules as prune in determining what is reachable. And prune uses the index for this purpose. This is rather uncommon in practice, as objects in the index would not usually have been packed in the first place. But it could happen in a sequence like: 1. You make a commit on a branch that references blob X. 2. You repack, moving X into the pack. 3. You delete the branch (and its reflog), so that X is unreferenced. 4. You "git add" blob X so that it is now referenced only by the index. 5. You repack again with git-gc. The pack-objects we invoke will see that X is neither referenced nor recent and not bother loosening it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:07:07 -07:00
Jeff King	edfbb2aa53	pack-objects: use argv_array This saves us from having to bump the rp_av count when we add new traversal options. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:07:07 -07:00
Jeff King	1be111d88f	reachable: use revision machinery's --indexed-objects code This does the same thing as our custom code, so let's not repeat ourselves. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:07:07 -07:00
Jeff King	4fe10219bc	rev-list: add --indexed-objects option There is currently no easy way to ask the revision traversal machinery to include objects reachable from the index (e.g., blobs and trees that have not yet been committed). This patch adds an option to do so. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:07:07 -07:00
Jeff King	41d018d146	rev-list: document --reflog option This is mostly used internally, but it does not hurt to explain it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:07:07 -07:00
Jeff King	458a7e508c	t5516: test pushing a tag of an otherwise unreferenced blob It's not unreasonable to have a tag that points to a blob that is not part of the normal history. We do this in git.git to distribute gpg keys. However, we never explicitly checked in our test suite that this actually works (i.e., that pack-objects actually sends the blob because of the tag mentioning it). It does in fact work fine, but a recent patch under discussion broke this, and the test suite didn't notice. Let's make the test suite more complete. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:07:06 -07:00
Jeff King	207394908e	traverse_commit_list: support pending blobs/trees with paths When we call traverse_commit_list, we may have trees and blobs in the pending array. As we process these, we pass the "name" field from the pending entry as the path of the object within the tree (which then becomes the root path if we recurse into subtrees). When we set up the traversal in prepare_revision_walk, though, the "name" field of any pending trees and blobs is likely to be the ref at which we found the object. We would not want to make this part of the path (e.g., doing so would make "git rev-list --objects v2.6.11-tree" in linux.git show paths like "v2.6.11-tree/Makefile", which is nonsensical). Therefore prepare_revision_walk sets the name field of each pending tree and blobs to the empty string. However, this leaves no room for a caller who does know the correct path of a pending object to propagate that information to the revision walker. We can fix this by making two related changes: 1. Use the "path" field as the path instead of the "name" field in traverse_commit_list. If the path is not set, default to "" (which is what we always ended up with in the current code, because of prepare_revision_walk). 2. In prepare_revision_walk, make a complete copy of the entry. This makes the path field available to the walker (if there is one), solving our problem. Leaving the name field intact is now OK, as we do not use it as a path due to point (1) above (and we can use it to make more meaningful error messages if we want). We also make the original "mode" field available to the walker, though it does not actually use it. Note that we still re-add the pending objects and free the old ones (so we may strdup the path and name only to free the old ones). This could be made more efficient by simply copying the object_array entries that we are keeping. However, that would require more restructuring of the code, and is not done here. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:06:31 -07:00
Junio C Hamano	98349e5364	Merge branch 'jc/completion-no-chdir' * jc/completion-no-chdir: completion: use "git -C $there" instead of (cd $there && git ...)	2014-10-16 14:16:49 -07:00
Junio C Hamano	c11dc64722	Merge branch 'bw/trace-no-inline-getnanotime' No file-scope static variables in an inlined function, please. * bw/trace-no-inline-getnanotime: trace.c: do not mark getnanotime() as "inline"	2014-10-16 14:16:45 -07:00
Junio C Hamano	1cb3324e61	Merge branch 'po/everyday-doc' "git help everyday" to show the Everyday Git document. * po/everyday-doc: doc: add 'everyday' to 'git help' doc: Makefile regularise OBSOLETE_HTML list building doc: modernise everyday.txt wording and format in man page style	2014-10-16 14:16:42 -07:00
Roland Mas	4750f4b962	gitweb: use start_form, not startform that was removed in CGI.pm 4.04 CGI.pm 4.04 removed the startform method, which had previously been deprecated in favour of start_form. Changes file for CGI.pm says: 4.04 2014-09-04 [ REMOVED / DEPRECATIONS ] - startform and endform methods removed (previously deprecated, you should be using the start_form and end_form methods) Signed-off-by: Roland Mas <lolando@debian.org> Reviewed-by: Jakub Narębski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 13:12:34 -07:00
David Aguilar	688684eba4	t7610-mergetool: add test cases for mergetool.writeToTemp Add tests to ensure that filenames start with "./" when mergetool.writeToTemp is false and do not start with "./" when mergetool.writeToTemp is true. Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 12:09:51 -07:00
David Aguilar	8f0cb41da2	mergetool: add an option for writing to a temporary directory Teach mergetool to write files in a temporary directory when 'mergetool.writeToTemp' is true. This is helpful for tools such as Eclipse which cannot cope with multiple copies of the same file in the worktree. Suggested-by: Charles Bailey <charles@hashpling.org> Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 12:09:51 -07:00
David Aguilar	eab335c46d	mergetool: use more conservative temporary filenames Avoid filenames with multiple dots so that overly-picky tools do not misinterpret their extension. Previously, foo/bar.ext in the worktree would result in e.g. ./foo/bar.ext.BASE.1234.ext This can be improved by having only a single .ext and using underscore instead of dot so that the extension cannot be misinterpreted. The resulting path becomes: ./foo/bar_BASE_1234.ext Suggested-by: Sergio Ferrero <sferrero@ensoftcorp.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 12:07:48 -07:00
David Aguilar	9e8f8dea46	test-lib-functions: adjust style to match CodingGuidelines Prefer "test" over "[ ]" for conditionals. Prefer "$()" over backticks for command substitutions. Avoid control structures on a single line with semicolons. Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 12:04:05 -07:00
David Aguilar	d7d300ea59	t7610-mergetool: use test_config to isolate tests Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 12:03:21 -07:00
David Aguilar	b12d04503b	mergetools/meld: make usage of `--output` configurable and more robust Older versions of meld listed --output in `meld --help`. Newer versions only mention `meld [OPTIONS...]`. Improve the checks to catch these newer versions. Add a `mergetool.meld.hasOutput` configuration to allow overriding the heuristic. Reported-by: Andrey Novoseltsev <novoselt@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 11:58:11 -07:00
Jeff King	9e0c3c4fcd	make add_object_array_with_context interface more sane When you resolve a sha1, you can optionally keep any context found during the resolution, including the path and mode of a tree entry (e.g., when looking up "HEAD:subdir/file.c"). The add_object_array_with_context function lets you then attach that context to an entry in a list. Unfortunately, the interface for doing so is horrible. The object_context structure is large and most object_array users do not use it. Therefore we keep a pointer to the structure to avoid burdening other users too much. But that means when we do use it that we must allocate the struct ourselves. And the struct contains a fixed PATH_MAX-sized buffer, which makes this wholly unsuitable for any large arrays. We can observe that there is only a single user of the "with_context" variant: builtin/grep.c. And in that use case, the only element we care about is the path. We can therefore store only the path as a pointer (the context's mode field was redundant with the object_array_entry itself, and nobody actually cared about the surrounding tree). This still requires a strdup of the pathname, but at least we are only consuming the minimum amount of memory for each string. We can also handle the copying ourselves in add_object_array_*, and free it as appropriate in object_array_release_entry. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 10:10:44 -07:00
Jeff King	33d4221c79	write_sha1_file: freshen existing objects When we try to write a loose object file, we first check whether that object already exists. If so, we skip the write as an optimization. However, this can interfere with prune's strategy of using mtimes to mark files in progress. For example, if a branch contains a particular tree object and is deleted, that tree object may become unreachable, and have an old mtime. If a new operation then tries to write the same tree, this ends up as a noop; we notice we already have the object and do nothing. A prune running simultaneously with this operation will see the object as old, and may delete it. We can solve this by "freshening" objects that we avoid writing by updating their mtime. The algorithm for doing so is essentially the same as that of has_sha1_file. Therefore we provide a new (static) interface "check_and_freshen", which finds and optionally freshens the object. It's trivial to implement freshening and simple checking by tweaking a single parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 10:10:43 -07:00
Jeff King	abcb86553d	pack-objects: match prune logic for discarding objects A recent commit taught git-prune to keep non-recent objects that are reachable from recent ones. However, pack-objects, when loosening unreachable objects, tries to optimize out the write in the case that the object will be immediately pruned. It now gets this wrong, since its rule does not reflect the new prune code (and this can be seen by running t6501 with a strategically placed repack). Let's teach pack-objects similar logic. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 10:10:43 -07:00
Jeff King	d0d46abc16	pack-objects: refactor unpack-unreachable expiration check When we are loosening unreachable packed objects, we do not bother to process objects that would simply be pruned immediately anyway. The "would be pruned" check is a simple comparison, but is about to get more complicated. Let's pull it out into a separate function. Note that this is slightly less efficient than the original, which avoided even opening old packs, since no object in them could pass the current check, which cares only about the pack mtime. But the new rules will depend on the exact object, so we need to perform the check even for old packs. Note also that we fix a minor buglet when the pack mtime is exactly the same as the expiration time. The prune code considers that worth pruning, whereas our check here considered it worth keeping. This wasn't a big deal. Besides being unlikely to happen, the result was simply that the object was loosened and then pruned, missing the optimization. Still, we can easily fix it while we are here. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 10:10:42 -07:00
Jeff King	d3038d22f9	prune: keep objects reachable from recent objects Our current strategy with prune is that an object falls into one of three categories: 1. Reachable (from ref tips, reflogs, index, etc). 2. Not reachable, but recent (based on the --expire time). 3. Not reachable and not recent. We keep objects from (1) and (2), but prune objects in (3). The point of (2) is that these objects may be part of an in-progress operation that has not yet updated any refs. However, it is not always the case that objects for an in-progress operation will have a recent mtime. For example, the object database may have an old copy of a blob (from an abandoned operation, a branch that was deleted, etc). If we create a new tree that points to it, a simultaneous prune will leave our tree, but delete the blob. Referencing that tree with a commit will then work (we check that the tree is in the object database, but not that all of its referred objects are), as will mentioning the commit in a ref. But the resulting repo is corrupt; we are missing the blob reachable from a ref. One way to solve this is to be more thorough when referencing a sha1: make sure that not only do we have that sha1, but that we have objects it refers to, and so forth recursively. The problem is that this is very expensive. Creating a parent link would require traversing the entire object graph! Instead, this patch pushes the extra work onto prune, which runs less frequently (and has to look at the whole object graph anyway). It creates a new category of objects: objects which are not recent, but which are reachable from a recent object. We do not prune these objects, just like the reachable and recent ones. This lets us avoid the recursive check above, because if we have an object, even if it is unreachable, we should have its referent. We can make a simple inductive argument that with this patch, this property holds (that there are no objects with missing referents in the repository): 0. When we have no objects, we have nothing to refer or be referred to, so the property holds. 1. If we add objects to the repository, their direct referents must generally exist (e.g., if you create a tree, the blobs it references must exist; if you create a commit to point at the tree, the tree must exist). This is already the case before this patch. And it is not 100% foolproof (you can make bogus objects using `git hash-object`, for example), but it should be the case for normal usage. Therefore for any sequence of object additions, the property will continue to hold. 2. If we remove objects from the repository, then we will not remove a child object (like a blob) if an object that refers to it is being kept. That is the part implemented by this patch. Note, however, that our reachability check and the actual pruning are not atomic. So it _is_ still possible to violate the property (e.g., an object becomes referenced just as we are deleting it). This patch is shooting for eliminating problems where the mtimes of dependent objects differ by hours or days, and one is dropped without the other. It does nothing to help with short races. Naively, the simplest way to implement this would be to add all recent objects as tips to the reachability traversal. However, this does not perform well. In a recently-packed repository, all reachable objects will also be recent, and therefore we have to look at each object twice. This patch instead performs the reachability traversal, then follows up with a second traversal for recent objects, skipping any that have already been marked. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 10:10:42 -07:00
Jeff King	660c889e46	sha1_file: add for_each iterators for loose and packed objects We typically iterate over the reachable objects in a repository by starting at the tips and walking the graph. There's no easy way to iterate over all of the objects, including unreachable ones. Let's provide a way of doing so. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 10:10:41 -07:00
Jeff King	4a1e693a30	count-objects: use for_each_loose_file_in_objdir This drops our line count considerably, and should make things more readable by keeping the counting logic separate from the traversal. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 10:10:41 -07:00
Jeff King	cac05d4dfd	count-objects: do not use xsize_t when counting object size The point of xsize_t is to safely cast an off_t into a size_t (because we are about to mmap). But in count-objects, we are summing the sizes in an off_t. Using xsize_t means that count-objects could fail on a 32-bit system with a 4G object (not likely, as other parts of git would fail, but we should at least be correct here). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 10:10:41 -07:00

1 2 3 4 5 ...

38078 Commits