git-commit-vandalism

Author	SHA1	Message	Date
Junio C Hamano	d70e331c0e	Merge branch 'jk/prune-mtime' Tighten the logic to decide that an unreachable cruft is sufficiently old by covering corner cases such as an ancient object becoming reachable and then going unreachable again, in which case its retention period should be prolonged. * jk/prune-mtime: (28 commits) drop add_object_array_with_mode revision: remove definition of unused 'add_object' function pack-objects: double-check options before discarding objects repack: pack objects mentioned by the index pack-objects: use argv_array reachable: use revision machinery's --indexed-objects code rev-list: add --indexed-objects option rev-list: document --reflog option t5516: test pushing a tag of an otherwise unreferenced blob traverse_commit_list: support pending blobs/trees with paths make add_object_array_with_context interface more sane write_sha1_file: freshen existing objects pack-objects: match prune logic for discarding objects pack-objects: refactor unpack-unreachable expiration check prune: keep objects reachable from recent objects sha1_file: add for_each iterators for loose and packed objects count-objects: use for_each_loose_file_in_objdir count-objects: do not use xsize_t when counting object size prune-packed: use for_each_loose_file_in_objdir reachable: mark index blobs as SEEN ...	2014-10-29 10:07:56 -07:00
Jeff King	660c889e46	sha1_file: add for_each iterators for loose and packed objects We typically iterate over the reachable objects in a repository by starting at the tips and walking the graph. There's no easy way to iterate over all of the objects, including unreachable ones. Let's provide a way of doing so. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 10:10:41 -07:00
Jeff King	27e1e22d5e	prune: factor out loose-object directory traversal Prune has to walk $GIT_DIR/objects/?? in order to find the set of loose objects to prune. Other parts of the code (e.g., count-objects) want to do the same. Let's factor it out into a reusable for_each-style function. Note that this is not quite a straight code movement. The original code had strange behavior when it found a file of the form "[0-9a-f]{2}/.{38}" that did _not_ contain all hex digits. It executed a "break" from the loop, meaning that we stopped pruning in that directory (but still pruned other directories!). This was probably a bug; we do not want to process the file as an object, but we should keep going otherwise (and that is how the new code handles it). We are also a little more careful with loose object directories which fail to open. The original code silently ignored any failures, but the new code will complain about any problems besides ENOENT. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 10:10:39 -07:00
Jeff King	fe1b22686f	foreach_alt_odb: propagate return value from callback We check the return value of the callback and stop iterating if it is non-zero. However, we do not make the non-zero return value available to the caller, so they have no way of knowing whether the operation succeeded or not (technically they can keep their own error flag in the callback data, but that is unlike our other for_each functions). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 10:10:35 -07:00
Ronnie Sahlberg	d0f810f0bc	refs.c: allow listing and deleting badly named refs We currently do not handle badly named refs well: $ cp .git/refs/heads/master .git/refs/heads/master.....@\@\\. $ git branch fatal: Reference has invalid format: 'refs/heads/master.....@@\.' $ git branch -D master.....@\@\\. error: branch 'master.....@@\.' not found. Users cannot recover from a badly named ref without manually finding and deleting the loose ref file or appropriate line in packed-refs. Making that easier will make it easier to tweak the ref naming rules in the future, for example to forbid shell metacharacters like '`' and '"', without putting people in a state that is hard to get out of. So allow "branch --list" to show these refs and allow "branch -d/-D" and "update-ref -d" to delete them. Other commands (for example to rename refs) will continue to not handle these refs but can be changed in later patches. Details: In resolving functions, refuse to resolve refs that don't pass the git-check-ref-format(1) check unless the new RESOLVE_REF_ALLOW_BAD_NAME flag is passed. Even with RESOLVE_REF_ALLOW_BAD_NAME, refuse to resolve refs that escape the refs/ directory and do not match the pattern [A-Z_]* (think "HEAD" and "MERGE_HEAD"). In locking functions, refuse to act on badly named refs unless they are being deleted and either are in the refs/ directory or match [A-Z_]. Just like other invalid refs, flag resolved, badly named refs with the REF_ISBROKEN flag, treat them as resolving to null_sha1, and skip them in all iteration functions except for for_each_rawref. Flag badly named refs (but not symrefs pointing to badly named refs) with a REF_BAD_NAME flag to make it easier for future callers to notice and handle them specially. For example, in a later patch for-each-ref will use this flag to detect refs whose names can confuse callers parsing for-each-ref output. In the transaction API, refuse to create or update badly named refs, but allow deleting them (unless they try to escape refs/ and don't match [A-Z_]). Signed-off-by: Ronnie Sahlberg <sahlberg@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-15 10:47:26 -07:00
Jonathan Nieder	62a2d52514	branch -d: avoid repeated symref resolution If a repository gets in a broken state with too much symref nesting, it cannot be repaired with "git branch -d": $ git symbolic-ref refs/heads/nonsense refs/heads/nonsense $ git branch -d nonsense error: branch 'nonsense' not found. Worse, "git update-ref --no-deref -d" doesn't work for such repairs either: $ git update-ref -d refs/heads/nonsense error: unable to resolve reference refs/heads/nonsense: Too many levels of symbolic links Fix both by teaching resolve_ref_unsafe a new RESOLVE_REF_NO_RECURSE flag and passing it when appropriate. Callers can still read the value of a symref (for example to print a message about it) with that flag set --- resolve_ref_unsafe will resolve one level of symrefs and stop there. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Ronnie Sahlberg <sahlberg@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-15 10:47:25 -07:00
Ronnie Sahlberg	7695d118e5	refs.c: change resolve_ref_unsafe reading argument to be a flags field resolve_ref_unsafe takes a boolean argument for reading (a nonexistent ref resolves successfully for writing but not for reading). Change this to be a flags field instead, and pass the new constant RESOLVE_REF_READING when we want this behaviour. While at it, swap two of the arguments in the function to put output arguments at the end. As a nice side effect, this ensures that we can catch callers that were unaware of the new API so they can be audited. Give the wrapper functions resolve_refdup and read_ref_full the same treatment for consistency. Signed-off-by: Ronnie Sahlberg <sahlberg@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-15 10:47:24 -07:00
Junio C Hamano	bd107e1052	Merge branch 'mh/lockfile' The lockfile API and its users have been cleaned up. * mh/lockfile: (38 commits) lockfile.h: extract new header file for the functions in lockfile.c hold_locked_index(): move from lockfile.c to read-cache.c hold_lock_file_for_append(): restore errno before returning get_locked_file_path(): new function lockfile.c: rename static functions lockfile: rename LOCK_NODEREF to LOCK_NO_DEREF commit_lock_file_to(): refactor a helper out of commit_lock_file() trim_last_path_component(): replace last_path_elm() resolve_symlink(): take a strbuf parameter resolve_symlink(): use a strbuf for internal scratch space lockfile: change lock_file::filename into a strbuf commit_lock_file(): use a strbuf to manage temporary space try_merge_strategy(): use a statically-allocated lock_file object try_merge_strategy(): remove redundant lock_file allocation struct lock_file: declare some fields volatile lockfile: avoid transitory invalid states git_config_set_multivar_in_file(): avoid call to rollback_lock_file() dump_marks(): remove a redundant call to rollback_lock_file() api-lockfile: document edge cases commit_lock_file(): rollback lock file on failure to rename ...	2014-10-14 10:49:45 -07:00
Junio C Hamano	f0d8900175	Merge branch 'sp/stream-clean-filter' When running a required clean filter, we do not have to mmap the original before feeding the filter. Instead, stream the file contents directly to the filter and process its output. * sp/stream-clean-filter: sha1_file: don't convert off_t to size_t too early to avoid potential die() convert: stream from fd to required clean filter to reduce used address space copy_fd(): do not close the input file descriptor mmap_limit: introduce GIT_MMAP_LIMIT to allow testing expected mmap size memory_limit: use git_env_ulong() to parse GIT_ALLOC_LIMIT config.c: add git_env_ulong() to parse environment variable convert: drop arguments other than 'path' from would_convert_to_git()	2014-10-08 13:05:32 -07:00
Michael Haggerty	697cc8efd9	lockfile.h: extract new header file for the functions in lockfile.c Move the interface declaration for the functions in lockfile.c from cache.h to a new file, lockfile.h. Add #includes where necessary (and remove some redundant includes of cache.h by files that already include builtin.h). Move the documentation of the lock_file state diagram from lockfile.c to the new header file. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-01 13:56:14 -07:00
Michael Haggerty	ec38b4e482	get_locked_file_path(): new function Add a function to return the path of the file that is locked by a lock_file object. This reduces the knowledge that callers have to have about the lock_file layout. Suggested-by: Ronnie Sahlberg <sahlberg@google.com> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-01 13:53:54 -07:00
Michael Haggerty	47ba4662bf	lockfile: rename LOCK_NODEREF to LOCK_NO_DEREF This makes it harder to misread the name as LOCK_NODE_REF. Suggested-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-01 13:53:28 -07:00
Michael Haggerty	751bacedaa	commit_lock_file_to(): refactor a helper out of commit_lock_file() commit_locked_index(), when writing to an alternate index file, duplicates (poorly) the code in commit_lock_file(). And anyway, it shouldn't have to know so much about the internal workings of lockfile objects. So extract a new function commit_lock_file_to() that does the work common to the two functions, and call it from both commit_lock_file() and commit_locked_index(). Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-01 13:52:06 -07:00
Michael Haggerty	cf6950d3bf	lockfile: change lock_file::filename into a strbuf For now, we still make sure to allocate at least PATH_MAX characters for the strbuf because resolve_symlink() doesn't know how to expand the space for its return value. (That will be fixed in a moment.) Another alternative would be to just use a strbuf as scratch space in lock_file() but then store a pointer to the naked string in struct lock_file. But lock_file objects are often reused. By reusing the same strbuf, we can avoid having to reallocate the string most times when a lock_file object is reused. Helped-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-01 13:50:01 -07:00
Michael Haggerty	2091c5062c	struct lock_file: declare some fields volatile The function remove_lock_file_on_signal() is used as a signal handler. It is not realistic to make the signal handler conform strictly to the C standard, which is very restrictive about what a signal handler is allowed to do. But let's increase the likelihood that it will work: The lock_file_list global variable and several fields from struct lock_file are used by the signal handler. Declare those values "volatile" to (1) force the main process to write the values to RAM promptly, and (2) prevent updates to these fields from being reordered in a way that leaves an opportunity for a jump to the signal handler while the object is in an inconsistent state. We don't mark the filename field volatile because that would prevent the use of strcpy(), and it is anyway unlikely that a compiler re-orders a strcpy() call across other expressions. So in practice it should be possible to get away without "volatile" in the "filename" case. Suggested-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-01 13:49:00 -07:00
Michael Haggerty	707103fdfd	lockfile: avoid transitory invalid states Because remove_lock_file() can be called any time by the signal handler, it is important that any lock_file objects that are in the lock_file_list are always in a valid state. And since lock_file objects are often reused (but are never removed from lock_file_list), that means we have to be careful whenever mutating a lock_file object to always keep it in a well-defined state. This was formerly not the case, because part of the state was encoded by setting lk->filename to the empty string vs. a valid filename. It is wrong to assume that this string can be updated atomically; for example, even strcpy(lk->filename, value) is unsafe. But the old code was even more reckless; for example, strcpy(lk->filename, path); if (!(flags & LOCK_NODEREF)) resolve_symlink(lk->filename, max_path_len); strcat(lk->filename, ".lock"); During the call to resolve_symlink(), lk->filename contained the name of the file that was being locked, not the name of the lockfile. If a signal were raised during that interval, then the signal handler would have deleted the valuable file! We could probably continue to use the filename field to encode the state by being careful to write characters 1..N-1 of the filename first, and then overwrite the NUL at filename[0] with the first character of the filename, but that would be awkward and error-prone. So, instead of using the filename field to determine whether the lock_file object is active, add a new field "lock_file::active" for this purpose. Be careful to set this field only when filename really contains the name of a file that should be deleted on cleanup. Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-01 13:48:59 -07:00
Michael Haggerty	7108ad232f	cache.h: define constants LOCK_SUFFIX and LOCK_SUFFIX_LEN There are a few places that use these values, so define constants for them. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-01 13:45:11 -07:00
Michael Haggerty	e197c21807	unable_to_lock_die(): rename function from unable_to_lock_index_die() This function is used for other things besides the index, so rename it accordingly. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Reviewed-by: Ronnie Sahlberg <sahlberg@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-01 13:38:38 -07:00
Junio C Hamano	102edda4df	Merge branch 'ta/config-add-to-empty-or-true-fix' into maint "git config --add section.var val" used to lose existing section.var whose value was an empty string. * ta/config-add-to-empty-or-true-fix: config: avoid a funny sentinel value "a^" make config --add behave correctly for empty and NULL values	2014-09-29 22:10:25 -07:00
Junio C Hamano	1c2ea2cdc0	Merge branch 'rs/realloc-array' Code cleanup. * rs/realloc-array: use REALLOC_ARRAY for changing the allocation size of arrays add macro REALLOC_ARRAY	2014-09-26 14:39:45 -07:00
Junio C Hamano	69a5bbbbfa	Merge branch 'jk/write-packed-refs-via-stdio' Optimize the code path to write out the packed-refs file, which especially matters in a repository with a large number of refs. * jk/write-packed-refs-via-stdio: refs: write packed_refs file using stdio	2014-09-26 14:39:42 -07:00
Junio C Hamano	4daf5c8643	Merge branch 'ta/config-add-to-empty-or-true-fix' "git config --add section.var val" used to lose existing section.var whose value was an empty string. * ta/config-add-to-empty-or-true-fix: config: avoid a funny sentinel value "a^" make config --add behave correctly for empty and NULL values	2014-09-19 11:38:40 -07:00
Junio C Hamano	9ff700ebac	Merge branch 'jk/commit-author-parsing' Code clean-up. * jk/commit-author-parsing: determine_author_info(): copy getenv output determine_author_info(): reuse parsing functions date: use strbufs in date-formatting functions record_author_date(): use find_commit_header() record_author_date(): fix memory leak on malformed commit commit: provide a function to find a header in a buffer	2014-09-19 11:38:33 -07:00
Junio C Hamano	ceeacc501b	Merge branch 'bb/date-iso-strict' "log --date=iso" uses a slight variant of ISO 8601 format that is made more human readable. A new "--date=iso-strict" option gives datetime output that is more strictly conformant. * bb/date-iso-strict: pretty: provide a strict ISO 8601 date format	2014-09-19 11:38:32 -07:00
René Scharfe	2756ca4347	use REALLOC_ARRAY for changing the allocation size of arrays Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-09-18 09:13:42 -07:00
Jeff King	c1063be2a3	config: avoid a funny sentinel value "a^" Introduce CONFIG_REGEX_NONE as a more explicit sentinel value to say "we do not want to replace any existing entry" and use it in the implementation of "git config --add". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-09-11 16:33:54 -07:00
Junio C Hamano	7f346e9d73	Merge branch 'ta/config-set-1' Use the new caching config-set API in git_config() calls. * ta/config-set-1: add tests for `git_config_get_string_const()` add a test for semantic errors in config files rewrite git_config() to use the config-set API config: add `git_die_config()` to the config-set API change `git_config()` return value to void add line number and file name info to `config_set` config.c: fix accuracy of line number in errors config.c: mark error and warnings strings for translation	2014-09-11 10:33:25 -07:00
Jeff King	9540ce5030	refs: write packed_refs file using stdio We write each line of a new packed-refs file individually using a write() syscall (and sometimes 2, if the ref is peeled). Since each line is only about 50-100 bytes long, this creates a lot of system call overhead. We can instead open a stdio handle around our descriptor and use fprintf to write to it. The extra buffering is not a problem for us, because nobody will read our new packed-refs file until we call commit_lock_file (by which point we have flushed everything). On a pathological repository with 8.5 million refs, this dropped the time to run `git pack-refs` from 20s to 6s. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-09-10 10:58:32 -07:00
Junio C Hamano	56f214e071	Merge branch 'ta/config-set' Add in-core caching layer to let us avoid reading the same configuration files number of times. * ta/config-set: test-config: add tests for the config_set API add `config_set` API for caching config-like files	2014-09-02 13:24:18 -07:00
Junio C Hamano	1d8a6f6929	Merge branch 'mm/config-edit-global' Start "git config --edit --global" from a skeletal per-user configuration file contents, instead of a total blank, when the user does not already have any. This immediately reduces the need for a later "Have you forgotten setting core.user?" and we can add more to the template as we gain more experience. * mm/config-edit-global: commit: advertise config --global --edit on guessed identity home_config_paths(): let the caller ignore xdg path config --global --edit: create a template file if needed	2014-09-02 13:23:20 -07:00
Junio C Hamano	c518279c0e	Merge branch 'jc/reopen-lock-file' There are cases where you lock and open to write a file, close it to show the updated contents to external processes, and then have to update the file again while still holding the lock, but the lockfile API lacked support for such an access pattern. * jc/reopen-lock-file: lockfile: allow reopening a closed but still locked file	2014-09-02 13:20:13 -07:00
Beat Bolli	466fb6742d	pretty: provide a strict ISO 8601 date format Git's "ISO" date format does not really conform to the ISO 8601 standard due to small differences, and it cannot be parsed by ISO 8601-only parsers, e.g. those of XML toolchains. The output from "--date=iso" deviates from ISO 8601 in these ways: - a space instead of the `T` date/time delimiter - a space between time and time zone - no colon between hours and minutes of the time zone Add a strict ISO 8601 date format for displaying committer and author dates. Use the '%aI' and '%cI' format specifiers and add '--date=iso-strict' or '--date=iso8601-strict' date format names. See http://thread.gmane.org/gmane.comp.version-control.git/255879 and http://thread.gmane.org/gmane.comp.version-control.git/52414/focus=52585 for discussion. Signed-off-by: Beat Bolli <bbolli@ewanet.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-29 12:37:02 -07:00
Steffen Prohaska	23b0c4782e	config.c: add git_env_ulong() to parse environment variable The new function parses an integeral value that fits in unsigned long in human readable form, i.e. possibly with unit suffix, e.g. 10k = 10240, etc., from an environment variable. Parsing of GIT_MMAP_LIMIT and GIT_ALLOC_LIMIT will use it in later patches. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-28 10:24:54 -07:00
Jeff King	c33ddc2e33	date: use strbufs in date-formatting functions Many of the date functions write into fixed-size buffers. This is a minor pain, as we have to take special precautions, and frequently end up copying the result into a strbuf or heap-allocated buffer anyway (for which we sometimes use strcpy!). Let's instead teach parse_date, datestamp, etc to write to a strbuf. The obvious downside is that we might need to perform a heap allocation where we otherwise would not need to. However, it turns out that the only two new allocations required are: 1. In test-date.c, where we don't care about efficiency. 2. In determine_author_info, which is not performance critical (and where the use of a strbuf will help later refactoring). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-27 10:32:56 -07:00
Tanay Abhra	155ef25f12	rewrite git_config() to use the config-set API Of all the functions in `git_config*()` family, `git_config()` has the most invocations in the whole code base. Each `git_config()` invocation causes config file rereads which can be avoided using the config-set API. Use the config-set API to rewrite `git_config()` to use the config caching layer to avoid config file rereads on each invocation during a git process lifetime. First invocation constructs the cache, and after that for each successive invocation, `git_config()` feeds values from the config cache instead of rereading the configuration files. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-07 11:41:10 -07:00
Tanay Abhra	5a80e97c82	config: add `git_die_config()` to the config-set API Add `git_die_config` that dies printing the line number and the file name of the highest priority value for the configuration variable `key`. A custom error message is also printed before dying, specified by the caller, which can be skipped if `err` argument is set to NULL. It has usage in non-callback based config value retrieval where we can raise an error and die if there is a semantic error. For example, if (!git_config_get_value(key, &value)){ if (!strcmp(value, "foo")) git_config_die(key, "value: `%s` is illegal", value); else /* do work */ } Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-07 11:40:25 -07:00
Tanay Abhra	aace438502	change `git_config()` return value to void Currently `git_config()` returns an integer signifying an error code. During rewrites of the function most of the code was shifted to `git_config_with_options()`. `git_config_with_options()` normally returns positive values if its `config_source` parameter is set as NULL, as most errors are fatal, and non-fatal potential errors are guarded by "if" statements that are entered only when no error is possible. Still a negative value can be returned in case of race condition between `access_or_die()` & `git_config_from_file()`. Also, all callers of `git_config()` ignore the return value except for one case in branch.c. Change `git_config()` return value to void and make it die if it receives a negative value from `git_config_with_options()`. Original-patch-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-07 11:40:17 -07:00
Tanay Abhra	3df8fd625f	add line number and file name info to `config_set` Store file name and line number for each key-value pair in the cache during parsing of the configuration files. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-07 11:38:50 -07:00
Tanay Abhra	3c8687a73e	add `config_set` API for caching config-like files Currently `git_config()` uses a callback mechanism and file rereads for config values. Due to this approach, it is not uncommon for the config files to be parsed several times during the run of a git program, with different callbacks picking out different variables useful to themselves. Add a `config_set`, that can be used to construct an in-memory cache for config-like files that the caller specifies (i.e., files like `.gitmodules`, `~/.gitconfig` etc.). Add two external functions `git_configset_get_value` and `git_configset_get_value_multi` for querying from the config sets. `git_configset_get_value` follows `last one wins` semantic (i.e. if there are multiple matches for the queried key in the files of the configset the value returned will be the last entry in `value_list`). `git_configset_get_value_multi` returns a list of values sorted in order of increasing priority (i.e. last match will be at the end of the list). Add type specific query functions like `git_configset_get_bool` and similar. Add a default `config_set`, `the_config_set` to cache all key-value pairs read from usual config files (repo specific .git/config, user wide ~/.gitconfig, XDG config and the global /etc/gitconfig). `the_config_set` is populated using `git_config()`. Add two external functions `git_config_get_value` and `git_config_get_value_multi` for querying in a non-callback manner from `the_config_set`. Also, add type specific query functions that are implemented as a thin wrapper around the `config_set` API. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-29 14:29:56 -07:00
Jeff King	5de7f500c1	alloc: factor out commit index We keep a static counter to set the commit index on newly allocated objects. However, since we also need to set the index on any_objects which are converted to commits, let's make the counter available as a public function. While we're moving it, let's make sure the counter is allocated as an unsigned integer to match the index field in "struct commit". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-28 10:14:33 -07:00
Matthieu Moy	9830534e40	config --global --edit: create a template file if needed When the user has no ~/.gitconfig file, git config --global --edit used to launch an editor on an nonexistant file name. Instead, create a file with a default content before launching the editor. The template contains only commented-out entries, to save a few keystrokes for the user. If the values are guessed properly, the user will only have to uncomment the entries. Advanced users teaching newbies can create a minimalistic configuration faster for newbies. Beginners reading a tutorial advising to run "git config --global --edit" as a first step will be slightly more guided for their first contact with Git. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-25 12:23:06 -07:00
Junio C Hamano	10b944b37b	Merge branch 'jk/alloc-commit-id' Make sure all in-core commit objects are assigned a unique number so that they can be annotated using the commit-slab API. * jk/alloc-commit-id: diff-tree: avoid lookup_unknown_object object_as_type: set commit index alloc: factor out commit index add object_as_type helper for casting objects parse_object_buffer: do not set object type move setting of object->type to alloc_* functions alloc: write out allocator definitions alloc.c: remove the alloc_raw_commit_node() function	2014-07-22 10:59:25 -07:00
Junio C Hamano	9f2de9c121	Merge branch 'kb/perf-trace' * kb/perf-trace: api-trace.txt: add trace API documentation progress: simplify performance measurement by using getnanotime() wt-status: simplify performance measurement by using getnanotime() git: add performance tracing for git's main() function to debug scripts trace: add trace_performance facility to debug performance issues trace: add high resolution timer function to debug performance issues trace: add 'file:line' to all trace output trace: move code around, in preparation to file:line output trace: add current timestamp to all trace output trace: disable additional trace output for unit tests trace: add infrastructure to augment trace output with additional info sha1_file: change GIT_TRACE_PACK_ACCESS logging to use trace API Documentation/git.txt: improve documentation of 'GIT_TRACE*' variables trace: improve trace performance trace: remove redundant printf format attribute trace: consistently name the format parameter trace: move trace declarations from cache.h to new trace.h	2014-07-22 10:59:19 -07:00
Junio C Hamano	19a249ba83	Merge branch 'rs/ref-transaction-0' Early part of the "ref transaction" topic. * rs/ref-transaction-0: refs.c: change ref_transaction_update() to do error checking and return status refs.c: remove the onerr argument to ref_transaction_commit update-ref: use err argument to get error from ref_transaction_commit refs.c: make update_ref_write update a strbuf on failure refs.c: make ref_update_reject_duplicates take a strbuf argument for errors refs.c: log_ref_write should try to return meaningful errno refs.c: make resolve_ref_unsafe set errno to something meaningful on error refs.c: commit_packed_refs to return a meaningful errno on failure refs.c: make remove_empty_directories always set errno to something sane refs.c: verify_lock should set errno to something meaningful refs.c: make sure log_ref_setup returns a meaningful errno refs.c: add an err argument to repack_without_refs lockfile.c: make lock_file return a meaningful errno on failurei lockfile.c: add a new public function unable_to_lock_message refs.c: add a strbuf argument to ref_transaction_commit for error logging refs.c: allow passing NULL to ref_transaction_free refs.c: constify the sha arguments for ref_transaction_create\|delete\|update refs.c: ref_transaction_commit should not free the transaction refs.c: remove ref_transaction_rollback	2014-07-21 11:18:37 -07:00
Junio C Hamano	c9831bb09d	Merge branch 'kb/path-max-must-go' * kb/path-max-must-go: cache.h: rename cache_def_free to cache_def_clear	2014-07-16 11:32:33 -07:00
Junio C Hamano	788cef81d4	Merge branch 'nd/split-index' An experiment to use two files (the base file and incremental changes relative to it) to represent the index to reduce I/O cost of rewriting a large index when only small part of the working tree changes. * nd/split-index: (32 commits) t1700: new tests for split-index mode t2104: make sure split index mode is off for the version test read-cache: force split index mode with GIT_TEST_SPLIT_INDEX read-tree: note about dropping split-index mode or index version read-tree: force split-index mode off on --index-output rev-parse: add --shared-index-path to get shared index path update-index --split-index: do not split if $GIT_DIR is read only update-index: new options to enable/disable split index mode split-index: strip pathname of on-disk replaced entries split-index: do not invalidate cache-tree at read time split-index: the reading part split-index: the writing part read-cache: mark updated entries for split index read-cache: save deleted entries in split index read-cache: mark new entries for split index read-cache: split-index mode read-cache: save index SHA-1 after reading entry.c: update cache_changed if refresh_cache is set in checkout_entry() cache-tree: mark istate->cache_changed on prime_cache_tree() cache-tree: mark istate->cache_changed on cache tree update ...	2014-07-16 11:25:40 -07:00
Junio C Hamano	93dcaea226	lockfile: allow reopening a closed but still locked file In some code paths (e.g. giving "add -i" to prepare the contents to be committed interactively inside "commit -p") where a caller takes a lock, writes the new content, give chance for others to use it while still holding the lock, and then releases the lock when all is done. As an extension, allow the caller to re-update an already closed file while still holding the lock (i.e. not yet committed) by re-opening the file, to be followed by updating the contents and then by the usual close_lock_file() or commit_lock_file(). This is necessary if we want to add code to rebuild the cache-tree and write the resulting index out after "add -i" returns the control to "commit -p", for example. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-14 13:05:37 -07:00
Ronnie Sahlberg	76d70dc0c6	refs.c: make resolve_ref_unsafe set errno to something meaningful on error Making errno when returning from resolve_ref_unsafe() meaningful, which should fix * a bug in lock_ref_sha1_basic, where it assumes EISDIR means it failed due to a directory being in the way Signed-off-by: Ronnie Sahlberg <sahlberg@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Michael Haggerty <mhagger@alum.mit.edu>	2014-07-14 11:54:42 -07:00
Ronnie Sahlberg	6af926e8bc	lockfile.c: add a new public function unable_to_lock_message Introducing a new unable_to_lock_message helper, which has nicer semantics than unable_to_lock_error and cleans up lockfile.c a little. Signed-off-by: Ronnie Sahlberg <sahlberg@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Michael Haggerty <mhagger@alum.mit.edu>	2014-07-14 11:54:40 -07:00
Jeff King	94d5a22cf6	alloc: factor out commit index We keep a static counter to set the commit index on newly allocated objects. However, since we also need to set the index on any_objects which are converted to commits, let's make the counter available as a public function. While we're moving it, let's make sure the counter is allocated as an unsigned integer to match the index field in "struct commit". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-13 18:59:05 -07:00
Karsten Blees	2a60839150	cache.h: rename cache_def_free to cache_def_clear Rename cache_def_free to cache_def_clear as it doesn't free the struct cache_def, but just clears its content. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-13 10:12:37 -07:00
Junio C Hamano	11def366e5	Merge branch 'kb/path-max-must-go' * kb/path-max-must-go: symlinks: remove PATH_MAX limitation	2014-07-10 11:27:47 -07:00
Karsten Blees	e7c7305300	symlinks: remove PATH_MAX limitation 'git checkout' fails if a directory is longer than PATH_MAX, because the lstat_cache in symlinks.c checks if the leading directory exists using PATH_MAX-bounded string operations. Remove the limitation by using strbuf instead. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-07 11:22:42 -07:00
Junio C Hamano	1881d2b88c	Merge branch 'ym/fix-opportunistic-index-update-race' into maint "git status", even though it is a read-only operation, tries to update the index with refreshed lstat(2) info to optimize future accesses to the working tree opportunistically, but this could race with a "read-write" operation that modify the index while it is running. Detect such a race and avoid overwriting the index. * ym/fix-opportunistic-index-update-race: read-cache.c: verify index file before we opportunistically update it wrapper.c: add xpread() similar to xread()	2014-06-25 11:49:48 -07:00
Jeremiah Mahler	ccdd4a0f3c	cleanup duplicate name_compare() functions We often represent our strings as a counted string, i.e. a pair of the pointer to the beginning of the string and its length, and the string may not be NUL terminated to that length. To compare a pair of such counted strings, unpack-trees.c and read-cache.c implement their own name_compare() functions identically. In addition, the cache_name_compare() function in read-cache.c is nearly identical. The only difference is when one string is the prefix of the other string, in which case name_compare() returns -1/+1 to show which one is longer, and cache_name_compare() returns the difference of the lengths to show the same information. Unify these three functions by using the implementation from cache_name_compare(). This does not make any difference to the existing and future callers, as they must be paying attention only to the sign of the returned value (and not the magnitude) because the original implementations of these two functions return values returned by memcmp(3) when the one string is not a prefix of the other string, and the only thing memcmp(3) guarantees its callers is the sign of the returned value, not the magnitude. Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-20 10:12:14 -07:00
Karsten Blees	5991a55c54	trace: move trace declarations from cache.h to new trace.h Also include direct dependencies (strbuf.h and git-compat-util.h for __attribute__) so that trace.h can be used independently of cache.h, e.g. in test programs. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-17 09:37:47 -07:00
Junio C Hamano	b83163643b	Merge branch 'sk/windows-unc-path' * sk/windows-unc-path: Windows: allow using UNC path for git repository	2014-06-16 10:07:03 -07:00
Nguyễn Thái Ngọc Duy	3e52f70b15	t1700: new tests for split-index mode Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 11:49:42 -07:00
Nguyễn Thái Ngọc Duy	c18b80a0e8	update-index: new options to enable/disable split index mode If you have a large work tree but only make changes in a subset, then $GIT_DIR/index's size should be stable after a while. If you change branches that touch something else, $GIT_DIR/index's size may grow large that it becomes as slow as the unified index. Do --split-index again occasionally to force all changes back to the shared index and keep $GIT_DIR/index small. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 11:49:41 -07:00
Nguyễn Thái Ngọc Duy	b3c96fb158	split-index: strip pathname of on-disk replaced entries We know the positions of replaced entries via the replace bitmap in "link" extension, so the "name" path does not have to be stored (it's still in the shared index). With this, we also have a way to distinguish additions vs replacements at load time and can catch broken "link" extensions. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 11:49:41 -07:00
Nguyễn Thái Ngọc Duy	ce7c614bce	split-index: do not invalidate cache-tree at read time We are sure that after merge_base_index() is done. cache-tree can still be used with the final index. So don't destroy cache tree. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 11:49:41 -07:00
Nguyễn Thái Ngọc Duy	078a58e825	read-cache: mark updated entries for split index The large part of this patch just follows CE_ENTRY_CHANGED marks. replace_index_entry() is updated to update split_index->base->cache[] as well so base->cache[] does not reference to a freed entry. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 11:49:40 -07:00
Nguyễn Thái Ngọc Duy	5fc2fc8fa2	read-cache: split-index mode This split-index mode is designed to keep write cost proportional to the number of changes the user has made, not the size of the work tree. (Read cost is another matter, to be dealt separately.) This mode stores index info in a pair of $GIT_DIR/index and $GIT_DIR/sharedindex.<SHA-1>. sharedindex is large and unchanged over time while "index" is smaller and updated often. Format details are in index-format.txt, although not everything is implemented in this patch. Shared indexes are not automatically removed, because it's unclear if the shared index is needed by any (even temporary) indexes by just looking at it. After a while you'll collect stale shared indexes. The good news is one shared index is useable for long, until $GIT_DIR/index becomes too big and sluggish that the new shared index must be created. The safest way to clean shared indexes is to turn off split index mode, so shared files are all garbage, delete them all, then turn on split index mode again. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 11:49:39 -07:00
Nguyễn Thái Ngọc Duy	e93021b20a	read-cache: save index SHA-1 after reading Also update SHA-1 after writing. If we do not do that, the second read_index() will see "initialized" variable already set and not read .git/index again, which is fine, except istate->sha1 now has a stale value. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 11:49:39 -07:00
Nguyễn Thái Ngọc Duy	d4a2024aef	entry.c: update cache_changed if refresh_cache is set in checkout_entry() Other fill_stat_cache_info() is on new entries, which should set CE_ENTRY_ADDED in cache_changed, so we're safe. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 11:49:39 -07:00
Nguyễn Thái Ngọc Duy	a5400efe29	cache-tree: mark istate->cache_changed on cache tree invalidation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 11:49:39 -07:00
Nguyễn Thái Ngọc Duy	6c306a34ee	resolve-undo: be specific what part of the index has changed Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 11:49:38 -07:00
Nguyễn Thái Ngọc Duy	e636a7b4d0	read-cache: be specific what part of the index has changed cache entry additions, removals and modifications are separated out. The rest of changes are still in the catch-all flag SOMETHING_CHANGED, which would be more specific later. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 11:49:38 -07:00
Nguyễn Thái Ngọc Duy	ce51bf09f8	read-cache: store in-memory flags in the first 12 bits of ce_flags We're running out of room for in-memory flags. But since `b60e188` (Strip namelen out of ce_flags into a ce_namelen field - 2012-07-11), we copy the namelen (first 12 bits) to ce_namelen field. So those bits are free to use. Just make sure we do not accidentally write any in-memory flags back. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 11:49:38 -07:00
Nguyễn Thái Ngọc Duy	626f35c893	read-cache: relocate and unexport commit_locked_index() This function is now only used by write_locked_index(). Move it to read-cache.c (because read-cache.c will need to be aware of alternate_index_output later) and unexport it. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 11:49:38 -07:00
Nguyễn Thái Ngọc Duy	03b8664772	read-cache: new API write_locked_index instead of write_index/write_cache Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 11:49:10 -07:00
Cezary Zawadka	c2369bdf7f	Windows: allow using UNC path for git repository [efl: moved MinGW-specific part to compat/] [jes: fixed compilation on non-Windows] Eric Sunshine fixed mingw_offset_1st_component() to return consistently "foo" for UNC "//machine/share/foo", cf http://groups.google.com/group/msysgit/browse_thread/thread/c0af578549b5dda0 Author: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Cezary Zawadka <czawadka@gmail.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Stepan Kasal <kasal@ucw.cz> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-10 13:30:04 -07:00
Junio C Hamano	1e2600dd6a	Merge branch 'nd/status-auto-comment-char' * nd/status-auto-comment-char: commit: allow core.commentChar=auto for character auto selection config: be strict on core.commentChar	2014-06-06 11:36:10 -07:00
Junio C Hamano	610a14f643	Merge branch 'jk/squelch-compiler-warning-from-funny-error-macro' * jk/squelch-compiler-warning-from-funny-error-macro: let clang use the constant-return error() macro inline constant return from error() function	2014-06-06 11:21:36 -07:00
Junio C Hamano	e1857af923	Merge branch 'jk/commit-date-approxidate' * jk/commit-date-approxidate: commit: accept more date formats for "--date" commit: print "Date" line when the user has set date pretty: make show_ident_date public commit: use split_ident_line to compare author/committer	2014-06-03 12:06:46 -07:00
Junio C Hamano	9af098c29b	Merge branch 'ym/fix-opportunistic-index-update-race' Read-only operations such as "git status" that internally refreshes the index write out the refreshed index to the disk to optimize future accesses to the working tree, but this could race with a "read-write" operation that modify the index while it is running. Detect such a race and avoid overwriting the index. Duy raised a good point that we may need to do the same for the normal writeout codepath, not just the "opportunistic" update codepath. While that is true, nobody sane would be running two simultaneous operations that are clearly write-oriented competing with each other against the same index file. So in that sense that can be done as a less urgent follow-up for this topic. * ym/fix-opportunistic-index-update-race: read-cache.c: verify index file before we opportunistically update it wrapper.c: add xpread() similar to xread()	2014-06-03 12:06:41 -07:00
Junio C Hamano	8eaf517835	Merge branch 'ks/tree-diff-nway' Instead of running N pair-wise diff-trees when inspecting a N-parent merge, find the set of paths that were touched by walking N+1 trees in parallel. These set of paths can then be turned into N pair-wise diff-tree results to be processed through rename detections and such. And N=2 case nicely degenerates to the usual 2-way diff-tree, which is very nice. * ks/tree-diff-nway: mingw: activate alloca combine-diff: speed it up, by using multiparent diff tree-walker directly tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Portable alloca for Git tree-diff: reuse base str(buf) memory on sub-tree recursion tree-diff: no need to call "full" diff_tree_sha1 from show_path() tree-diff: rework diff_tree interface to be sha1 based tree-diff: diff_tree() should now be static tree-diff: remove special-case diff-emitting code for empty-tree cases tree-diff: simplify tree_entry_pathcmp tree-diff: show_path prototype is not needed anymore tree-diff: rename compare_tree_entry -> tree_entry_pathcmp tree-diff: move all action-taking code out of compare_tree_entry() tree-diff: don't assume compare_tree_entry() returns -1,0,1 tree-diff: consolidate code for emitting diffs and recursion in one place tree-diff: show_tree() is not needed tree-diff: no need to pass match to skip_uninteresting() tree-diff: no need to manually verify that there is no mode change for a path combine-diff: move changed-paths scanning logic into its own function combine-diff: move show_log_first logic/action out of paths scanning	2014-06-03 12:06:40 -07:00
Nguyễn Thái Ngọc Duy	84c9dc2c5a	commit: allow core.commentChar=auto for character auto selection When core.commentChar is "auto", the comment char starts with '#' as in default but if it's already in the prepared message, find another char in a small subset. This should stop surprises because git strips some lines unexpectedly. Note that git is not smart enough to recognize '#' as the comment char in custom templates and convert it if the final comment char is different. It thinks '#' lines in custom templates as part of the commit message. So don't use this with custom templates. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-05-19 13:37:25 -07:00
Jeff King	ff0a80af72	let clang use the constant-return error() macro Commit `e208f9c` converted error() into a macro to make its constant return value more apparent to calling code. Commit `5ded807` prevents us using this macro with clang, since clang's -Wunused-value is smart enough to realize that the constant "-1" is useless in some contexts. However, since the last commit puts the constant behind an inline function call, this is enough to prevent the -Wunused-value warning on both modern gcc and clang. So we can now re-enable the macro when compiling with clang. Tested with clang 3.3, 3.4, and 3.5. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-05-06 15:30:40 -07:00
Jeff King	87fe5df365	inline constant return from error() function Commit `e208f9c` introduced a macro to turn error() calls into: (error(), -1) to make the constant return value more visible to the calling code (and thus let the compiler make better decisions about the code). This works well for code like: return error(...); but the "-1" is superfluous in code that just calls error() without caring about the return value. In older versions of gcc, that was fine, but gcc 4.9 complains with -Wunused-value. We can work around this by encapsulating the constant return value in a static inline function, as gcc specifically avoids complaining about unused function returns unless the function has been specifically marked with the warn_unused_result attribute. We also use the same trick for config_error_nonbool and opterror, which learned the same error technique in `a469a10`. Reported-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-05-06 15:30:38 -07:00
Jeff King	d105324655	pretty: make show_ident_date public We use this function internally to format "Date" lines in commit logs, but other parts of the code will want it, too. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-05-02 14:13:00 -07:00
Yiannis Marangos	426ddeead6	read-cache.c: verify index file before we opportunistically update it Before we proceed to opportunistically update the index (often done by an otherwise read-only operation like "git status" and "git diff" that internally refreshes the index), we must verify that the current index file is the same as the one that we read earlier before we took the lock on it, in order to avoid a possible race. In the example below git-status does "opportunistic update" and git-rebase updates the index, but the race can happen in general. 1. process A calls git-rebase (or does anything that uses the index) 2. process A applies 1st commit 3. process B calls git-status (or does anything that updates the index) 4. process B reads index 5. process A applies 2nd commit 6. process B takes the lock, then overwrites process A's changes. 7. process A applies 3rd commit As an end result the 3rd commit will have a revert of the 2nd commit. When process B takes the lock, it needs to make sure that the index hasn't changed since step 4. Signed-off-by: Yiannis Marangos <yiannis.marangos@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-10 12:27:58 -07:00
Kirill Smelkov	72441af7c4	tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Previously diff_tree(), which is now named ll_diff_tree_sha1(), was generating diff_filepair(s) for two trees t1 and t2, and that was usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes a commit introduces. In Git, however, we have fundamentally built flexibility in that a commit can have many parents - 1 for a plain commit, 2 for a simple merge, but also more than 2 for merging several heads at once. For merges there is a so called combine-diff, which shows diff, a merge introduces by itself, omitting changes done by any parent. That works through first finding paths, that are different to all parents, and then showing generalized diff, with separate columns for +/- for each parent. The code lives in combine-diff.c . There is an impedance mismatch, however, in that a commit could generally have any number of parents, and that while diffing trees, we divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there is no special casing for multiple parents commits in e.g. revision-walker . That impedance mismatch hurts performance badly for generating combined diffs - in "combine-diff: optimize combine_diff_path sets intersection" I've already removed some slowness from it, but from the timings provided there, it could be seen, that combined diffs still cost more than an order of magnitude more cpu time, compared to diff for usual commits, and that would only be an optimistic estimate, if we take into account that for e.g. linux.git there is only one merge for several dozens of plain commits. That slowness comes from the fact that currently, while generating combined diff, a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). That's because at present, to compute combine-diff, for first finding paths, that "every parent touches", we use the following combine-diff property/definition: D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn) (w.r.t. paths) where D(A,P1...Pn) is combined diff between commit A, and parents Pi and D(A,Pi) is usual two-tree diff Pi..A So if any of that D(A,Pi) is huge, tracting 1 n-parent combine-diff as n 1-parent diffs and intersecting results will be slow. And usually, for linux.git and other topic-based workflows, that D(A,P2) is huge, because, if merge-base of A and P2, is several dozens of merges (from A, via first parent) below, that D(A,P2) will be diffing sum of merges from several subsystems to 1 subsystem. The solution is to avoid computing n 1-parent diffs, and to find changed-to-all-parents paths via scanning A's and all Pi's trees simultaneously, at each step comparing their entries, and based on that comparison, populate paths result, and deduce we could skip recursing into subdirectories, if at least for 1 parent, sha1 of that dir tree is the same as in A. That would save us from doing significant amount of needless work. Such approach is very similar to what diff_tree() does, only there we deal with scanning only 2 trees simultaneously, and for n+1 tree, the logic is a bit more complex: D(T,P1...Pn) calculation scheme ------------------------------- D(T,P1...Pn) = D(T,P1) ^ ... ^ D(T,Pn) (regarding resulting paths set) D(T,Pj) - diff between T..Pj D(T,P1...Pn) - combined diff from T to parents P1,...,Pn We start from all trees, which are sorted, and compare their entries in lock-step: T P1 Pn - - - \|t\| \|p1\| \|pn\| \|-\| \|--\| ... \|--\| imin = argmin(p1...pn) \| \| \| \| \| \| \|-\| \|--\| \|--\| \|.\| \|. \| \|. \| . . . . . . at any time there could be 3 cases: 1) t < p[imin]; 2) t > p[imin]; 3) t = p[imin]. Schematic deduction of what every case means, and what to do, follows: 1) t < p[imin] -> ∀j t ∉ Pj -> "+t" ∈ D(T,Pj) -> D += "+t"; t↓ 2) t > p[imin] 2.1) ∃j: pj > p[imin] -> "-p[imin]" ∉ D(T,Pj) -> D += ø; ∀ pi=p[imin] pi↓ 2.2) ∀i pi = p[imin] -> pi ∉ T -> "-pi" ∈ D(T,Pi) -> D += "-p[imin]"; ∀i pi↓ 3) t = p[imin] 3.1) ∃j: pj > p[imin] -> "+t" ∈ D(T,Pj) -> only pi=p[imin] remains to investigate 3.2) pi = p[imin] -> investigate δ(t,pi) \| \| v 3.1+3.2) looking at δ(t,pi) ∀i: pi=p[imin] - if all != ø -> ⎧δ(t,pi) - if pi=p[imin] -> D += ⎨ ⎩"+t" - if pi>p[imin] in any case t↓ ∀ pi=p[imin] pi↓ ~ For comparison, here is how diff_tree() works: D(A,B) calculation scheme ------------------------- A B - - \|a\| \|b\| a < b -> a ∉ B -> D(A,B) += +a a↓ \|-\| \|-\| a > b -> b ∉ A -> D(A,B) += -b b↓ \| \| \| \| a = b -> investigate δ(a,b) a↓ b↓ \|-\| \|-\| \|.\| \|.\| . . . . ~~~~~~~~ This patch generalizes diff tree-walker to work with arbitrary number of parents as described above - i.e. now there is a resulting tree t, and some parents trees tp[i] i=[0..nparent). The generalization builds on the fact that usual diff D(A,B) is by definition the same as combined diff D(A,[B]), so if we could rework the code for common case and make it be not slower for nparent=1 case, usual diff(t1,t2) generation will not be slower, and multiparent diff tree-walker would greatly benefit generating combine-diff. What we do is as follows: 1) diff tree-walker ll_diff_tree_sha1() is internally reworked to be a paths generator (new name diff_tree_paths()), with each generated path being `struct combine_diff_path` with info for path, new sha1,mode and for every parent which sha1,mode it was in it. 2) From that info, we can still generate usual diff queue with struct diff_filepairs, via "exporting" generated combine_diff_path, if we know we run for nparent=1 case. (see emit_diff() which is now named emit_diff_first_parent_only()) 3) In order for diff_can_quit_early(), which checks DIFF_OPT_TST(opt, HAS_CHANGES)) to work, that exporting have to be happening not in bulk, but incrementally, one diff path at a time. For such consumers, there is a new callback in diff_options introduced: ->pathchange(opt, struct combine_diff_path ) which, if set to !NULL, is called for every generated path. (see new compat ll_diff_tree_sha1() wrapper around new paths generator for setup) 4) The paths generation itself, is reworked from previous ll_diff_tree_sha1() code according to "D(A,P1...Pn) calculation scheme" provided above: On the start we allocate [nparent] arrays in place what was earlier just for one parent tree. then we just generalize loops, and comparison according to the algorithm. Some notes(): 1) alloca(), for small arrays, is used for "runs not slower for nparent=1 case than before" goal - if we change it to xmalloc()/free() the timings get ~1% worse. For alloca() we use just-introduced xalloca/xalloca_free compatibility wrappers, so it should not be a portability problem. 2) For every parent tree, we need to keep a tag, whether entry from that parent equals to entry from minimal parent. For performance reasons I'm keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ. Not doing so, we'd need to alloca another [nparent] array, which hurts performance. 3) For emitted paths, memory could be reused, if we know the path was processed via callback and will not be needed later. We use efficient hand-made realloc-style path_appendnew(), that saves us from ~1-1.5% of potential additional slowdown. 4) goto(s) are used in several places, as the code executes a little bit faster with lowered register pressure. Also - we should now check for FIND_COPIES_HARDER not only when two entries names are the same, and their hashes are equal, but also for a case, when a path was removed from some of all parents having it. The reason is, if we don't, that path won't be emitted at all (see "a > xi" case), and we'll just skip it, and FIND_COPIES_HARDER wants all paths - with diff or without - to be emitted, to be later analyzed for being copies sources. The new check is only necessary for nparent >1, as for nparent=1 case xmin_eqtotal always =1 =nparent, and a path is always added to diff as removal. ~~~~~~~~ Timings for # without -c, i.e. testing only nparent=1 case `git log --raw --no-abbrev --no-renames` before and after the patch are as follows: navy.git linux.git v3.10..v3.11 before 0.611s 1.889s after 0.619s 1.907s slowdown 1.3% 0.9% This timings show we did no harm to usual diff(tree1,tree2) generation. From the table we can see that we actually did ~1% slowdown, but I think I've "earned" that 1% in the previous patch ("tree-diff: reuse base str(buf) memory on sub-tree recursion", HEAD~~) so for nparent=1 case, net timings stays approximately the same. The output also stayed the same. (*) If we revert 1)-4) to more usual techniques, for nparent=1 case, we'll get ~2-2.5% of additional slowdown, which I've tried to avoid, as "do no harm for nparent=1 case" rule. For linux.git, combined diff will run an order of magnitude faster and appropriate timings will be provided in the next commit, as we'll be taking advantage of the new diff tree-walker for combined-diff generation there. P.S. and combined diff is not some exotic/for-play-only stuff - for example for a program I write to represent Git archives as readonly filesystem, there is initial scan with `git log --reverse --raw --no-abbrev --no-renames -c` to extract log of what was created/changed when, as a result building a map {} sha1 -> in which commit (and date) a content was added that `-c` means also show combined diff for merges, and without them, if a merge is non-trivial (merges changes from two parents with both having separate changes to a file), or an evil one, the map will not be full, i.e. some valid sha1 would be absent from it. That case was my initial motivation for combined diffs speedup. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-07 14:40:46 -07:00
Junio C Hamano	b6de0c633e	Merge branch 'nd/tag-version-sort' Allow v1.9.0 sorted before v1.10.0 in "git tag --list" output. * nd/tag-version-sort: tag: support --sort=<spec>	2014-03-21 12:47:39 -07:00
Junio C Hamano	8aac6c97e8	Merge branch 'jk/commit-dates-parsing-fix' into maint Codepaths that parse timestamps in commit objects have been tightened. * jk/commit-dates-parsing-fix: show_ident_date: fix tz range check log: do not segfault on gmtime errors log: handle integer overflow in timestamps date: check date overflow against time_t fsck: report integer overflow in author timestamps t4212: test bogus timestamps with git-log	2014-03-18 14:04:01 -07:00
Junio C Hamano	6d011b8e3f	Merge branch 'bk/refresh-missing-ok-in-merge-recursive' into maint "merge-recursive" was broken in 1.7.7 era and stopped working in an empty (temporary) working tree, when there are renames involved. This has been corrected. * bk/refresh-missing-ok-in-merge-recursive: merge-recursive.c: tolerate missing files while refreshing index read-cache.c: extend make_cache_entry refresh flag with options read-cache.c: refactor --ignore-missing implementation t3030-merge-recursive: test known breakage with empty work tree	2014-03-18 14:02:38 -07:00
Junio C Hamano	3e30cb0fbf	Merge branch 'mh/replace-refs-variable-rename' * mh/replace-refs-variable-rename: Document some functions defined in object.c Add docstrings for lookup_replace_object() and do_lookup_replace_object() rename read_replace_refs to check_replace_refs	2014-03-14 14:27:06 -07:00
Junio C Hamano	060be00621	Merge branch 'mh/object-code-cleanup' * mh/object-code-cleanup: sha1_file.c: document a bunch of functions defined in the file sha1_file_name(): declare to return a const string find_pack_entry(): document last_found_pack replace_object: use struct members instead of an array	2014-03-14 14:26:29 -07:00
Junio C Hamano	3c83b080e4	Merge branch 'jk/commit-dates-parsing-fix' Tighten codepaths that parse timestamps in commit objects. * jk/commit-dates-parsing-fix: show_ident_date: fix tz range check log: do not segfault on gmtime errors log: handle integer overflow in timestamps date: check date overflow against time_t fsck: report integer overflow in author timestamps t4212: test bogus timestamps with git-log	2014-03-14 14:25:44 -07:00
Junio C Hamano	08f36302b5	Merge branch 'ks/config-file-stdin' "git config" learned to read from the standard input when "-" is given as the value to its "--file" parameter (attempting an operation to update the configuration in the standard input of course is rejected). * ks/config-file-stdin: config: teach "git config --file -" to read from the standard input config: change git_config_with_options() interface builtin/config.c: rename check_blob_write() -> check_write() config: disallow relative include paths from blobs	2014-03-14 14:24:40 -07:00
Junio C Hamano	053a6b1807	Merge branch 'jn/add-2.0-u-A-sans-pathspec' "git add -u" and "git add -A" without any pathspec is a tree-wide operation now, even when they are run in a subdirectory of the working tree.	2014-03-07 15:14:02 -08:00
Junio C Hamano	4c4ac4db2c	Merge branch 'nd/daemonize-gc' Allow running "gc --auto" in the background. * nd/daemonize-gc: gc: config option for running --auto in background daemon: move daemonize() to libgit.a	2014-03-05 15:06:39 -08:00
Michael Haggerty	1f91e79cf6	Add docstrings for lookup_replace_object() and do_lookup_replace_object() Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-28 13:17:56 -08:00
Nguyễn Thái Ngọc Duy	9ef176b55c	tag: support --sort=<spec> --sort=version:refname (or --sort=v:refname for short) sorts tags as if they are versions. --sort=-refname reverses the order (with or without ":version"). versioncmp() is copied from string/strverscmp.c in glibc commit ee9247c38a8def24a59eb5cfb7196a98bef8cfdc, reformatted to Git coding style. The implementation is under LGPL-2.1 and according to [1] I can relicense it to GPLv2. [1] http://www.gnu.org/licenses/gpl-faq.html#AllCompatibility Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-27 14:04:05 -08:00
Junio C Hamano	0f9e62e084	Merge branch 'jk/pack-bitmap' Borrow the bitmap index into packfiles from JGit to speed up enumeration of objects involved in a commit range without having to fully traverse the history. * jk/pack-bitmap: (26 commits) ewah: unconditionally ntohll ewah data ewah: support platforms that require aligned reads read-cache: use get_be32 instead of hand-rolled ntoh_l block-sha1: factor out get_be and put_be wrappers do not discard revindex when re-preparing packfiles pack-bitmap: implement optional name_hash cache t/perf: add tests for pack bitmaps t: add basic bitmap functionality tests count-objects: recognize .bitmap in garbage-checking repack: consider bitmaps when performing repacks repack: handle optional files created by pack-objects repack: turn exts array into array-of-struct repack: stop using magic number for ARRAY_SIZE(exts) pack-objects: implement bitmap writing rev-list: add bitmap mode to speed up object lists pack-objects: use bitmaps when packing objects pack-objects: split add_object_entry pack-bitmap: add support for bitmap indexes documentation: add documentation for the bitmap format ewah: compressed bitmap implementation ...	2014-02-27 14:01:48 -08:00
Junio C Hamano	8336832ad9	Merge branch 'nd/reset-intent-to-add' * nd/reset-intent-to-add: reset: support "--mixed --intent-to-add" mode	2014-02-27 14:01:40 -08:00
Junio C Hamano	cbaeafc325	Merge branch 'nd/submodule-pathspec-ending-with-slash' Allow "git cmd path/", when the 'path' is where a submodule is bound to the top-level working tree, to match 'path', despite the extra and unnecessary trailing slash. * nd/submodule-pathspec-ending-with-slash: clean: use cache_name_is_other() clean: replace match_pathspec() with dir_path_match() pathspec: pass directory indicator to match_pathspec_item() match_pathspec: match pathspec "foo/" against directory "foo" dir.c: prepare match_pathspec_item for taking more flags pathspec: rename match_pathspec_depth() to match_pathspec() pathspec: convert some match_pathspec_depth() to dir_path_match() pathspec: convert some match_pathspec_depth() to ce_path_match()	2014-02-27 14:01:15 -08:00
Junio C Hamano	156d6ed922	Merge branch 'bk/refresh-missing-ok-in-merge-recursive' Allow "merge-recursive" to work in an empty (temporary) working tree again when there are renames involved, correcting an old regression in 1.7.7 era. * bk/refresh-missing-ok-in-merge-recursive: merge-recursive.c: tolerate missing files while refreshing index read-cache.c: extend make_cache_entry refresh flag with options read-cache.c: refactor --ignore-missing implementation t3030-merge-recursive: test known breakage with empty work tree	2014-02-27 14:01:14 -08:00
Junio C Hamano	d637d1b9a8	Merge branch 'kb/fast-hashmap' Improvements to our hash table to get it to meet the needs of the msysgit fscache project, with some nice performance improvements. * kb/fast-hashmap: name-hash: retire unused index_name_exists() hashmap.h: use 'unsigned int' for hash-codes everywhere test-hashmap.c: drop unnecessary #includes .gitignore: test-hashmap is a generated file read-cache.c: fix memory leaks caused by removed cache entries builtin/update-index.c: cleanup update_one fix 'git update-index --verbose --again' output remove old hash.[ch] implementation name-hash.c: remove cache entries instead of marking them CE_UNHASHED name-hash.c: use new hash map implementation for cache entries name-hash.c: remove unreferenced directory entries name-hash.c: use new hash map implementation for directories diffcore-rename.c: use new hash map implementation diffcore-rename.c: simplify finding exact renames diffcore-rename.c: move code around to prepare for the next patch buitin/describe.c: use new hash map implementation add a hashtable implementation that supports O(1) removal submodule: don't access the .gitmodules cache entry after removing it	2014-02-27 14:01:09 -08:00
Michael Haggerty	d40d535b89	sha1_file.c: document a bunch of functions defined in the file Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Acked-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-24 16:01:11 -08:00

1 2 3 4 5 ...

1299 Commits