git-commit-vandalism

Author	SHA1	Message	Date
Junio C Hamano	0bbab7d2ab	Merge branch 'rs/lose-leak-pending' API clean-up around revision traversal. * rs/lose-leak-pending: commit: remove unused function clear_commit_marks_for_object_array() revision: remove the unused flag leak_pending checkout: avoid using the rev_info flag leak_pending bundle: avoid using the rev_info flag leak_pending bisect: avoid using the rev_info flag leak_pending object: add clear_commit_marks_all() ref-filter: use clear_commit_marks_many() in do_merge_filter() commit: use clear_commit_marks_many() in remove_redundant() commit: avoid allocation in clear_commit_marks_many()	2018-01-23 13:16:36 -08:00
Thomas Gummerer	a125a22334	read-cache: fix reading the shared index for other repos read_index_from() takes a path argument for the location of the index file. For reading the shared index in split index mode however it just ignores that path argument, and reads it from the gitdir of the current repository. This works as long as an index in the_repository is read. Once that changes, such as when we read the index of a submodule, or of a different working tree than the current one, the gitdir of the_repository will no longer contain the appropriate shared index, and git will fail to read it. For example t3007-ls-files-recurse-submodules.sh was broken with GIT_TEST_SPLIT_INDEX set in `188dce131f` ("ls-files: use repository object", 2017-06-22), and t7814-grep-recurse-submodules.sh was also broken in a similar manner, probably by introducing struct repository there, although I didn't track down the exact commit for that. `be489d02d2` ("revision.c: --indexed-objects add objects from all worktrees", 2017-08-23) breaks with split index mode in a similar manner, not erroring out when it can't read the index, but instead carrying on with pruning, without taking the index of the worktree into account. Fix this by passing an additional gitdir parameter to read_index_from, to indicate where it should look for and read the shared index from. read_cache_from() defaults to using the gitdir of the_repository. As it is mostly a convenience macro, having to pass get_git_dir() for every call seems overkill, and if necessary users can have more control by using read_index_from(). Helped-by: Brandon Williams <bmwill@google.com> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-19 10:36:34 -08:00
Stefan Beller	15af58c1ad	diffcore: add a pickaxe option to find a specific blob Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) One might be tempted to extend git-describe to also work with blobs, such that `git describe <blob-id>` gives a description as '<commit-ish>:<path>'. This was implemented at [2]; as seen by the sheer number of responses (>110), it turns out this is tricky to get right. The hard part to get right is picking the correct 'commit-ish' as that could be the commit that (re-)introduced the blob or the blob that removed the blob; the blob could exist in different branches. Junio hinted at a different approach of solving this problem, which this patch implements. Teach the diff machinery another flag for restricting the information to what is shown. For example: $ ./git log --oneline --find-object=v2.0.0:Makefile `b2feb64309` Revert the whole "ask curl-config" topic for now `47fbfded53` i18n: only extract comments marked with "TRANSLATORS:" we observe that the Makefile as shipped with 2.0 was appeared in v1.9.2-471-g47fbfded53 and in v2.0.0-rc1-5-gb2feb6430b. The reason why these commits both occur prior to v2.0.0 are evil merges that are not found using this new mechanism. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob [2] https://public-inbox.org/git/20171028004419.10139-1-sbeller@google.com/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-04 15:02:40 -08:00
Stefan Beller	cf63051ada	diff: introduce DIFF_PICKAXE_KINDS_MASK Currently the check whether to perform pickaxing is done via checking `diffopt->pickaxe`, which contains the command line argument that we want to pickaxe for. Soon we'll introduce a new type of pickaxing, that will not store anything in the `.pickaxe` field, so let's migrate the check to be dependent on pickaxe_opts. It is not enough to just replace the check for pickaxe by pickaxe_opts, because flags might be set, but pickaxing was not requested ('-i'). To cope with that, introduce a mask to check only for the bits indicating the modes of operation. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-04 15:02:40 -08:00
Stefan Beller	c1ddc4610c	diff: migrate diff_flags.pickaxe_ignore_case to a pickaxe_opts bit Currently flags for pickaxing are found in different places. Unify the flags into the `pickaxe_opts` field, which will contain any pickaxe related flags. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-04 15:02:40 -08:00
Junio C Hamano	556de1a8e3	Merge branch 'sb/describe-blob' "git describe" was taught to dig trees deeper to find a <commit-ish>:<path> that refers to a given blob object. * sb/describe-blob: builtin/describe.c: describe a blob builtin/describe.c: factor out describe_commit builtin/describe.c: print debug statements earlier builtin/describe.c: rename `oid` to avoid variable shadowing revision.h: introduce blob/tree walking in order of the commits list-objects.c: factor out traverse_trees_and_blobs t6120: fix typo in test name	2017-12-28 14:08:50 -08:00
René Scharfe	f1230fb5fc	revision: remove the unused flag leak_pending Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-28 13:50:05 -08:00
Jonathan Tan	df11e19648	rev-list: support termination at promisor objects Teach rev-list to support termination of an object traversal at any object from a promisor remote (whether one that the local repo also has, or one that the local repo knows about because it has another promisor object that references it). This will be used subsequently in gc and in the connectivity check used by fetch. For efficiency, if an object is referenced by a promisor object, and is in the local repo only as a non-promisor object, object traversal will not stop there. This is to avoid building the list of promisor object references. (In list-objects.c, the case where obj is NULL in process_blob() and process_tree() do not need to be changed because those happen only when there is a conflict between the expected type and the existing object. If the object doesn't exist, an object will be synthesized, which is fine.) Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-08 09:52:42 -08:00
Rafael Ascensão	65516f586b	log: add option to choose which refs to decorate When `log --decorate` is used, git will decorate commits with all available refs. While in most cases this may give the desired effect, under some conditions it can lead to excessively verbose output. Introduce two command line options, `--decorate-refs=<pattern>` and `--decorate-refs-exclude=<pattern>` to allow the user to select which refs are used in decoration. When "--decorate-refs=<pattern>" is given, only the refs that match the pattern are used in decoration. The refs that match the pattern when "--decorate-refs-exclude=<pattern>" is given, are never used in decoration. These options follow the same convention for mixing negative and positive patterns across the system, assuming that the inclusive default is to match all refs available. (1) if there is no positive pattern given, pretend as if an inclusive default positive pattern was given; (2) for each candidate, reject it if it matches no positive pattern, or if it matches any one of the negative patterns. The rules for what is considered a match are slightly different from the rules used elsewhere. Commands like `log --glob` assume a trailing '/' when glob chars are not present in the pattern. This makes it difficult to specify a single ref. On the other hand, commands like `describe --match --all` allow specifying exact refs, but do not have the convenience of allowing "shorthand refs" like 'refs/heads' or 'heads' to refer to 'refs/heads/'. The commands introduced in this patch consider a match if: (a) the pattern contains globs chars, and regular pattern matching returns a match. (b) the pattern does not contain glob chars, and ref '<pattern>' exists, or if ref exists under '<pattern>/' This allows both behaviours (allowing single refs and shorthand refs) yet remaining compatible with existent commands. Helped-by: Kevin Daudt <me@ikke.info> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Rafael Ascensão <rafa.almas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-22 13:18:59 +09:00
Stefan Beller	ce5b6f9be8	revision.h: introduce blob/tree walking in order of the commits The functionality to list tree objects in the order they were seen while traversing the commits will be used in one of the next commits, where we teach `git describe` to describe not only commits, but blobs, too. The change in list-objects.c is rather minimal as we'll be re-using the infrastructure put in place of the revision walking machinery. For example one could expect that add_pending_tree is not called, but rather commit->tree is directly passed to the tree traversal function. This however requires a lot more code than just emptying the queue containing trees after each commit. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-16 11:12:51 +09:00
Junio C Hamano	8cc633286a	Merge branch 'bw/diff-opt-impl-to-bitfields' A single-word "unsigned flags" in the diff options is being split into a structure with many bitfields. * bw/diff-opt-impl-to-bitfields: diff: make struct diff_flags members lowercase diff: remove DIFF_OPT_CLR macro diff: remove DIFF_OPT_SET macro diff: remove DIFF_OPT_TST macro diff: remove touched flags diff: add flag to indicate textconv was set via cmdline diff: convert flags to be stored in bitfields add, reset: use DIFF_OPT_SET macro to set a diff flag	2017-11-09 14:31:27 +09:00
Junio C Hamano	f4c214b529	Merge branch 'jk/revision-pruning-optim' Pathspec-limited revision traversal was taught not to keep finding unneeded differences once it knows two trees are different inside given pathspec. * jk/revision-pruning-optim: revision: quit pruning diff more quickly when possible	2017-11-06 14:24:26 +09:00
Brandon Williams	0d1e0e7801	diff: make struct diff_flags members lowercase Now that the flags stored in struct diff_flags are being accessed directly and not through macros, change all struct members from being uppercase to lowercase. This conversion is done using the following semantic patch: @@ expression E; @@ - E.RECURSIVE + E.recursive @@ expression E; @@ - E.TREE_IN_RECURSIVE + E.tree_in_recursive @@ expression E; @@ - E.BINARY + E.binary @@ expression E; @@ - E.TEXT + E.text @@ expression E; @@ - E.FULL_INDEX + E.full_index @@ expression E; @@ - E.SILENT_ON_REMOVE + E.silent_on_remove @@ expression E; @@ - E.FIND_COPIES_HARDER + E.find_copies_harder @@ expression E; @@ - E.FOLLOW_RENAMES + E.follow_renames @@ expression E; @@ - E.RENAME_EMPTY + E.rename_empty @@ expression E; @@ - E.HAS_CHANGES + E.has_changes @@ expression E; @@ - E.QUICK + E.quick @@ expression E; @@ - E.NO_INDEX + E.no_index @@ expression E; @@ - E.ALLOW_EXTERNAL + E.allow_external @@ expression E; @@ - E.EXIT_WITH_STATUS + E.exit_with_status @@ expression E; @@ - E.REVERSE_DIFF + E.reverse_diff @@ expression E; @@ - E.CHECK_FAILED + E.check_failed @@ expression E; @@ - E.RELATIVE_NAME + E.relative_name @@ expression E; @@ - E.IGNORE_SUBMODULES + E.ignore_submodules @@ expression E; @@ - E.DIRSTAT_CUMULATIVE + E.dirstat_cumulative @@ expression E; @@ - E.DIRSTAT_BY_FILE + E.dirstat_by_file @@ expression E; @@ - E.ALLOW_TEXTCONV + E.allow_textconv @@ expression E; @@ - E.TEXTCONV_SET_VIA_CMDLINE + E.textconv_set_via_cmdline @@ expression E; @@ - E.DIFF_FROM_CONTENTS + E.diff_from_contents @@ expression E; @@ - E.DIRTY_SUBMODULES + E.dirty_submodules @@ expression E; @@ - E.IGNORE_UNTRACKED_IN_SUBMODULES + E.ignore_untracked_in_submodules @@ expression E; @@ - E.IGNORE_DIRTY_SUBMODULES + E.ignore_dirty_submodules @@ expression E; @@ - E.OVERRIDE_SUBMODULE_CONFIG + E.override_submodule_config @@ expression E; @@ - E.DIRSTAT_BY_LINE + E.dirstat_by_line @@ expression E; @@ - E.FUNCCONTEXT + E.funccontext @@ expression E; @@ - E.PICKAXE_IGNORE_CASE + E.pickaxe_ignore_case @@ expression E; @@ - E.DEFAULT_FOLLOW_RENAMES + E.default_follow_renames Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:51:40 +09:00
Brandon Williams	b2100e5291	diff: remove DIFF_OPT_CLR macro Remove the `DIFF_OPT_CLR` macro and instead set the flags directly. This conversion is done using the following semantic patch: @@ expression E; identifier fld; @@ - DIFF_OPT_CLR(&E, fld) + E.flags.fld = 0 @@ type T; T *ptr; identifier fld; @@ - DIFF_OPT_CLR(ptr, fld) + ptr->flags.fld = 0 Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:51:30 +09:00
Brandon Williams	23dcf77f48	diff: remove DIFF_OPT_SET macro Remove the `DIFF_OPT_SET` macro and instead set the flags directly. This conversion is done using the following semantic patch: @@ expression E; identifier fld; @@ - DIFF_OPT_SET(&E, fld) + E.flags.fld = 1 @@ type T; T *ptr; identifier fld; @@ - DIFF_OPT_SET(ptr, fld) + ptr->flags.fld = 1 Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:50:03 +09:00
Brandon Williams	3b69daed86	diff: remove DIFF_OPT_TST macro Remove the `DIFF_OPT_TST` macro and instead access the flags directly. This conversion is done using the following semantic patch: @@ expression E; identifier fld; @@ - DIFF_OPT_TST(&E, fld) + E.flags.fld @@ type T; T *ptr; identifier fld; @@ - DIFF_OPT_TST(ptr, fld) + ptr->flags.fld Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:50:03 +09:00
Jeff King	a937b37e76	revision: quit pruning diff more quickly when possible When the revision traversal machinery is given a pathspec, we must compute the parent-diff for each commit to determine which ones are TREESAME. We set the QUICK diff flag to avoid looking at more entries than we need; we really just care whether there are any changes at all. But there is one case where we want to know a bit more: if --remove-empty is set, we care about finding cases where the change consists only of added entries (in which case we may prune the parent in try_to_simplify_commit()). To cover that case, our file_add_remove() callback does not quit the diff upon seeing an added entry; it keeps looking for other types of entries. But this means when --remove-empty is not set (and it is not by default), we compute more of the diff than is necessary. You can see this in a pathological case where a commit adds a very large number of entries, and we limit based on a broad pathspec. E.g.: perl -e ' chomp(my $blob = `git hash-object -w --stdin </dev/null`); for my $a (1..1000) { for my $b (1..1000) { print "100644 $blob\t$a/$b\n"; } } ' \| git update-index --index-info git commit -qm add git rev-list HEAD -- . This case takes about 100ms now, but after this patch only needs 6ms. That's not a huge improvement, but it's easy to get and it protects us against even more pathological cases (e.g., going from 1 million to 10 million files would take ten times as long with the current code, but not increase at all after this patch). This is reported to minorly speed-up pathspec limiting in real world repositories (like the 100-million-file Windows repository), but probably won't make a noticeable difference outside of pathological setups. This patch actually covers the case without --remove-empty, and the case where we see only deletions. See the in-code comment for details. Note that we have to add a new member to the diff_options struct so that our callback can see the value of revs->remove_empty_trees. This callback parameter could be passed to the "add_remove" and "change" callbacks, but there's not much point. They already receive the diff_options struct, and doing it this way avoids having to update the function signature of the other callbacks (arguably the format_callback and output_prefix functions could benefit from the same simplification). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-14 11:43:49 +09:00
Junio C Hamano	14a8168e2f	Merge branch 'rj/no-sign-compare' Many codepaths have been updated to squelch -Wsign-compare warnings. * rj/no-sign-compare: ALLOC_GROW: avoid -Wsign-compare warnings cache.h: hex2chr() - avoid -Wsign-compare warnings commit-slab.h: avoid -Wsign-compare warnings git-compat-util.h: xsize_t() - avoid -Wsign-compare warnings	2017-09-29 11:23:42 +09:00
Junio C Hamano	73ecdc606e	Merge branch 'rs/resolve-ref-optional-result' Code clean-up. * rs/resolve-ref-optional-result: refs: pass NULL to resolve_ref_unsafe() if hash is not needed refs: pass NULL to refs_resolve_ref_unsafe() if hash is not needed refs: make sha1 output parameter of refs_resolve_ref_unsafe() optional	2017-09-28 14:47:56 +09:00
René Scharfe	744c040b19	refs: pass NULL to resolve_ref_unsafe() if hash is not needed This allows us to get rid of some write-only variables, among them seven SHA1 buffers. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-24 10:18:21 +09:00
Ramsay Jones	071bcaab64	ALLOC_GROW: avoid -Wsign-compare warnings Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-22 13:21:11 +09:00
Jeff King	7fa3c2ad6d	revision: replace "struct cmdline_pathspec" with argv_array We assemble an array of strings in a custom struct, NULL-terminate the result, and then pass it to parse_pathspec(). But then we never free the array or the individual strings (nor can we do the latter, as they are heap-allocated when they come from stdin but not when they come from the passed-in argv). Let's swap this out for an argv_array. It does the same thing with fewer lines of code, and it's safe to call argv_array_clear() at the end to avoid a memory leak. Reported-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-21 13:09:46 +09:00
Junio C Hamano	8a044c7f1d	Merge branch 'nd/prune-in-worktree' "git gc" and friends when multiple worktrees are used off of a single repository did not consider the index and per-worktree refs of other worktrees as the root for reachability traversal, making objects that are in use only in other worktrees to be subject to garbage collection. * nd/prune-in-worktree: refs.c: reindent get_submodule_ref_store() refs.c: remove fallback-to-main-store code get_submodule_ref_store() rev-list: expose and document --single-worktree revision.c: --reflog add HEAD reflog from all worktrees files-backend: make reflog iterator go through per-worktree reflog revision.c: --all adds HEAD from all worktrees refs: remove dead for_each__submodule() refs.c: move for_each_remote_ref_submodule() to submodule.c revision.c: use refs_for_each() instead of for_each_*_submodule() refs: add refs_head_ref() refs: move submodule slash stripping code to get_submodule_ref_store refs.c: refactor get_submodule_ref_store(), share common free block revision.c: --indexed-objects add objects from all worktrees revision.c: refactor add_index_objects_to_pending() refs.c: use is_dir_sep() in resolve_gitlink_ref() revision.h: new flag in struct rev_info wrt. worktree-related refs	2017-09-19 10:47:53 +09:00
Junio C Hamano	c2a3bb47f0	Merge branch 'jk/rev-list-empty-input' into maint "git log --tag=no-such-tag" showed log starting from HEAD, which has been fixed---it now shows nothing. * jk/rev-list-empty-input: revision: do not fallback to default when rev_input_given is set rev-list: don't show usage when we see empty ref patterns revision: add rev_input_given flag t6018: flesh out empty input/output rev-list tests	2017-09-10 17:02:48 +09:00
Nguyễn Thái Ngọc Duy	32619f99f9	rev-list: expose and document --single-worktree Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-24 14:58:50 -07:00
Nguyễn Thái Ngọc Duy	acd9544a8f	revision.c: --reflog add HEAD reflog from all worktrees Note that add_other_reflogs_to_pending() is a bit inefficient, since it scans reflog for all refs of each worktree, including shared refs, so the shared ref's reflog is scanned over and over again. We could update refs API to pass "per-worktree only" flag to avoid that. But long term we should be able to obtain a "per-worktree only" ref store and would need to revert the changes in reflog iteration API. So let's just wait until then. add_reflogs_to_pending() is called by reachable.c so by default "git prune" will examine reflog from all worktrees. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-24 14:58:47 -07:00
Nguyễn Thái Ngọc Duy	d0c39a49cc	revision.c: --all adds HEAD from all worktrees Unless single_worktree is set, --all now adds HEAD from all worktrees. Since reachable.c code does not use setup_revisions(), we need to call other_head_refs_submodule() explicitly there to have the same effect on "git prune", so that we won't accidentally delete objects needed by some other HEADs. A new FIXME is added because we would need something like int refs_other_head_refs(struct ref_store , each_ref_fn, cb_data); in addition to other_head_refs() to handle it, which might require int get_submodule_worktrees(const char submodule, int flags); It could be a separate topic to reduce the scope of this one. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-24 14:56:43 -07:00
Nguyễn Thái Ngọc Duy	073cf63c52	revision.c: use refs_for_each() instead of for_each__submodule() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-24 14:52:33 -07:00
Nguyễn Thái Ngọc Duy	be489d02d2	revision.c: --indexed-objects add objects from all worktrees This is the result of single_worktree flag never being set (no way to up until now). To get objects from current index only, set single_worktree. The other add_index_objects_to_pending's caller is mark_reachable_objects() (e.g. "git prune") which also mark objects from all indexes. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-24 14:44:41 -07:00
Nguyễn Thái Ngọc Duy	6c3d818154	revision.c: refactor add_index_objects_to_pending() The core code is factored out and take 'struct index_state *' instead so that we can reuse it to add objects from index files other than .git/index in the next patch. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-24 14:42:45 -07:00
Jonathan Tan	150e3001d0	pack: move has_sha1_pack() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Junio C Hamano	0cb526e031	Merge branch 'jk/reflog-walk' into maint Numerous bugs in walking of reflogs via "log -g" and friends have been fixed. * jk/reflog-walk: reflog-walk: apply --since/--until to reflog dates reflog-walk: stop using fake parents rev-list: check reflog_info before showing usage get_revision_1(): replace do-while with an early return log: do not free parents when walking reflog log: clarify comment about reflog cycles revision: disallow reflog walking with revs->limited t1414: document some reflog-walk oddities	2017-08-23 14:33:44 -07:00
Junio C Hamano	8fbaf0b13b	Merge branch 'jk/rev-list-empty-input' "git log --tag=no-such-tag" showed log starting from HEAD, which has been fixed---it now shows nothing. * jk/rev-list-empty-input: revision: do not fallback to default when rev_input_given is set rev-list: don't show usage when we see empty ref patterns revision: add rev_input_given flag t6018: flesh out empty input/output rev-list tests	2017-08-11 13:27:07 -07:00
Junio C Hamano	3ab01ac3f7	Merge branch 'jk/reflog-walk' Numerous bugs in walking of reflogs via "log -g" and friends have been fixed. * jk/reflog-walk: reflog-walk: apply --since/--until to reflog dates reflog-walk: stop using fake parents rev-list: check reflog_info before showing usage get_revision_1(): replace do-while with an early return log: do not free parents when walking reflog log: clarify comment about reflog cycles revision: disallow reflog walking with revs->limited t1414: document some reflog-walk oddities	2017-08-11 13:27:01 -07:00
Junio C Hamano	df422678a8	Merge branch 'bc/object-id' Conversion from uchar[20] to struct object_id continues. * bc/object-id: sha1_name: convert uses of 40 to GIT_SHA1_HEXSZ sha1_name: convert GET_SHA1* flags to GET_OID* sha1_name: convert get_sha1* to get_oid* Convert remaining callers of get_sha1 to get_oid. builtin/unpack-file: convert to struct object_id bisect: convert bisect_checkout to struct object_id builtin/update_ref: convert to struct object_id sequencer: convert to struct object_id remote: convert struct push_cas to struct object_id submodule: convert submodule config lookup to use object_id builtin/merge-tree: convert remaining caller of get_sha1 to object_id builtin/fsck: convert remaining caller of get_sha1 to object_id	2017-08-11 13:26:55 -07:00
Jeff King	5d34d1ac06	revision: do not fallback to default when rev_input_given is set If revs->def is set (as it is in "git log") and there are no pending objects after parsing the user's input, then we show whatever is in "def". But if the user _did_ ask for some input that just happened to be empty (e.g., "--glob" that does not match anything), showing the default revision is confusing. We should just show nothing, as that is what the user's request yielded. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-02 15:45:22 -07:00
Jeff King	7ba826290a	revision: add rev_input_given flag Normally a caller that invokes setup_revisions() has to check rev.pending to see if anything was actually queued for the traversal. But they can't tell the difference between two cases: 1. The user gave us no tip from which to start a traversal. 2. The user tried to give us tips via --glob, --all, etc, but their patterns ended up being empty. Let's set a flag in the rev_info struct that callers can use to tell the difference. We can set this from the init_all_refs_cb() function. That's a little funny because it's not exactly about initializing the "cb" struct itself. But that function is the common setup place for doing pattern traversals that is used by --glob, --all, etc. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-02 15:45:20 -07:00
brian m. carlson	321c89bf5f	sha1_name: convert GET_SHA1* flags to GET_OID* Convert the flags for get_oid_with_context and friends to use "OID" instead of "SHA1" in their names. This transform was made by running the following one-liner on the affected files: perl -pi -e 's/GET_SHA1/GET_OID/g' Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-07-17 13:54:51 -07:00
brian m. carlson	e82caf384b	sha1_name: convert get_sha1* to get_oid* Now that all the callers of get_sha1 directly or indirectly use struct object_id, rename the functions starting with get_sha1 to start with get_oid. Convert the internals in sha1_name.c to use struct object_id as well, and eliminate explicit length checks where possible. Convert a use of 40 in get_oid_basic to GIT_SHA1_HEXSZ. Outside of sha1_name.c and cache.h, this transition was made with the following semantic patch: @@ expression E1, E2; @@ - get_sha1(E1, E2.hash) + get_oid(E1, &E2) @@ expression E1, E2; @@ - get_sha1(E1, E2->hash) + get_oid(E1, E2) @@ expression E1, E2; @@ - get_sha1_committish(E1, E2.hash) + get_oid_committish(E1, &E2) @@ expression E1, E2; @@ - get_sha1_committish(E1, E2->hash) + get_oid_committish(E1, E2) @@ expression E1, E2; @@ - get_sha1_treeish(E1, E2.hash) + get_oid_treeish(E1, &E2) @@ expression E1, E2; @@ - get_sha1_treeish(E1, E2->hash) + get_oid_treeish(E1, E2) @@ expression E1, E2; @@ - get_sha1_commit(E1, E2.hash) + get_oid_commit(E1, &E2) @@ expression E1, E2; @@ - get_sha1_commit(E1, E2->hash) + get_oid_commit(E1, E2) @@ expression E1, E2; @@ - get_sha1_tree(E1, E2.hash) + get_oid_tree(E1, &E2) @@ expression E1, E2; @@ - get_sha1_tree(E1, E2->hash) + get_oid_tree(E1, E2) @@ expression E1, E2; @@ - get_sha1_blob(E1, E2.hash) + get_oid_blob(E1, &E2) @@ expression E1, E2; @@ - get_sha1_blob(E1, E2->hash) + get_oid_blob(E1, E2) @@ expression E1, E2, E3, E4; @@ - get_sha1_with_context(E1, E2, E3.hash, E4) + get_oid_with_context(E1, E2, &E3, E4) @@ expression E1, E2, E3, E4; @@ - get_sha1_with_context(E1, E2, E3->hash, E4) + get_oid_with_context(E1, E2, E3, E4) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-07-17 13:54:51 -07:00
Junio C Hamano	eac97b438c	Merge branch 'ab/grep-lose-opt-regflags' Code cleanup. * ab/grep-lose-opt-regflags: grep: remove redundant REG_NEWLINE when compiling fixed regex grep: remove regflags from the public grep_opt API grep: remove redundant and verbose re-assignments to 0 grep: remove redundant "fixed" field re-assignment to 0 grep: adjust a redundant grep pattern type assignment grep: remove redundant double assignment to 0	2017-07-13 16:14:54 -07:00
Junio C Hamano	0c6435a4d6	Merge branch 'ab/wildmatch' Minor code cleanup. * ab/wildmatch: wildmatch: remove unused wildopts parameter	2017-07-10 13:42:51 -07:00
Jeff King	de239446b6	reflog-walk: apply --since/--until to reflog dates When doing a reflog walk, we use the commit's date to do any date limiting. In earlier versions of Git, this could lead to nonsense results, since a skipped commit would truncate the traversal. So a sequence like: git commit ... git checkout week-old-branch git checkout - git log -g --since=1.day.ago would stop at the week-old-branch, even though the "git commit" entry further back is still interesting. As of the prior commit, which uses a parent-less traversal of the reflog, you get the whole reflog minus any commits whose dates do not match the specified options. This is arguably useful, as you could scan the reflogs for commits that originated in a certain range. But more likely a user doing a reflog walk wants to limit based on the reflog entries themselves. You can simulate --until with: git log -g @{1.day.ago} but there's no way to ask Git to traverse only back to a certain date. E.g.: # show me reflog entries from the past day git log -g --since=1.day.ago This patch teaches the revision machinery to prefer the reflog entry dates to the commit dates when doing a reflog walk. Technically this is a change in behavior that affects plumbing, but the previous behavior was so buggy that it's unlikely anyone was relying on it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-07-09 10:00:48 -07:00
Jeff King	d08565bf2d	reflog-walk: stop using fake parents The reflog-walk system works by putting a ref's tip into the pending queue, and then "traversing" the reflog by pretending that the parent of each commit is the previous reflog entry. This causes a number of user-visible oddities, as documented in t1414 (and the commit message which introduced it). We can fix all of them in one go by replacing the fake-reflog system with a much simpler one: just keeping a list of reflogs to show, and walking through them entry by entry. The implementation is fairly straight-forward, but there are a few items to note: 1. We obviously must skip calling add_parents_to_list() when we are traversing reflogs, since we do not want to walk the original parents at all. As a result, we must call try_to_simplify_commit() ourselves. There are other parts of add_parents_to_list() we skip, as well, but none of them should matter for a reflog traversal: - We do not allow UNINTERESTING commits, nor symmetric ranges (and we bail when these are used with "-g"). - Using --source makes no sense, since we aren't traversing. The reflog selector shows the same information with more detail. - Using --first-parent is still sensible, since you may want to see the first-parent diff for each entry. But since we're not traversing, we don't need to cull the parent list here. 2. Since we now just walk the reflog entries themselves, rather than starting with the ref tip, we now look at the "new" field of each entry rather than the "old" (i.e., we are showing entries, not faking parents). This removes all of the tricky logic around skipping past root commits. But note that we have no way to show an entry with the null sha1 in its "new" field (because such a commit obviously does not exist). Normally this would not happen, since we delete reflogs along with refs, but there is one special case. When we rename the currently checked out branch, we write two reflog entries into the HEAD log: one where the commit goes away, and another where it comes back. Prior to this commit, we show both entries with identical reflog messages. After this commit, we show only the "comes back" entry. See the update in t3200 which demonstrates this. Arguably either is fine, as the whole double-entry thing is a bit hacky in the first place. And until a recent fix, we truncated the traversal in such a case anyway, which was _definitely_ wrong. 3. We show individual reflogs in order, but choose which reflog to show at each stage based on which has the most recent timestamp. This interleaves the output from multiple reflogs based on date order, which is probably what you'd want with limiting like "-n 30". Note that the implementation aims for simplicity. It does a linear walk over the reflog queue for each commit it pulls, which may perform badly if you interleave an enormous number of reflogs. That seems like an unlikely use case; if we did want to handle it, we could probably keep a priority queue of reflogs, ordered by the timestamp of their current tip entry. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-07-09 10:00:48 -07:00
Jeff King	7c2f08aa7a	get_revision_1(): replace do-while with an early return The get_revision_1() function tries to avoid entering its main loop at all when there are no commits to look at. But it's perfectly safe to call pop_commit() on an empty list (in which case it will return NULL). Switching to an early return from the loop lets us skip repeating the loop condition before we enter the do-while. That will get more important when we start pulling reflog-walk commits from a source besides the revs->commits queue, as that condition will get much more complicated. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-07-09 10:00:48 -07:00
Jeff King	82fd0f4a4b	revision: disallow reflog walking with revs->limited The reflog-walk code doesn't work with limit_list(). That function traverses down the real history graph, not the fake reflog history that get_revision() returns. So it's not going to actually examine all of the commits we're going to show, because we'd add them to the pending list only during the actual traversal. In practice this limitation doesn't really matter, because the options that require list-limiting generally need UNINTERESTING endpoints or symmetric ranges, which already are forbidden for reflog walks. Still, there are likely some corner cases that would behave oddly. We're better off to warn the user that we can't fulfill their request than to generate potentially wrong output. This will also make it easier to refactor the reflog-walking code, because it eliminates a whole area of corner cases we'd have to consider (that already don't work anyway). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-07-07 10:04:53 -07:00
Ævar Arnfjörð Bjarmason	07a3d41173	grep: remove regflags from the public grep_opt API Refactor calls to the grep machinery to always pass opt.ignore_case & opt.extended_regexp_option instead of setting the equivalent regflags bits. The bug fixed when making -i work with -P in commit `9e3cbc59d5` ("log: make --regexp-ignore-case work with --perl-regexp", 2017-05-20) was really just plastering over the code smell which this change fixes. The reason for adding the extensive commentary here is that I discovered some subtle complexity in implementing this that really should be called out explicitly to future readers. Before this change we'd rely on the difference between `extended_regexp_option` and `regflags` to serve as a membrane between our preliminary parsing of grep.extendedRegexp and grep.patternType, and what we decided to do internally. Now that those two are the same thing, it's necessary to unset `extended_regexp_option` just before we commit in cases where both of those config variables are set. See `84befcd0a4` ("grep: add a grep.patternType configuration setting", 2012-08-03) for the code and documentation related to that. The explanation of why the if/else branches in grep_commit_pattern_type() are ordered the way they are exists in that commit message, but I think it's worth calling this subtlety out explicitly with a comment for future readers. Even though grep_commit_pattern_type() is the only caller of grep_set_pattern_type_option() it's simpler to reset the extended_regexp_option flag in the latter, since 2/3 branches in the former would otherwise need to reset it, this way we can do it in one place. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 10:06:24 -07:00
Junio C Hamano	5c83d850d0	Merge branch 'mh/packed-ref-store-prep' Bugfix for a topic that is (only) in 'master'. * mh/packed-ref-store-prep: for_each_bisect_ref(): don't trim refnames lock_packed_refs(): fix cache validity check	2017-06-26 14:09:29 -07:00
Ævar Arnfjörð Bjarmason	55d3426929	wildmatch: remove unused wildopts parameter Remove the unused wildopts placeholder struct from being passed to all wildmatch() invocations, or rather remove all the boilerplate NULL parameters. This parameter was added back in commit `9b3497cab9` ("wildmatch: rename constants and update prototype", 2013-01-01) as a placeholder for future use. Over 4 years later nothing has made use of it, let's just remove it. It can be added in the future if we find some reason to start using such a parameter. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-23 18:27:07 -07:00
Junio C Hamano	e77d58a94f	Merge branch 'sg/revision-parser-skip-prefix' Code clean-up. * sg/revision-parser-skip-prefix: revision.c: use skip_prefix() in handle_revision_pseudo_opt() revision.c: use skip_prefix() in handle_revision_opt() revision.c: stricter parsing of '--early-output' revision.c: stricter parsing of '--no-{min,max}-parents' revision.h: turn rev_info.early_output back into an unsigned int	2017-06-22 14:15:23 -07:00
Junio C Hamano	a6f38c109b	Merge branch 'bw/object-id' Conversion from uchar[20] to struct object_id continues. * bw/object-id: (33 commits) diff: rename diff_fill_sha1_info to diff_fill_oid_info diffcore-rename: use is_empty_blob_oid tree-diff: convert path_appendnew to object_id tree-diff: convert diff_tree_paths to struct object_id tree-diff: convert try_to_follow_renames to struct object_id builtin/diff-tree: cleanup references to sha1 diff-tree: convert diff_tree_sha1 to struct object_id notes-merge: convert write_note_to_worktree to struct object_id notes-merge: convert verify_notes_filepair to struct object_id notes-merge: convert find_notes_merge_pair_ps to struct object_id notes-merge: convert merge_from_diffs to struct object_id notes-merge: convert notes_merge* to struct object_id tree-diff: convert diff_root_tree_sha1 to struct object_id combine-diff: convert find_paths_* to struct object_id combine-diff: convert diff_tree_combined to struct object_id diff: convert diff_flush_patch_id to struct object_id patch-ids: convert to struct object_id diff: finish conversion for prepare_temp_file to struct object_id diff: convert reuse_worktree_file to struct object_id diff: convert fill_filespec to struct object_id ...	2017-06-19 12:38:44 -07:00
Junio C Hamano	ae7e4d4fed	Merge branch 'ab/pcre-v2' Update "perl-compatible regular expression" support to enable JIT and also allow linking with the newer PCRE v2 library. * ab/pcre-v2: grep: add support for PCRE v2 grep: un-break building with PCRE >= 8.32 without --enable-jit grep: un-break building with PCRE < 8.20 grep: un-break building with PCRE < 8.32 grep: add support for the PCRE v1 JIT API log: add -P as a synonym for --perl-regexp grep: skip pthreads overhead when using one thread grep: don't redundantly compile throwaway patterns under threading	2017-06-19 12:38:43 -07:00
Michael Haggerty	03df567fbf	for_each_bisect_ref(): don't trim refnames `for_each_bisect_ref()` is called by `for_each_bad_bisect_ref()` with a term "bad". This used to make it call `for_each_ref_in_submodule()` with a prefix "refs/bisect/bad". But the latter is the name of the reference that is being sought, so the empty string was being passed to the callback as the trimmed refname. Moreover, this questionable practice was turned into an error by `b9c8e7f2fb` prefix_ref_iterator: don't trim too much, 2017-05-22 It makes more sense (and agrees better with the documentation of `--bisect`) for the callers to receive the full reference names. So * Add a new function, `for_each_fullref_in_submodule()`, to the refs API. This plugs a gap in the existing functionality, analogous to `for_each_fullref_in()` but accepting a `submodule` argument. * Change `for_each_bad_bisect_ref()` to call the new function rather than `for_each_ref_in_submodule()`. * Add a test. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-18 22:13:42 -07:00
SZEDER Gábor	8b1d9136e1	revision.c: use skip_prefix() in handle_revision_pseudo_opt() Instead of starts_with() and a bunch of magic numbers. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-12 13:39:52 -07:00
SZEDER Gábor	479b3d9785	revision.c: use skip_prefix() in handle_revision_opt() Instead of starts_with() and a bunch of magic numbers. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-12 13:39:49 -07:00
SZEDER Gábor	dffc651ed1	revision.c: stricter parsing of '--early-output' The parsing of '--early-output' with or without its optional integer argument allowed bogus options like '--early-output-foobarbaz' to slip through and be ignored. Fix it by parsing '--early-output' in the same way as other options with an optional argument are parsed. Furthermore, use strtoul_ui() to parse the optional integer argument and to refuse negative numbers. While at it, use skip_prefix() instead of starts_with() and magic numbers. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-12 13:39:46 -07:00
SZEDER Gábor	9ada7aee19	revision.c: stricter parsing of '--no-{min,max}-parents' These two options are parsed using starts_with(), allowing things like 'git log --no-min-parents-foobarbaz' to succeed. Use strcmp() instead. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-12 13:39:43 -07:00
Brandon Williams	66f414f885	diff-tree: convert diff_tree_sha1 to struct object_id Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-05 11:23:58 +09:00
Junio C Hamano	36dcb57337	Merge branch 'ab/grep-preparatory-cleanup' The internal implementation of "git grep" has seen some clean-up. * ab/grep-preparatory-cleanup: (31 commits) grep: assert that threading is enabled when calling grep_{lock,unlock} grep: given --threads with NO_PTHREADS=YesPlease, warn pack-objects: fix buggy warning about threads pack-objects & index-pack: add test for --threads warning test-lib: add a PTHREADS prerequisite grep: move is_fixed() earlier to avoid forward declaration grep: change internal pcre variable & function names to be pcre1 grep: change the internal PCRE macro names to be PCRE1 grep: factor test for \0 in grep patterns into a function grep: remove redundant regflags assignments grep: catch a missing enum in switch statement perf: add a comparison test of log --grep regex engines with -F perf: add a comparison test of log --grep regex engines perf: add a comparison test of grep regex engines with -F perf: add a comparison test of grep regex engines perf: emit progress output when unpacking & building perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do grep: add tests to fix blind spots with \0 patterns grep: prepare for testing binary regexes containing rx metacharacters grep: add a test helper function for less verbose -f \0 tests ...	2017-06-02 15:06:06 +09:00
Junio C Hamano	7ef0d04738	Merge branch 'jk/diff-blob' The result from "git diff" that compares two blobs, e.g. "git diff $commit1:$path $commit2:$path", used to be shown with the full object name as given on the command line, but it is more natural to use the $path in the output and use it to look up .gitattributes. * jk/diff-blob: diff: use blob path for blob/file diffs diff: use pending "path" if it is available diff: use the word "path" instead of "name" for blobs diff: pass whole pending entry in blobinfo handle_revision_arg: record paths for pending objects handle_revision_arg: record modes for "a..b" endpoints t4063: add tests of direct blob diffs get_sha1_with_context: dynamically allocate oc->path get_sha1_with_context: always initialize oc->symlink_path sha1_name: consistently refer to object_context as "oc" handle_revision_arg: add handle_dotdot() helper handle_revision_arg: hoist ".." check out of range parsing handle_revision_arg: stop using "dotdot" as a generic pointer handle_revision_arg: simplify commit reference lookups handle_revision_arg: reset "dotdot" consistently	2017-06-02 15:06:05 +09:00
Brandon Williams	94a0097a41	diff: convert diff_change to struct object_id Convert diff_change to take a struct object_id. In addition convert the function pointer type 'change_fn_t' to also take a struct object_id. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Brandon Williams	c26022ea8f	diff: convert diff_addremove to struct object_id Convert diff_addremove to take a struct object_id. In addtion convert the function pointer type 'add_remove_fn_t' to also take a struct object_id. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
brian m. carlson	fb61e4d3ab	notes: convert format_display_notes to struct object_id Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:06 +09:00
Junio C Hamano	b42b41b75a	Merge branch 'jk/ignore-broken-tags-when-ignoring-missing-links' Tag objects, which are not reachable from any ref, that point at missing objects were mishandled by "git gc" and friends (they should silently be ignored instead) * jk/ignore-broken-tags-when-ignoring-missing-links: revision.c: ignore broken tags with ignore_missing_links	2017-05-29 12:34:54 +09:00
Junio C Hamano	6b526ced6f	Merge branch 'bc/object-id' Conversion from uchar[20] to struct object_id continues. * bc/object-id: (53 commits) object: convert parse_object* to take struct object_id tree: convert parse_tree_indirect to struct object_id sequencer: convert do_recursive_merge to struct object_id diff-lib: convert do_diff_cache to struct object_id builtin/ls-tree: convert to struct object_id merge: convert checkout_fast_forward to struct object_id sequencer: convert fast_forward_to to struct object_id builtin/ls-files: convert overlay_tree_on_cache to object_id builtin/read-tree: convert to struct object_id sha1_name: convert internals of peel_onion to object_id upload-pack: convert remaining parse_object callers to object_id revision: convert remaining parse_object callers to object_id revision: rename add_pending_sha1 to add_pending_oid http-push: convert process_ls_object and descendants to object_id refs/files-backend: convert many internals to struct object_id refs: convert struct ref_update to use struct object_id ref-filter: convert some static functions to struct object_id Convert struct ref_array_item to struct object_id Convert the verify_pack callback to struct object_id Convert lookup_tag to struct object_id ...	2017-05-29 12:34:43 +09:00
Ævar Arnfjörð Bjarmason	7531a2dd87	log: add -P as a synonym for --perl-regexp Add a short -P option as a synonym for the longer --perl-regexp, for consistency with the options the corresponding grep invocations accept. This was intentionally omitted in commit `727b6fc3ed` ("log --grep: accept --basic-regexp and --perl-regexp", 2012-10-03) for unspecified future use. Make it consistent with "grep" rather than to keep it open for future use, and to avoid the confusion of -P meaning different things for grep & log, as is the case with the -G option. As noted in the aforementioned commit the --basic-regexp option can't have a corresponding -G argument, as the log command already uses that for -G<regex>. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-26 12:59:05 +09:00
Jeff King	18f1ad7639	handle_revision_arg: record paths for pending objects If the revision parser sees an argument like tree:path, we parse it down to the correct blob (or tree), but throw away the "path" portion. Let's ask get_sha1_with_context() to record it, and pass it along in the pending array. This will let programs like git-diff which rely on the revision-parser show more accurate paths. Note that the implementation is a little tricky; we have to make sure we free oc.path in all code paths. For handle_dotdot(), we can piggy-back on the existing cleanup-wrapper pattern. The real work happens in handle_dotdot_1(), but the handle_dotdot() wrapper makes sure that the path is freed no matter how we exit the function (and for that reason we make sure that the object_context struct is zero'd, so if we fail to even get to the get_sha1_with_context() call, we just end up calling free(NULL)). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-24 10:59:27 +09:00
Jeff King	101dd4de16	handle_revision_arg: record modes for "a..b" endpoints The "a..b" revision syntax was designed to handle commits, so it doesn't bother to record any mode we find while traversing a "tree:path" endpoint. These days "git diff" can diff blobs using either "a:path..b:path" (with dots) or "a:path b:path" (without), but the two behave inconsistently, as the with-dots version fails to notice the mode. Let's teach the dot-dot range parser to record modes; it doesn't cost us anything, and it makes this case work. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-24 10:59:27 +09:00
Jeff King	62faad5aa5	handle_revision_arg: add handle_dotdot() helper The handle_revision_arg function is rather long, and a big chunk of it is handling the range operators. Let's pull that out to a separate helper. While we're doing so, we can clean up a few of the rough edges that made the flow hard to follow: - instead of manually restoring *dotdot (that we overwrote with a NUL), do the real work in a sub-helper, which makes it clear that the munge/restore lines are a matched pair - eliminate a goto which wasn't actually used for control flow, but only to avoid duplicating a few lines (instead, those lines are pushed into another helper function) - use early returns instead of deep nesting - consistently name all variables for the left-hand side of the range as "a" (rather than "this" or "from") and the right-hand side as "b" (rather than "next", or using the unadorned "sha1" or "flags" from the main function). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-24 10:59:27 +09:00
Jeff King	d89797feff	handle_revision_arg: hoist ".." check out of range parsing Since `003c84f6d` (specifying ranges: we did not mean to make ".." an empty set, 2011-05-02), we treat the argument ".." specially. We detect it by noticing that both sides of the range are empty, and that this is a non-symmetric two-dot range. While correct, this makes the code overly complicated. We can just detect ".." up front before we try to do further parsing. This avoids having to de-munge the NUL from dotdot, and lets us eliminate an extra const array (which we needed only to do direct pointer comparisons). It also removes the one code path from the range-parsing conditional that requires us to return -1. That will make it simpler to pull the dotdot parsing out into its own function. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-24 10:59:27 +09:00
Jeff King	f632dedd8d	handle_revision_arg: stop using "dotdot" as a generic pointer The handle_revision_arg() function has a "dotdot" variable that it uses to find a ".." or "..." in the argument. If we don't find one, we look for other marks, like "^!". But we just keep re-using the "dotdot" variable, which is confusing. Let's introduce a separate "mark" variable that can be used for these other marks. They still reuse the same variable, but at least the name is no longer actively misleading. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-24 10:59:27 +09:00
Jeff King	1d6c93817b	handle_revision_arg: simplify commit reference lookups The "dotdot" range parser avoids calling lookup_commit_reference() if we are directly fed two commits. But its casts are unnecessarily complex; that function will just return a commit we pass into it. Just calling the function all the time is much simpler, and doesn't do any significant extra work (the object is already parsed, and deref_tag() on a non-tag is a noop; we do incur one extra lookup_object() call, but that's fairly trivial). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-24 10:59:27 +09:00
Jeff King	ed79b2cf03	handle_revision_arg: reset "dotdot" consistently When we are parsing a range like "a..b", we write a temporary NUL over the first ".", so that we can access the names "a" and "b" as C strings. But our restoration of the original "." is done at inconsistent times, which can lead to confusing results. For most calls, we restore the "." after we resolve the names, but before we call verify_non_filename(). This means that when we later call add_pending_object(), the name for the left-hand "a" has been re-expanded to "a..b". You can see this with: git log --source a...b where "b" will be correctly marked with "b", but "a" will be marked with "a...b". Likewise with "a..b" (though you need to use --boundary to even see "a" at all in that case). To top off the confusion, when the REVARG_CANNOT_BE_FILENAME flag is set, we skip the non-filename check, and leave the NUL in place. That means we do report the correct name for "a" in the pending array. But some code paths try to show the whole "a...b" name in error messages, and these erroneously show only "a" instead of "a...b". E.g.: $ git cherry-pick HEAD:foo...HEAD:foo error: object d95f3ad14dee633a758d2e331151e950dd13e4ed is a blob, not a commit error: object d95f3ad14dee633a758d2e331151e950dd13e4ed is a blob, not a commit fatal: Invalid symmetric difference expression HEAD:foo (That last message should be "HEAD:foo...HEAD:foo"; I used cherry-pick because it passes the CANNOT_BE_FILENAME flag). As an interesting side note, cherry-pick actually looks at and re-resolves the arguments from the pending->name fields. So it would have been visibly broken by the first bug, but the effect was canceled out by the second one. This patch makes the whole function consistent by re-writing the NUL immediately after calling verify_non_filename(), and then restoring the "." as appropriate in some error-printing and early-return code paths. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-24 10:59:03 +09:00
Junio C Hamano	ca7b2ab07d	Merge branch 'bc/object-id' * bc/object-id: (53 commits) object: convert parse_object* to take struct object_id tree: convert parse_tree_indirect to struct object_id sequencer: convert do_recursive_merge to struct object_id diff-lib: convert do_diff_cache to struct object_id builtin/ls-tree: convert to struct object_id merge: convert checkout_fast_forward to struct object_id sequencer: convert fast_forward_to to struct object_id builtin/ls-files: convert overlay_tree_on_cache to object_id builtin/read-tree: convert to struct object_id sha1_name: convert internals of peel_onion to object_id upload-pack: convert remaining parse_object callers to object_id revision: convert remaining parse_object callers to object_id revision: rename add_pending_sha1 to add_pending_oid http-push: convert process_ls_object and descendants to object_id refs/files-backend: convert many internals to struct object_id refs: convert struct ref_update to use struct object_id ref-filter: convert some static functions to struct object_id Convert struct ref_array_item to struct object_id Convert the verify_pack callback to struct object_id Convert lookup_tag to struct object_id ...	2017-05-23 14:29:19 +09:00
Ævar Arnfjörð Bjarmason	9e3cbc59d5	log: make --regexp-ignore-case work with --perl-regexp Make the --regexp-ignore-case option work with --perl-regexp. This never worked, and there was no test for this. Fix the bug and add a test. When PCRE support was added in commit `63e7e9d8b6` ("git-grep: Learn PCRE", 2011-05-09) compile_pcre_regexp() would only check opt->ignore_case, but when the --perl-regexp option was added in commit `727b6fc3ed` ("log --grep: accept --basic-regexp and --perl-regexp", 2012-10-03) the code didn't set the opt->ignore_case. Change the test suite to test for -i and --invert-regexp with basic/extended/perl patterns in addition to fixed, which was the only patternType that was tested for before in combination with those options. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-21 08:25:37 +09:00
Jeff King	a3ba6bf10a	revision.c: ignore broken tags with ignore_missing_links When peeling a tag for prepare_revision_walk(), we do not respect the ignore_missing_links flag. This can lead to a bogus error when pack-objects walks the possibly-broken unreachable-but-recent part of the object graph. The other link-following all happens via traverse_commit_list(), which explains why this case was missed. And our tests covered only broken links from commits. Let's be more comprehensive and cover broken tree entries (which do work) and tags (which shows off this bug). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-20 18:32:58 +09:00
brian m. carlson	c251c83df2	object: convert parse_object* to take struct object_id Make parse_object, parse_object_or_die, and parse_object_buffer take a pointer to struct object_id. Remove the temporary variables inserted earlier, since they are no longer necessary. Transform all of the callers using the following semantic patch: @@ expression E1; @@ - parse_object(E1.hash) + parse_object(&E1) @@ expression E1; @@ - parse_object(E1->hash) + parse_object(E1) @@ expression E1, E2; @@ - parse_object_or_die(E1.hash, E2) + parse_object_or_die(&E1, E2) @@ expression E1, E2; @@ - parse_object_or_die(E1->hash, E2) + parse_object_or_die(E1, E2) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1.hash, E2, E3, E4, E5) + parse_object_buffer(&E1, E2, E3, E4, E5) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1->hash, E2, E3, E4, E5) + parse_object_buffer(E1, E2, E3, E4, E5) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:58 +09:00
brian m. carlson	654b9a905c	revision: convert remaining parse_object callers to object_id Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:58 +09:00
brian m. carlson	a58a1b01ff	revision: rename add_pending_sha1 to add_pending_oid Rename this function and convert it to take a pointer to struct object_id. This is a prerequisite for converting get_reference, which is needed to convert parse_object. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:58 +09:00
brian m. carlson	740ee055c6	Convert lookup_tree to struct object_id Convert the lookup_tree function to take a pointer to struct object_id. The commit was created with manual changes to tree.c, tree.h, and object.c, plus the following semantic patch: @@ @@ - lookup_tree(EMPTY_TREE_SHA1_BIN) + lookup_tree(&empty_tree_oid) @@ expression E1; @@ - lookup_tree(E1.hash) + lookup_tree(&E1) @@ expression E1; @@ - lookup_tree(E1->hash) + lookup_tree(E1) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
brian m. carlson	3aca1fc6c9	Convert lookup_blob to struct object_id Convert lookup_blob to take a pointer to struct object_id. The commit was created with manual changes to blob.c and blob.h, plus the following semantic patch: @@ expression E1; @@ - lookup_blob(E1.hash) + lookup_blob(&E1) @@ expression E1; @@ - lookup_blob(E1->hash) + lookup_blob(E1) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
brian m. carlson	bc83266abe	Convert lookup_commit* to struct object_id Convert lookup_commit, lookup_commit_or_die, lookup_commit_reference, and lookup_commit_reference_gently to take struct object_id arguments. Introduce a temporary in parse_object buffer in order to convert this function. This is required since in order to convert parse_object and parse_object_buffer, lookup_commit_reference_gently and lookup_commit_or_die would need to be converted. Not introducing a temporary would therefore require that lookup_commit_or_die take a struct object_id , but lookup_commit would take unsigned char , leaving a confusing and hard-to-use interface. parse_object_buffer will lose this temporary in a later patch. This commit was created with manual changes to commit.c, commit.h, and object.c, plus the following semantic patch: @@ expression E1, E2; @@ - lookup_commit_reference_gently(E1.hash, E2) + lookup_commit_reference_gently(&E1, E2) @@ expression E1, E2; @@ - lookup_commit_reference_gently(E1->hash, E2) + lookup_commit_reference_gently(E1, E2) @@ expression E1; @@ - lookup_commit_reference(E1.hash) + lookup_commit_reference(&E1) @@ expression E1; @@ - lookup_commit_reference(E1->hash) + lookup_commit_reference(E1) @@ expression E1; @@ - lookup_commit(E1.hash) + lookup_commit(&E1) @@ expression E1; @@ - lookup_commit(E1->hash) + lookup_commit(E1) @@ expression E1, E2; @@ - lookup_commit_or_die(E1.hash, E2) + lookup_commit_or_die(&E1, E2) @@ expression E1, E2; @@ - lookup_commit_or_die(E1->hash, E2) + lookup_commit_or_die(E1, E2) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
brian m. carlson	68ab61dd09	revision: convert prepare_show_merge to struct object_id This is a caller of lookup_commit_or_die, which we will convert later on. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
brian m. carlson	e0a9280404	Convert struct cache_tree to use struct object_id Convert the sha1 member of struct cache_tree to struct object_id by changing the definition and applying the following semantic patch, plus the standard object_id transforms: @@ struct cache_tree E1; @@ - E1.sha1 + E1.oid.hash @@ struct cache_tree *E1; @@ - E1->sha1 + E1->oid.hash Fix up one reference to active_cache_tree which was not automatically caught by Coccinelle. These changes are prerequisites for converting parse_object. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-02 10:46:41 +09:00
Johannes Schindelin	dddbad728c	timestamp_t: a new data type for timestamps Git's source code assumes that unsigned long is at least as precise as time_t. Which is incorrect, and causes a lot of problems, in particular where unsigned long is only 32-bit (notably on Windows, even in 64-bit versions). So let's just use a more appropriate data type instead. In preparation for this, we introduce the new `timestamp_t` data type. By necessity, this is a very, very large patch, as it has to replace all timestamps' data type in one go. As we will use a data type that is not necessarily identical to `time_t`, we need to be very careful to use `time_t` whenever we interact with the system functions, and `timestamp_t` everywhere else. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-27 13:07:39 +09:00
Junio C Hamano	e1fae93019	Merge branch 'bc/object-id' "uchar [40]" to "struct object_id" conversion continues. * bc/object-id: wt-status: convert to struct object_id builtin/merge-base: convert to struct object_id Convert object iteration callbacks to struct object_id sha1_file: introduce an nth_packed_object_oid function refs: simplify parsing of reflog entries refs: convert each_reflog_ent_fn to struct object_id reflog-walk: convert struct reflog_info to struct object_id builtin/replace: convert to struct object_id Convert remaining callers of resolve_refdup to object_id builtin/merge: convert to struct object_id builtin/clone: convert to struct object_id builtin/branch: convert to struct object_id builtin/grep: convert to struct object_id builtin/fmt-merge-message: convert to struct object_id builtin/fast-export: convert to struct object_id builtin/describe: convert to struct object_id builtin/diff-tree: convert to struct object_id builtin/commit: convert to struct object_id hex: introduce parse_oid_hex	2017-03-17 13:50:25 -07:00
Jeff King	0e9f62dab9	interpret_branch_name: allow callers to restrict expansions The interpret_branch_name() function converts names like @{-1} and @{upstream} into branch names. The expanded ref names are not fully qualified, and may be outside of the refs/heads/ namespace (e.g., "@" expands to "HEAD", and "@{upstream}" is likely to be in "refs/remotes/"). This is OK for callers like dwim_ref() which are primarily interested in resolving the resulting name, no matter where it is. But callers like "git branch" treat the result as a branch name in refs/heads/. When we expand to a ref outside that namespace, the results are very confusing (e.g., "git branch @" tries to create refs/heads/HEAD, which is nonsense). Callers can't know from the returned string how the expansion happened (e.g., did the user really ask for a branch named "HEAD", or did we do a bogus expansion?). One fix would be to return some out-parameters describing the types of expansion that occurred. This has the benefit that the caller can generate precise error messages ("I understood @{upstream} to mean origin/master, but that is a remote tracking branch, so you cannot create it as a local name"). However, out-parameters make the function interface somewhat cumbersome. Instead, let's do the opposite: let the caller tell us which elements to expand. That's easier to pass in, and none of the callers give more precise error messages than "@{upstream} isn't a valid branch name" anyway (which should be sufficient). The strbuf_branchname() function needs a similar parameter, as most of the callers access interpret_branch_name() through it. We can break the callers down into two groups: 1. Callers that are happy with any kind of ref in the result. We pass "0" here, so they continue to work without restrictions. This includes merge_name(), the reflog handling in add_pending_object_with_path(), and substitute_branch_name(). This last is what powers dwim_ref(). 2. Callers that have funny corner cases (mostly in git-branch and git-checkout). These need to make use of the new parameter, but I've left them as "0" in this patch, and will address them individually in follow-on patches. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-02 11:05:04 -08:00
brian m. carlson	9461d27240	refs: convert each_reflog_ent_fn to struct object_id Make each_reflog_ent_fn take two struct object_id pointers instead of two pointers to unsigned char. Convert the various callbacks to use struct object_id as well. Also, rename fsck_handle_reflog_sha1 to fsck_handle_reflog_oid. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-02-22 10:12:15 -08:00
Junio C Hamano	8c98a68981	Merge branch 'vn/revision-shorthand-for-side-branch-log' "git log rev^..rev" is an often-used revision range specification to show what was done on a side branch merged at rev. This has gained a short-hand "rev^-1". In general "rev^-$n" is the same as "^rev^$n rev", i.e. what has happened on other branches while the history leading to nth parent was looking the other way. * vn/revision-shorthand-for-side-branch-log: revision: new rev^-n shorthand for rev^n..rev	2016-10-06 14:53:10 -07:00
Vegard Nossum	8779351dd7	revision: new rev^-n shorthand for rev^n..rev "git log rev^..rev" is commonly used to show all work done on and merged from a side branch. This patch introduces a shorthand "rev^-" for this and additionally allows "rev^-$n" to mean "reachable from rev, excluding what is reachable from the nth parent of rev". For example, for a two-parent merge, you can use rev^-2 to get the set of commits which were made to the main branch while the topic branch was prepared. Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-27 10:59:28 -07:00
brian m. carlson	99d1a9861a	cache: convert struct cache_entry to use struct object_id Convert struct cache_entry to use struct object_id by applying the following semantic patch and the object_id transforms from contrib, plus the actual change to the struct: @@ struct cache_entry E1; @@ - E1.sha1 + E1.oid.hash @@ struct cache_entry *E1; @@ - E1->sha1 + E1->oid.hash Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-07 12:59:42 -07:00
Junio C Hamano	dd610aeda6	Merge branch 'kw/patch-ids-optim' When "git rebase" tries to compare set of changes on the updated upstream and our own branch, it computes patch-id for all of these changes and attempts to find matches. This has been optimized by lazily computing the full patch-id (which is expensive) to be compared only for changes that touch the same set of paths. * kw/patch-ids-optim: rebase: avoid computing unnecessary patch IDs patch-ids: add flag to create the diff patch id using header only data patch-ids: replace the seen indicator with a commit pointer patch-ids: stop using a hand-rolled hashmap implementation	2016-08-12 09:47:39 -07:00
Junio C Hamano	b422d99658	Merge branch 'jc/grep-commandline-vs-configuration' "git -c grep.patternType=extended log --basic-regexp" misbehaved because the internal API to access the grep machinery was not designed well. * jc/grep-commandline-vs-configuration: grep: further simplify setting the pattern type	2016-08-04 14:39:18 -07:00
Kevin Willford	683f17ec44	patch-ids: replace the seen indicator with a commit pointer The cherry_pick_list was looping through the original side checking the seen indicator and setting the cherry_flag on the commit. If we save off the commit in the patch_id we can set the cherry_flag on the correct commit when running through the other side when a patch_id match is found. Signed-off-by: Kevin Willford <kcwillford@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-29 13:23:03 -07:00
Junio C Hamano	8465541e8c	grep: further simplify setting the pattern type When `c5c31d33` (grep: move pattern-type bits support to top-level grep.[ch], 2012-10-03) introduced grep_commit_pattern_type() helper function, the intention was to allow the users of grep API to having to fiddle only with .pattern_type_option (which can be set to "fixed", "basic", "extended", and "pcre"), and then immediately before compiling the pattern strings for use, call grep_commit_pattern_type() to have it prepare various bits in the grep_opt structure (like .fixed, .regflags, etc.). However, grep_set_pattern_type_option() helper function the grep API internally uses were left as an external function by mistake. This function shouldn't have been made callable by the users of the API. Later when the grep API was used in revision traversal machinery, the caller then mistakenly started calling the function around `34a4ae55` (log --grep: use the same helper to set -E/-F options as "git grep", 2012-10-03), instead of setting the .pattern_type_option field and letting the grep_commit_pattern_type() to take care of the details. This caused an unnecessary bug that made a configured grep.patternType take precedence over the command line options (e.g. --basic-regexp, --fixed-strings) in "git log" family of commands. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-25 09:16:18 -07:00
Junio C Hamano	369dc4081c	Merge branch 'mj/log-show-signature-conf' "git log" learns log.showSignature configuration variable, and a command line option "--no-show-signature" to countermand it. * mj/log-show-signature-conf: log: add log.showSignature configuration variable log: add "--no-show-signature" command line option t4202: refactor test	2016-07-11 10:31:08 -07:00
Mehul Jain	aa3799996c	log: add "--no-show-signature" command line option If an user creates an alias with "--show-signature" early in command line, e.g. [alias] logss = log --show-signature then there is no way to countermand it through command line. Teach git-log and related commands about "--no-show-signature" command line option. This will make "git logss --no-show-signature" run without showing GPG signature. Signed-off-by: Mehul Jain <mehul.jain2029@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-24 13:01:13 -07:00
Junio C Hamano	ed6e8038f9	pathspec: rename free_pathspec() to clear_pathspec() The function takes a pointer to a pathspec structure, and releases the resources held by it, but does not free() the structure itself. Such a function should be called "clear", not "free". Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-02 14:09:22 -07:00
Junio C Hamano	8429f2b42d	Merge branch 'bc/object-id' Move from unsigned char[20] to struct object_id continues. * bc/object-id: match-trees: convert several leaf functions to use struct object_id tree-walk: convert tree_entry_extract() to use struct object_id struct name_entry: use struct object_id instead of unsigned char sha1[20] match-trees: convert shift_tree() and shift_tree_by() to use object_id test-match-trees: convert to use struct object_id sha1-name: introduce a get_oid() function	2016-05-06 14:45:44 -07:00
brian m. carlson	7d924c9139	struct name_entry: use struct object_id instead of unsigned char sha1[20] Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-25 14:23:42 -07:00
Junio C Hamano	cafef3d7ad	Merge branch 'lt/pretty-expand-tabs' When "git log" shows the log message indented by 4-spaces, the remainder of a line after a HT does not align in the way the author originally intended. The command now expands tabs by default in such a case, and allows the users to override it with a new option, '--no-expand-tabs'. * lt/pretty-expand-tabs: pretty: test --expand-tabs pretty: allow tweaking tabwidth in --expand-tabs pretty: enable --expand-tabs by default for selected pretty formats pretty: expand tabs in indented logs to make things line up properly	2016-04-13 14:12:36 -07:00
Junio C Hamano	fe37a9c586	pretty: allow tweaking tabwidth in --expand-tabs When the local convention of the project is to use tab width that is not 8, it may make sense to allow "git log --expand-tabs=<n>" to tweak the output to match it. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-03-30 12:52:26 -07:00
Junio C Hamano	0893eec85f	pretty: enable --expand-tabs by default for selected pretty formats "git log --pretty={medium,full,fuller}" and "git log" by default prepend 4 spaces to the log message, so it makes sense to enable the new "expand-tabs" facility by default for these formats. Add --no-expand-tabs option to override the new default. The change alone breaks a test in t4201 that runs "git shortlog" on the output from "git log", and expects that the output from "git log" does not do such a tab expansion. Adjust the test to explicitly disable expand-tabs with --no-expand-tabs. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-03-30 12:39:29 -07:00
Linus Torvalds	7cc13c717b	pretty: expand tabs in indented logs to make things line up properly A commit log message sometimes tries to line things up using tabs, assuming fixed-width font with the standard 8-place tab settings. Viewing such a commit however does not work well in "git log", as we indent the lines by prefixing 4 spaces in front of them. This should all line up: Column 1 Column 2 -------- -------- A B ABCD EFGH SPACES Instead of Tabs Even with multi-byte UTF8 characters: Column 1 Column 2 -------- -------- Ä B åäö 100 A Møøse once bit my sister.. Tab-expand the lines in "git log --expand-tabs" output before prefixing 4 spaces. This is based on the patch by Linus Torvalds, but at this step, we require an explicit command line option to enable the behaviour. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-03-30 11:25:35 -07:00
Jeff King	2824e1841b	list-objects: pass full pathname to callbacks When we find a blob at "a/b/c", we currently pass this to our show_object_fn callbacks as two components: "a/b/" and "c". Callbacks which want the full value then call path_name(), which concatenates the two. But this is an inefficient interface; the path is a strbuf, and we could simply append "c" to it temporarily, then roll back the length, without creating a new copy. So we could improve this by teaching the callsites of path_name() this trick (and there are only 3). But we can also notice that no callback actually cares about the broken-down representation, and simply pass each callback the full path "a/b/c" as a string. The callback code becomes even simpler, then, as we do not have to worry about freeing an allocated buffer, nor rolling back our modification to the strbuf. This is theoretically less efficient, as some callbacks would not bother to format the final path component. But in practice this is not measurable. Since we use the same strbuf over and over, our work to grow it is amortized, and we really only pay to memcpy a few bytes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-03-16 10:41:04 -07:00
Jeff King	dc06dc8800	list-objects: drop name_path entirely In the previous commit, we left name_path as a thin wrapper around a strbuf. This patch drops it entirely. As a result, every show_object_fn callback needs to be adjusted. However, none of their code needs to be changed at all, because the only use was to pass it to path_name(), which now handles the bare strbuf. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-03-16 10:41:03 -07:00
Jeff King	f3badaed51	list-objects: convert name_path to a strbuf The "struct name_path" data is examined in only two places: we generate it in process_tree(), and we convert it to a single string in path_name(). Everyone else just passes it through to those functions. We can further note that process_tree() already keeps a single strbuf with the leading tree path, for use with tree_entry_interesting(). Instead of building a separate name_path linked list, let's just use the one we already build in "base". This reduces the amount of code (especially tricky code in path_name() which did not check for integer overflows caused by deep or large pathnames). It is also more efficient in some instances. Any time we were using tree_entry_interesting, we were building up the strbuf anyway, so this is an immediate and obvious win there. In cases where we were not, we trade off storing "pathname/" in a strbuf on the heap for each level of the path, instead of two pointers and an int on the stack (with one pointer into the tree object). On a 64-bit system, the latter is 20 bytes; so if path components are less than that on average, this has lower peak memory usage. In practice it probably doesn't matter either way; we are already holding in memory all of the tree objects leading up to each pathname, and for normal-depth pathnames, we are only talking about hundreds of bytes. This patch leaves "struct name_path" as a thin wrapper around the strbuf, to avoid disrupting callbacks. We should fix them, but leaving it out makes this diff easier to view. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-03-16 10:41:03 -07:00
Jeff King	8eee9f9277	show_object_with_name: simplify by using path_name() When "git rev-list" shows an object with its associated path name, it does so by walking the name_path linked list and printing each component (stopping at any embedded NULs or newlines). We'd like to eventually get rid of name_path entirely in favor of a single buffer, and dropping this custom printing code is part of that. As a first step, let's use path_name() to format the list into a single buffer, and print that. This is strictly less efficient than the original, but it's a temporary step in the refactoring; our end game will be to get the fully formatted name in the first place. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-03-16 10:41:03 -07:00
Junio C Hamano	11529ecec9	Merge branch 'jk/tighten-alloc' Update various codepaths to avoid manually-counted malloc(). * jk/tighten-alloc: (22 commits) ewah: convert to REALLOC_ARRAY, etc convert ewah/bitmap code to use xmalloc diff_populate_gitlink: use a strbuf transport_anonymize_url: use xstrfmt git-compat-util: drop mempcpy compat code sequencer: simplify memory allocation of get_message test-path-utils: fix normalize_path_copy output buffer size fetch-pack: simplify add_sought_entry fast-import: simplify allocation in start_packfile write_untracked_extension: use FLEX_ALLOC helper prepare_{git,shell}_cmd: use argv_array use st_add and st_mult for allocation size computation convert trivial cases to FLEX_ARRAY macros use xmallocz to avoid size arithmetic convert trivial cases to ALLOC_ARRAY convert manual allocations to argv_array argv-array: add detach function add helpers for allocating flex-array structs harden REALLOC_ARRAY and xcalloc against size_t overflow tree-diff: catch integer overflow in combine_diff_path allocation ...	2016-02-26 13:37:16 -08:00
Jeff King	50a6c8efa2	use st_add and st_mult for allocation size computation If our size computation overflows size_t, we may allocate a much smaller buffer than we expected and overflow it. It's probably impossible to trigger an overflow in most of these sites in practice, but it is easy enough convert their additions and multiplications into overflow-checking variants. This may be fixing real bugs, and it makes auditing the code easier. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-22 14:51:09 -08:00
Jeff King	de1e67d070	list-objects: pass full pathname to callbacks When we find a blob at "a/b/c", we currently pass this to our show_object_fn callbacks as two components: "a/b/" and "c". Callbacks which want the full value then call path_name(), which concatenates the two. But this is an inefficient interface; the path is a strbuf, and we could simply append "c" to it temporarily, then roll back the length, without creating a new copy. So we could improve this by teaching the callsites of path_name() this trick (and there are only 3). But we can also notice that no callback actually cares about the broken-down representation, and simply pass each callback the full path "a/b/c" as a string. The callback code becomes even simpler, then, as we do not have to worry about freeing an allocated buffer, nor rolling back our modification to the strbuf. This is theoretically less efficient, as some callbacks would not bother to format the final path component. But in practice this is not measurable. Since we use the same strbuf over and over, our work to grow it is amortized, and we really only pay to memcpy a few bytes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-12 12:51:17 -08:00
Jeff King	bd64516aca	list-objects: drop name_path entirely In the previous commit, we left name_path as a thin wrapper around a strbuf. This patch drops it entirely. As a result, every show_object_fn callback needs to be adjusted. However, none of their code needs to be changed at all, because the only use was to pass it to path_name(), which now handles the bare strbuf. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-12 12:51:15 -08:00
Jeff King	13528ab37c	list-objects: convert name_path to a strbuf The "struct name_path" data is examined in only two places: we generate it in process_tree(), and we convert it to a single string in path_name(). Everyone else just passes it through to those functions. We can further note that process_tree() already keeps a single strbuf with the leading tree path, for use with tree_entry_interesting(). Instead of building a separate name_path linked list, let's just use the one we already build in "base". This reduces the amount of code (especially tricky code in path_name() which did not check for integer overflows caused by deep or large pathnames). It is also more efficient in some instances. Any time we were using tree_entry_interesting, we were building up the strbuf anyway, so this is an immediate and obvious win there. In cases where we were not, we trade off storing "pathname/" in a strbuf on the heap for each level of the path, instead of two pointers and an int on the stack (with one pointer into the tree object). On a 64-bit system, the latter is 20 bytes; so if path components are less than that on average, this has lower peak memory usage. In practice it probably doesn't matter either way; we are already holding in memory all of the tree objects leading up to each pathname, and for normal-depth pathnames, we are only talking about hundreds of bytes. This patch leaves "struct name_path" as a thin wrapper around the strbuf, to avoid disrupting callbacks. We should fix them, but leaving it out makes this diff easier to view. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-12 12:51:10 -08:00
Jeff King	f9fb9d0e3c	show_object_with_name: simplify by using path_name() When "git rev-list" shows an object with its associated path name, it does so by walking the name_path linked list and printing each component (stopping at any embedded NULs or newlines). We'd like to eventually get rid of name_path entirely in favor of a single buffer, and dropping this custom printing code is part of that. As a first step, let's use path_name() to format the list into a single buffer, and print that. This is strictly less efficient than the original, but it's a temporary step in the refactoring; our end game will be to get the fully formatted name in the first place. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-12 12:51:08 -08:00
Junio C Hamano	02dab5d399	Merge branch 'nd/diff-with-path-params' into maint A few options of "git diff" did not work well when the command was run from a subdirectory. * nd/diff-with-path-params: diff: make -O and --output work in subdirectory diff-no-index: do not take a redundant prefix argument	2016-02-05 14:54:15 -08:00
Junio C Hamano	c167a96e68	Merge branch 'nd/diff-with-path-params' A few options of "git diff" did not work well when the command was run from a subdirectory. * nd/diff-with-path-params: diff: make -O and --output work in subdirectory diff-no-index: do not take a redundant prefix argument	2016-02-03 14:16:04 -08:00
Duy Nguyen	a97262c62f	diff: make -O and --output work in subdirectory Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-21 10:45:13 -08:00
Junio C Hamano	6e8d46f9d4	revision: read --stdin with strbuf_getline() Reading with getwholeline() and manually stripping the terminating '\n' would leave CR at the end of the line if the input comes from a DOS editor. Constrasting this with the other changes around "--stdin" in this series, one may realize that the way "log" family of commands read the paths with "--stdin" looks inconsistent and sloppy. It does not allow us to C-quote a textual input, neither does it accept records that are NUL-terminated. These are unfortunately way too late to fix X-<. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-15 10:33:28 -08:00
Junio C Hamano	e3073cf895	Merge branch 'jk/pending-keep-tag-name' into maint History traversal with "git log --source" that starts with an annotated tag failed to report the tag as "source", due to an old regression in the command line parser back in v2.2 days. * jk/pending-keep-tag-name: revision.c: propagate tag names from pending array	2016-01-04 14:03:08 -08:00
Junio C Hamano	71957339da	Merge branch 'jk/pending-keep-tag-name' History traversal with "git log --source" that starts with an annotated tag failed to report the tag as "source", due to an old regression in the command line parser back in v2.2 days. * jk/pending-keep-tag-name: revision.c: propagate tag names from pending array	2015-12-28 13:58:04 -08:00
Jeff King	728350b76a	revision.c: propagate tag names from pending array When we unwrap a tag to find its commit for a traversal, we do not propagate the "name" field of the tag in the pending array (i.e., the ref name the user gave us in the first place) to the commit (instead, we use an empty string). This means that "git log --source" will never show the tag-name for commits we reach through it. This was broken in `2073949` (traverse_commit_list: support pending blobs/trees with paths, 2014-10-15). That commit tried to be careful and avoid propagating the path information for a tag (which would be nonsensical) to trees and blobs. But it should not have cut off the "name" field, which should carry forward to children. Note that this does mean that the "name" field will carry forward to blobs and trees, too. Whereas prior to `2073949`, we always gave them an empty string. This is the right thing to do, but in practice no callers probably use it (since now we have an explicit separate "path" field, which was the point of `2073949`). We add tests here not only for the broken case, but also a basic sanity test of "log --source" in general, which did not have any coverage in the test suite. Reported-by: Raymundo <gypark@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-12-17 10:47:56 -08:00
Junio C Hamano	58e3dd21f6	Merge branch 'sn/null-pointer-arith-in-mark-tree-uninteresting' into maint mark_tree_uninteresting() has code to handle the case where it gets passed a NULL pointer in its 'tree' parameter, but the function had 'object = &tree->object' assignment before checking if tree is NULL. This gives a compiler an excuse to declare that tree will never be NULL and apply a wrong optimization. Avoid it. * sn/null-pointer-arith-in-mark-tree-uninteresting: revision.c: fix possible null pointer arithmetic	2015-12-11 11:14:38 -08:00
Junio C Hamano	0af22d6fff	Merge branch 'rs/pop-commit' into maint Code simplification. * rs/pop-commit: use pop_commit() for consuming the first entry of a struct commit_list	2015-12-11 11:14:13 -08:00
Junio C Hamano	782ca8c44e	Merge branch 'sn/null-pointer-arith-in-mark-tree-uninteresting' mark_tree_uninteresting() has code to handle the case where it gets passed a NULL pointer in its 'tree' parameter, but the function had 'object = &tree->object' assignment before checking if tree is NULL. This gives a compiler an excuse to declare that tree will never be NULL and apply a wrong optimization. Avoid it. * sn/null-pointer-arith-in-mark-tree-uninteresting: revision.c: fix possible null pointer arithmetic	2015-12-11 10:41:01 -08:00
Stefan Naewe	a2678df335	revision.c: fix possible null pointer arithmetic mark_tree_uninteresting() dereferences a tree pointer before checking if the pointer is valid. Fix that by doing the check first. Signed-off-by: Stefan Naewe <stefan.naewe@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-12-07 12:32:02 -08:00
brian m. carlson	ed1c9977cb	Remove get_object_hash. Convert all instances of get_object_hash to use an appropriate reference to the hash member of the oid member of struct object. This provides no functional change, as it is essentially a macro substitution. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
brian m. carlson	f2fd0760f6	Convert struct object to object_id struct object is one of the major data structures dealing with object IDs. Convert it to use struct object_id instead of an unsigned char array. Convert get_object_hash to refer to the new member as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
brian m. carlson	7999b2cf77	Add several uses of get_object_hash. Convert most instances where the sha1 member of struct object is dereferenced to use get_object_hash. Most instances that are passed to functions that have versions taking struct object_id, such as get_sha1_hex/get_oid_hex, or instances that can be trivially converted to use struct object_id instead, are not converted. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
Junio C Hamano	0692a6c22c	Merge branch 'rs/pop-commit' Code simplification. * rs/pop-commit: use pop_commit() for consuming the first entry of a struct commit_list	2015-10-30 13:07:03 -07:00
René Scharfe	e510ab8988	use pop_commit() for consuming the first entry of a struct commit_list Instead of open-coding the function pop_commit() just call it. This makes the intent clearer and reduces code size. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-10-26 14:06:46 -07:00
Jeff King	34fa79a6cd	prefer memcpy to strcpy When we already know the length of a string (e.g., because we just malloc'd to fit it), it's nicer to use memcpy than strcpy, as it makes it more obvious that we are not going to overflow the buffer (because the size we pass matches the size in the allocation). This also eliminates calls to strcpy, which make auditing the code base harder. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-10-05 11:08:05 -07:00
Junio C Hamano	5af77d1352	Merge branch 'jk/log-missing-default-HEAD' into maint "git init empty && git -C empty log" said "bad default revision 'HEAD'", which was found to be a bit confusing to new users. * jk/log-missing-default-HEAD: log: diagnose empty HEAD more clearly	2015-09-03 19:18:04 -07:00
Junio C Hamano	699a0f3748	Merge branch 'jk/log-missing-default-HEAD' "git init empty && git -C empty log" said "bad default revision 'HEAD'", which was found to be a bit confusing to new users. * jk/log-missing-default-HEAD: log: diagnose empty HEAD more clearly	2015-09-02 12:50:10 -07:00
Jeff King	ce11360467	log: diagnose empty HEAD more clearly If you init or clone an empty repository, the initial message from running "git log" is not very friendly: $ git init Initialized empty Git repository in /home/peff/foo/.git/ $ git log fatal: bad default revision 'HEAD' Let's detect this situation and write a more friendly message: $ git log fatal: your current branch 'master' does not have any commits yet We also detect the case that 'HEAD' points to a broken ref; this should be even less common, but is easy to see. Note that we do not diagnose all possible cases. We rely on resolve_ref, which means we do not get information about complex cases. E.g., "--default master" would use dwim_ref to find "refs/heads/master", but we notice only that "master" does not exist. Similarly, a complex sha1 expression like "--default HEAD^2" will not resolve as a ref. But that's OK. We fall back to a generic error message in those cases, and they are unlikely to be used anyway. Catching an empty or broken "HEAD" improves the common case, and the other cases are not regressed. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-08-31 09:34:20 -07:00
Junio C Hamano	71cc60070f	Merge branch 'ad/bisect-cleanup' Code and documentation clean-up to "git bisect". * ad/bisect-cleanup: bisect: don't mix option parsing and non-trivial code bisect: simplify the addition of new bisect terms bisect: replace hardcoded "bad\|good" by variables Documentation/bisect: revise overall content Documentation/bisect: move getting help section to the end bisect: correction of typo	2015-08-12 14:09:53 -07:00
Antoine Delaite	cb46d630ba	bisect: simplify the addition of new bisect terms We create a file BISECT_TERMS in the repository .git to be read during a bisection. There's no user-interface yet, but "git bisect" works if terms other than old/new or bad/good are set in .git/BISECT_TERMS. The fonctions to be changed if we add new terms are quite few. In git-bisect.sh: check_and_set_terms bisect_voc Co-authored-by: Louis Stuber <stuberl@ensimag.grenoble-inp.fr> Tweaked-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Antoine Delaite <antoine.delaite@ensimag.grenoble-inp.fr> Signed-off-by: Louis Stuber <stuberl@ensimag.grenoble-inp.fr> Signed-off-by: Valentin Duperray <Valentin.Duperray@ensimag.imag.fr> Signed-off-by: Franck Jonas <Franck.Jonas@ensimag.imag.fr> Signed-off-by: Lucien Kong <Lucien.Kong@ensimag.imag.fr> Signed-off-by: Thomas Nguy <Thomas.Nguy@ensimag.imag.fr> Signed-off-by: Huynh Khoi Nguyen Nguyen <Huynh-Khoi-Nguyen.Nguyen@ensimag.imag.fr> Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-08-03 11:42:41 -07:00
Junio C Hamano	d939af12bd	Merge branch 'jk/date-mode-format' Teach "git log" and friends a new "--date=format:..." option to format timestamps using system's strftime(3). * jk/date-mode-format: strbuf: make strbuf_addftime more robust introduce "format" date-mode convert "enum date_mode" into a struct show-branch: use DATE_RELATIVE instead of magic number	2015-08-03 11:01:27 -07:00
Junio C Hamano	fbdeabf1f0	Merge branch 'jk/still-interesting' Code clean-up. * jk/still-interesting: revision.c: remove unneeded check for NULL	2015-07-17 10:44:56 -07:00
Jeff King	a5481a6c94	convert "enum date_mode" into a struct In preparation for adding date modes that may carry extra information beyond the mode itself, this patch converts the date_mode enum into a struct. Most of the conversion is fairly straightforward; we pass the struct as a pointer and dereference the type field where necessary. Locations that declare a date_mode can use a "{}" constructor. However, the tricky case is where we use the enum labels as constants, like: show_date(t, tz, DATE_NORMAL); Ideally we could say: show_date(t, tz, &{ DATE_NORMAL }); but of course C does not allow that. Likewise, we cannot cast the constant to a struct, because we need to pass an actual address. Our options are basically: 1. Manually add a "struct date_mode d = { DATE_NORMAL }" definition to each caller, and pass "&d". This makes the callers uglier, because they sometimes do not even have their own scope (e.g., they are inside a switch statement). 2. Provide a pre-made global "date_normal" struct that can be passed by address. We'd also need "date_rfc2822", "date_iso8601", and so forth. But at least the ugliness is defined in one place. 3. Provide a wrapper that generates the correct struct on the fly. The big downside is that we end up pointing to a single global, which makes our wrapper non-reentrant. But show_date is already not reentrant, so it does not matter. This patch implements 3, along with a minor macro to keep the size of the callers sane. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-06-29 11:39:07 -07:00
Stefan Beller	ae40ebda9b	revision.c: remove unneeded check for NULL The function is called only from one place, which makes sure to have `interesting_cache` not NULL. Additionally the variable is a dereferenced a few lines before unconditionally, which would have resulted in a segmentation fault before hitting this check. Signed-off-by: Stefan Beller <sbeller@google.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-06-29 09:54:18 -07:00
Junio C Hamano	c53312583b	Merge branch 'jk/squelch-missing-link-warning-for-unreachable' into maint Recent "git prune" traverses young unreachable objects to safekeep old objects in the reachability chain from them, which sometimes caused error messages that are unnecessarily alarming. * jk/squelch-missing-link-warning-for-unreachable: suppress errors on missing UNINTERESTING links silence broken link warnings with revs->ignore_missing_links add quieter versions of parse_{tree,commit}	2015-06-25 11:02:10 -07:00
Junio C Hamano	43262d8d65	Merge branch 'jk/squelch-missing-link-warning-for-unreachable' Recent "git prune" traverses young unreachable objects to safekeep old objects in the reachability chain from them, which sometimes caused error messages that are unnecessarily alarming. * jk/squelch-missing-link-warning-for-unreachable: suppress errors on missing UNINTERESTING links silence broken link warnings with revs->ignore_missing_links add quieter versions of parse_{tree,commit}	2015-06-11 09:29:59 -07:00
Jeff King	ce4e7b2ac3	suppress errors on missing UNINTERESTING links When we are traversing commit parents along the UNINTERESTING side of a revision walk, we do not care if the parent turns out to be missing. That lets us limit traversals using unreachable and possibly incomplete sections of history. However, we do still print error messages about the missing commits; this patch suppresses the error, as well. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-06-01 09:29:51 -07:00
Jeff King	daf7d86783	silence broken link warnings with revs->ignore_missing_links We set revs->ignore_missing_links to instruct the revision-walking machinery that we know the history graph may be incomplete. For example, we use it when walking unreachable but recent objects; we want to add what we can, but it's OK if the history is incomplete. However, we still print error messages for the missing objects, which can be confusing. This is not an error, but just a normal situation when transitioning from a repository last pruned by an older git (which can leave broken segments of history) to a more recent one (where we try to preserve whole reachable segments). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-06-01 09:29:50 -07:00
Junio C Hamano	df08eb357d	Merge branch 'jk/still-interesting' into maint "git rev-list --objects $old --not --all" to see if everything that is reachable from $old is already connected to the existing refs was very inefficient. * jk/still-interesting: limit_list: avoid quadratic behavior from still_interesting	2015-05-26 13:49:26 -07:00
Michael Haggerty	a89caf4bd4	handle_one_reflog(): rewrite to take an object_id argument Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-25 12:19:35 -07:00
Michael Haggerty	a217dcbd1e	handle_one_ref(): rewrite to take an object_id argument Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-25 12:19:28 -07:00
Michael Haggerty	2b2a5be394	each_ref_fn: change to take an object_id parameter Change typedef each_ref_fn to take a "const struct object_id oid" parameter instead of "const unsigned char sha1". To aid this transition, implement an adapter that can be used to wrap old-style functions matching the old typedef, which is now called "each_ref_sha1_fn"), and make such functions callable via the new interface. This requires the old function and its cb_data to be wrapped in a "struct each_ref_fn_sha1_adapter", and that object to be used as the cb_data for an adapter function, each_ref_fn_adapter(). This is an enormous diff, but most of it consists of simple, mechanical changes to the sites that call any of the "for_each_ref" family of functions. Subsequent to this change, the call sites can be rewritten one by one to use the new interface. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-25 12:19:27 -07:00
Junio C Hamano	84e55dcb34	Merge branch 'jk/still-interesting' "git rev-list --objects $old --not --all" to see if everything that is reachable from $old is already connected to the existing refs was very inefficient. * jk/still-interesting: limit_list: avoid quadratic behavior from still_interesting	2015-05-11 14:23:43 -07:00
Jeff King	b6e8a3b540	limit_list: avoid quadratic behavior from still_interesting When we are limiting a rev-list traversal due to UNINTERESTING refs, we have to walk down the tips (both interesting and uninteresting) to find where they intersect. We keep a queue of commits to examine, pop commits off the queue one by one, and potentially add their parents. The size of the queue will naturally fluctuate based on the "width" of the history graph; i.e., the number of simultaneous lines of development. But for the most part it will stay in the same ballpark as the initial number of tips we fed, shrinking over time (as we hit common ancestors of the tips). So roughly speaking, if we start with `N` tips, we'll spend much of the time with a queue around `N` items. For each UNINTERESTING commit we pop, we call still_interesting to check whether marking its parents as UNINTERESTING has made the whole queue uninteresting (in which case we can quit early). Because the queue is stored as a linked list, this is `O(N)`, where `N` is the number of items in the queue. So processing a queue with `N` commits marked UNINTERESTING (and one or more interesting commits) will take `O(N^2)`. If you feed a lot of positive tips, this isn't a problem. They aren't UNINTERESTING, so they don't incur the still_interesting check. It also isn't a problem if you traverse from an interesting tip to some UNINTERESTING bases. We order the queue by recency, so the interesting commits stay at the front of the queue as we walk down them. The linear check can exit early as soon as it sees one interesting commit left in the queue. But if you want to know whether an older commit is reachable from a set of newer tips, we end up processing in the opposite direction: from the UNINTERESTING ones down to the interesting one. This may happen when we call: git rev-list $commits --not --all in check_everything_connected after a fetch. If we fetched something much older than most of our refs, and if we have a large number of refs, the traversal cost is dominated by the quadratic behavior. These commands simulate the connectivity check of such a fetch, when you have `$n` distinct refs in the receiver: # positive ref is 100,000 commits deep git rev-list --all \| head -100000 \| tail -1 >input # huge number of more recent negative refs git rev-list --all \| head -$n \| sed s/^/^/ >>input time git rev-list --stdin <input Here are timings for various `n` on the linux.git repository. The `n=1` case provides a baseline for just walking the commits, which lets us see the still_interesting overhead. The times marked with `+` subtract that baseline to show just the extra time growth due to the large number of refs. The `x` numbers show the slowdown of the adjusted time versus the prior trial. n \| before \| after -------------------------------------------------------- 1 \| 0.991s \| 0.848s 10000 \| 1.120s (+0.129s) \| 0.885s (+0.037s) 20000 \| 1.451s (+0.460s, 3.5x) \| 0.923s (+0.075s, 2.0x) 40000 \| 2.731s (+1.740s, 3.8x) \| 0.994s (+0.146s, 1.9x) 80000 \| 8.235s (+7.244s, 4.2x) \| 1.123s (+0.275s, 1.9x) Each trial doubles `n`, so you can see the quadratic (`4x`) behavior before this patch. Afterwards, we have a roughly linear relationship. The implementation is fairly straightforward. Whenever we do the linear search, we cache the interesting commit we find, and next time check it before doing another linear search. If that commit is removed from the list or becomes UNINTERESTING itself, then we fall back to the linear search. This is very similar to the trick used by `fce87ae` (Fix quadratic performance in rewrite_one., 2008-07-12). I considered and rejected several possible alternatives: 1. Keep a count of UNINTERESTING commits in the queue. This requires managing the count not only when removing an item from the queue, but also when marking an item as UNINTERESTING. That requires touching the other functions which mark commits, and would require knowing quickly which commits are in the queue (lookup in the queue is linear, so we would need an auxiliary structure or to also maintain an IN_QUEUE flag in each commit object). 2. Keep a separate list of interesting commits. Drop items from it when they are dropped from the queue, or if they become UNINTERESTING. This again suffers from extra complexity to maintain the list, not to mention CPU and memory. 3. Use a better data structure for the queue. This is something that could help the fix in `fce87ae`, because we order the queue by recency, and it is about inserting quickly in recency order. So a normal priority queue would help there. But here, we cannot disturb the order of the queue, which makes things harder. We really do need an auxiliary index to track the flag we care about, which is basically option (2) above. The "cache" trick is simple, and the numbers above show that it works well in practice. This is because the length of time it takes to find an interesting commit is proportional to the length of time it will remain cached (i.e., if we have to walk a long way to find it, it also means we have to pop a lot of elements in the queue until we get rid of it and have to find another interesting commit). The worst case is still quadratic, though. We could have `N` uninteresting commits at the front of the queue, followed by `N` interesting commits, where commit `i` has parent `i+N`. When we pop commit `i`, we will notice that the parent of the next commit, `i+1+N` is still interesting and cache it. But then handling commit `i+1`, we will mark its parent `i+1+N` uninteresting, and immediately invalidate our cache. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-04-17 15:22:05 -07:00
Junio C Hamano	dbd04eba01	Merge branch 'dj/log-graph-with-no-walk' "git log --graph --no-walk A B..." is a otcnflicting request that asks nonsense; no-walk tells us show discrete points in the history, while graph asks to draw connections between these discrete points. Forbid the combination. * dj/log-graph-with-no-walk: revision: forbid combining --graph and --no-walk	2015-03-25 12:54:22 -07:00

1 2 3 4 5 ...

758 Commits