git-commit-vandalism

Author	SHA1	Message	Date
Junio C Hamano	6b526ced6f	Merge branch 'bc/object-id' Conversion from uchar[20] to struct object_id continues. * bc/object-id: (53 commits) object: convert parse_object* to take struct object_id tree: convert parse_tree_indirect to struct object_id sequencer: convert do_recursive_merge to struct object_id diff-lib: convert do_diff_cache to struct object_id builtin/ls-tree: convert to struct object_id merge: convert checkout_fast_forward to struct object_id sequencer: convert fast_forward_to to struct object_id builtin/ls-files: convert overlay_tree_on_cache to object_id builtin/read-tree: convert to struct object_id sha1_name: convert internals of peel_onion to object_id upload-pack: convert remaining parse_object callers to object_id revision: convert remaining parse_object callers to object_id revision: rename add_pending_sha1 to add_pending_oid http-push: convert process_ls_object and descendants to object_id refs/files-backend: convert many internals to struct object_id refs: convert struct ref_update to use struct object_id ref-filter: convert some static functions to struct object_id Convert struct ref_array_item to struct object_id Convert the verify_pack callback to struct object_id Convert lookup_tag to struct object_id ...	2017-05-29 12:34:43 +09:00
Ævar Arnfjörð Bjarmason	7531a2dd87	log: add -P as a synonym for --perl-regexp Add a short -P option as a synonym for the longer --perl-regexp, for consistency with the options the corresponding grep invocations accept. This was intentionally omitted in commit `727b6fc3ed` ("log --grep: accept --basic-regexp and --perl-regexp", 2012-10-03) for unspecified future use. Make it consistent with "grep" rather than to keep it open for future use, and to avoid the confusion of -P meaning different things for grep & log, as is the case with the -G option. As noted in the aforementioned commit the --basic-regexp option can't have a corresponding -G argument, as the log command already uses that for -G<regex>. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-26 12:59:05 +09:00
Jeff King	18f1ad7639	handle_revision_arg: record paths for pending objects If the revision parser sees an argument like tree:path, we parse it down to the correct blob (or tree), but throw away the "path" portion. Let's ask get_sha1_with_context() to record it, and pass it along in the pending array. This will let programs like git-diff which rely on the revision-parser show more accurate paths. Note that the implementation is a little tricky; we have to make sure we free oc.path in all code paths. For handle_dotdot(), we can piggy-back on the existing cleanup-wrapper pattern. The real work happens in handle_dotdot_1(), but the handle_dotdot() wrapper makes sure that the path is freed no matter how we exit the function (and for that reason we make sure that the object_context struct is zero'd, so if we fail to even get to the get_sha1_with_context() call, we just end up calling free(NULL)). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-24 10:59:27 +09:00
Jeff King	101dd4de16	handle_revision_arg: record modes for "a..b" endpoints The "a..b" revision syntax was designed to handle commits, so it doesn't bother to record any mode we find while traversing a "tree:path" endpoint. These days "git diff" can diff blobs using either "a:path..b:path" (with dots) or "a:path b:path" (without), but the two behave inconsistently, as the with-dots version fails to notice the mode. Let's teach the dot-dot range parser to record modes; it doesn't cost us anything, and it makes this case work. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-24 10:59:27 +09:00
Jeff King	62faad5aa5	handle_revision_arg: add handle_dotdot() helper The handle_revision_arg function is rather long, and a big chunk of it is handling the range operators. Let's pull that out to a separate helper. While we're doing so, we can clean up a few of the rough edges that made the flow hard to follow: - instead of manually restoring *dotdot (that we overwrote with a NUL), do the real work in a sub-helper, which makes it clear that the munge/restore lines are a matched pair - eliminate a goto which wasn't actually used for control flow, but only to avoid duplicating a few lines (instead, those lines are pushed into another helper function) - use early returns instead of deep nesting - consistently name all variables for the left-hand side of the range as "a" (rather than "this" or "from") and the right-hand side as "b" (rather than "next", or using the unadorned "sha1" or "flags" from the main function). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-24 10:59:27 +09:00
Jeff King	d89797feff	handle_revision_arg: hoist ".." check out of range parsing Since `003c84f6d` (specifying ranges: we did not mean to make ".." an empty set, 2011-05-02), we treat the argument ".." specially. We detect it by noticing that both sides of the range are empty, and that this is a non-symmetric two-dot range. While correct, this makes the code overly complicated. We can just detect ".." up front before we try to do further parsing. This avoids having to de-munge the NUL from dotdot, and lets us eliminate an extra const array (which we needed only to do direct pointer comparisons). It also removes the one code path from the range-parsing conditional that requires us to return -1. That will make it simpler to pull the dotdot parsing out into its own function. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-24 10:59:27 +09:00
Jeff King	f632dedd8d	handle_revision_arg: stop using "dotdot" as a generic pointer The handle_revision_arg() function has a "dotdot" variable that it uses to find a ".." or "..." in the argument. If we don't find one, we look for other marks, like "^!". But we just keep re-using the "dotdot" variable, which is confusing. Let's introduce a separate "mark" variable that can be used for these other marks. They still reuse the same variable, but at least the name is no longer actively misleading. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-24 10:59:27 +09:00
Jeff King	1d6c93817b	handle_revision_arg: simplify commit reference lookups The "dotdot" range parser avoids calling lookup_commit_reference() if we are directly fed two commits. But its casts are unnecessarily complex; that function will just return a commit we pass into it. Just calling the function all the time is much simpler, and doesn't do any significant extra work (the object is already parsed, and deref_tag() on a non-tag is a noop; we do incur one extra lookup_object() call, but that's fairly trivial). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-24 10:59:27 +09:00
Jeff King	ed79b2cf03	handle_revision_arg: reset "dotdot" consistently When we are parsing a range like "a..b", we write a temporary NUL over the first ".", so that we can access the names "a" and "b" as C strings. But our restoration of the original "." is done at inconsistent times, which can lead to confusing results. For most calls, we restore the "." after we resolve the names, but before we call verify_non_filename(). This means that when we later call add_pending_object(), the name for the left-hand "a" has been re-expanded to "a..b". You can see this with: git log --source a...b where "b" will be correctly marked with "b", but "a" will be marked with "a...b". Likewise with "a..b" (though you need to use --boundary to even see "a" at all in that case). To top off the confusion, when the REVARG_CANNOT_BE_FILENAME flag is set, we skip the non-filename check, and leave the NUL in place. That means we do report the correct name for "a" in the pending array. But some code paths try to show the whole "a...b" name in error messages, and these erroneously show only "a" instead of "a...b". E.g.: $ git cherry-pick HEAD:foo...HEAD:foo error: object d95f3ad14dee633a758d2e331151e950dd13e4ed is a blob, not a commit error: object d95f3ad14dee633a758d2e331151e950dd13e4ed is a blob, not a commit fatal: Invalid symmetric difference expression HEAD:foo (That last message should be "HEAD:foo...HEAD:foo"; I used cherry-pick because it passes the CANNOT_BE_FILENAME flag). As an interesting side note, cherry-pick actually looks at and re-resolves the arguments from the pending->name fields. So it would have been visibly broken by the first bug, but the effect was canceled out by the second one. This patch makes the whole function consistent by re-writing the NUL immediately after calling verify_non_filename(), and then restoring the "." as appropriate in some error-printing and early-return code paths. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-24 10:59:03 +09:00
Junio C Hamano	ca7b2ab07d	Merge branch 'bc/object-id' * bc/object-id: (53 commits) object: convert parse_object* to take struct object_id tree: convert parse_tree_indirect to struct object_id sequencer: convert do_recursive_merge to struct object_id diff-lib: convert do_diff_cache to struct object_id builtin/ls-tree: convert to struct object_id merge: convert checkout_fast_forward to struct object_id sequencer: convert fast_forward_to to struct object_id builtin/ls-files: convert overlay_tree_on_cache to object_id builtin/read-tree: convert to struct object_id sha1_name: convert internals of peel_onion to object_id upload-pack: convert remaining parse_object callers to object_id revision: convert remaining parse_object callers to object_id revision: rename add_pending_sha1 to add_pending_oid http-push: convert process_ls_object and descendants to object_id refs/files-backend: convert many internals to struct object_id refs: convert struct ref_update to use struct object_id ref-filter: convert some static functions to struct object_id Convert struct ref_array_item to struct object_id Convert the verify_pack callback to struct object_id Convert lookup_tag to struct object_id ...	2017-05-23 14:29:19 +09:00
Ævar Arnfjörð Bjarmason	9e3cbc59d5	log: make --regexp-ignore-case work with --perl-regexp Make the --regexp-ignore-case option work with --perl-regexp. This never worked, and there was no test for this. Fix the bug and add a test. When PCRE support was added in commit `63e7e9d8b6` ("git-grep: Learn PCRE", 2011-05-09) compile_pcre_regexp() would only check opt->ignore_case, but when the --perl-regexp option was added in commit `727b6fc3ed` ("log --grep: accept --basic-regexp and --perl-regexp", 2012-10-03) the code didn't set the opt->ignore_case. Change the test suite to test for -i and --invert-regexp with basic/extended/perl patterns in addition to fixed, which was the only patternType that was tested for before in combination with those options. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-21 08:25:37 +09:00
Jeff King	a3ba6bf10a	revision.c: ignore broken tags with ignore_missing_links When peeling a tag for prepare_revision_walk(), we do not respect the ignore_missing_links flag. This can lead to a bogus error when pack-objects walks the possibly-broken unreachable-but-recent part of the object graph. The other link-following all happens via traverse_commit_list(), which explains why this case was missed. And our tests covered only broken links from commits. Let's be more comprehensive and cover broken tree entries (which do work) and tags (which shows off this bug). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-20 18:32:58 +09:00
brian m. carlson	c251c83df2	object: convert parse_object* to take struct object_id Make parse_object, parse_object_or_die, and parse_object_buffer take a pointer to struct object_id. Remove the temporary variables inserted earlier, since they are no longer necessary. Transform all of the callers using the following semantic patch: @@ expression E1; @@ - parse_object(E1.hash) + parse_object(&E1) @@ expression E1; @@ - parse_object(E1->hash) + parse_object(E1) @@ expression E1, E2; @@ - parse_object_or_die(E1.hash, E2) + parse_object_or_die(&E1, E2) @@ expression E1, E2; @@ - parse_object_or_die(E1->hash, E2) + parse_object_or_die(E1, E2) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1.hash, E2, E3, E4, E5) + parse_object_buffer(&E1, E2, E3, E4, E5) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1->hash, E2, E3, E4, E5) + parse_object_buffer(E1, E2, E3, E4, E5) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:58 +09:00
brian m. carlson	654b9a905c	revision: convert remaining parse_object callers to object_id Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:58 +09:00
brian m. carlson	a58a1b01ff	revision: rename add_pending_sha1 to add_pending_oid Rename this function and convert it to take a pointer to struct object_id. This is a prerequisite for converting get_reference, which is needed to convert parse_object. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:58 +09:00
brian m. carlson	740ee055c6	Convert lookup_tree to struct object_id Convert the lookup_tree function to take a pointer to struct object_id. The commit was created with manual changes to tree.c, tree.h, and object.c, plus the following semantic patch: @@ @@ - lookup_tree(EMPTY_TREE_SHA1_BIN) + lookup_tree(&empty_tree_oid) @@ expression E1; @@ - lookup_tree(E1.hash) + lookup_tree(&E1) @@ expression E1; @@ - lookup_tree(E1->hash) + lookup_tree(E1) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
brian m. carlson	3aca1fc6c9	Convert lookup_blob to struct object_id Convert lookup_blob to take a pointer to struct object_id. The commit was created with manual changes to blob.c and blob.h, plus the following semantic patch: @@ expression E1; @@ - lookup_blob(E1.hash) + lookup_blob(&E1) @@ expression E1; @@ - lookup_blob(E1->hash) + lookup_blob(E1) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
brian m. carlson	bc83266abe	Convert lookup_commit* to struct object_id Convert lookup_commit, lookup_commit_or_die, lookup_commit_reference, and lookup_commit_reference_gently to take struct object_id arguments. Introduce a temporary in parse_object buffer in order to convert this function. This is required since in order to convert parse_object and parse_object_buffer, lookup_commit_reference_gently and lookup_commit_or_die would need to be converted. Not introducing a temporary would therefore require that lookup_commit_or_die take a struct object_id , but lookup_commit would take unsigned char , leaving a confusing and hard-to-use interface. parse_object_buffer will lose this temporary in a later patch. This commit was created with manual changes to commit.c, commit.h, and object.c, plus the following semantic patch: @@ expression E1, E2; @@ - lookup_commit_reference_gently(E1.hash, E2) + lookup_commit_reference_gently(&E1, E2) @@ expression E1, E2; @@ - lookup_commit_reference_gently(E1->hash, E2) + lookup_commit_reference_gently(E1, E2) @@ expression E1; @@ - lookup_commit_reference(E1.hash) + lookup_commit_reference(&E1) @@ expression E1; @@ - lookup_commit_reference(E1->hash) + lookup_commit_reference(E1) @@ expression E1; @@ - lookup_commit(E1.hash) + lookup_commit(&E1) @@ expression E1; @@ - lookup_commit(E1->hash) + lookup_commit(E1) @@ expression E1, E2; @@ - lookup_commit_or_die(E1.hash, E2) + lookup_commit_or_die(&E1, E2) @@ expression E1, E2; @@ - lookup_commit_or_die(E1->hash, E2) + lookup_commit_or_die(E1, E2) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
brian m. carlson	68ab61dd09	revision: convert prepare_show_merge to struct object_id This is a caller of lookup_commit_or_die, which we will convert later on. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
brian m. carlson	e0a9280404	Convert struct cache_tree to use struct object_id Convert the sha1 member of struct cache_tree to struct object_id by changing the definition and applying the following semantic patch, plus the standard object_id transforms: @@ struct cache_tree E1; @@ - E1.sha1 + E1.oid.hash @@ struct cache_tree *E1; @@ - E1->sha1 + E1->oid.hash Fix up one reference to active_cache_tree which was not automatically caught by Coccinelle. These changes are prerequisites for converting parse_object. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-02 10:46:41 +09:00
Johannes Schindelin	dddbad728c	timestamp_t: a new data type for timestamps Git's source code assumes that unsigned long is at least as precise as time_t. Which is incorrect, and causes a lot of problems, in particular where unsigned long is only 32-bit (notably on Windows, even in 64-bit versions). So let's just use a more appropriate data type instead. In preparation for this, we introduce the new `timestamp_t` data type. By necessity, this is a very, very large patch, as it has to replace all timestamps' data type in one go. As we will use a data type that is not necessarily identical to `time_t`, we need to be very careful to use `time_t` whenever we interact with the system functions, and `timestamp_t` everywhere else. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-27 13:07:39 +09:00
Junio C Hamano	e1fae93019	Merge branch 'bc/object-id' "uchar [40]" to "struct object_id" conversion continues. * bc/object-id: wt-status: convert to struct object_id builtin/merge-base: convert to struct object_id Convert object iteration callbacks to struct object_id sha1_file: introduce an nth_packed_object_oid function refs: simplify parsing of reflog entries refs: convert each_reflog_ent_fn to struct object_id reflog-walk: convert struct reflog_info to struct object_id builtin/replace: convert to struct object_id Convert remaining callers of resolve_refdup to object_id builtin/merge: convert to struct object_id builtin/clone: convert to struct object_id builtin/branch: convert to struct object_id builtin/grep: convert to struct object_id builtin/fmt-merge-message: convert to struct object_id builtin/fast-export: convert to struct object_id builtin/describe: convert to struct object_id builtin/diff-tree: convert to struct object_id builtin/commit: convert to struct object_id hex: introduce parse_oid_hex	2017-03-17 13:50:25 -07:00
Jeff King	0e9f62dab9	interpret_branch_name: allow callers to restrict expansions The interpret_branch_name() function converts names like @{-1} and @{upstream} into branch names. The expanded ref names are not fully qualified, and may be outside of the refs/heads/ namespace (e.g., "@" expands to "HEAD", and "@{upstream}" is likely to be in "refs/remotes/"). This is OK for callers like dwim_ref() which are primarily interested in resolving the resulting name, no matter where it is. But callers like "git branch" treat the result as a branch name in refs/heads/. When we expand to a ref outside that namespace, the results are very confusing (e.g., "git branch @" tries to create refs/heads/HEAD, which is nonsense). Callers can't know from the returned string how the expansion happened (e.g., did the user really ask for a branch named "HEAD", or did we do a bogus expansion?). One fix would be to return some out-parameters describing the types of expansion that occurred. This has the benefit that the caller can generate precise error messages ("I understood @{upstream} to mean origin/master, but that is a remote tracking branch, so you cannot create it as a local name"). However, out-parameters make the function interface somewhat cumbersome. Instead, let's do the opposite: let the caller tell us which elements to expand. That's easier to pass in, and none of the callers give more precise error messages than "@{upstream} isn't a valid branch name" anyway (which should be sufficient). The strbuf_branchname() function needs a similar parameter, as most of the callers access interpret_branch_name() through it. We can break the callers down into two groups: 1. Callers that are happy with any kind of ref in the result. We pass "0" here, so they continue to work without restrictions. This includes merge_name(), the reflog handling in add_pending_object_with_path(), and substitute_branch_name(). This last is what powers dwim_ref(). 2. Callers that have funny corner cases (mostly in git-branch and git-checkout). These need to make use of the new parameter, but I've left them as "0" in this patch, and will address them individually in follow-on patches. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-02 11:05:04 -08:00
brian m. carlson	9461d27240	refs: convert each_reflog_ent_fn to struct object_id Make each_reflog_ent_fn take two struct object_id pointers instead of two pointers to unsigned char. Convert the various callbacks to use struct object_id as well. Also, rename fsck_handle_reflog_sha1 to fsck_handle_reflog_oid. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-02-22 10:12:15 -08:00
Junio C Hamano	8c98a68981	Merge branch 'vn/revision-shorthand-for-side-branch-log' "git log rev^..rev" is an often-used revision range specification to show what was done on a side branch merged at rev. This has gained a short-hand "rev^-1". In general "rev^-$n" is the same as "^rev^$n rev", i.e. what has happened on other branches while the history leading to nth parent was looking the other way. * vn/revision-shorthand-for-side-branch-log: revision: new rev^-n shorthand for rev^n..rev	2016-10-06 14:53:10 -07:00
Vegard Nossum	8779351dd7	revision: new rev^-n shorthand for rev^n..rev "git log rev^..rev" is commonly used to show all work done on and merged from a side branch. This patch introduces a shorthand "rev^-" for this and additionally allows "rev^-$n" to mean "reachable from rev, excluding what is reachable from the nth parent of rev". For example, for a two-parent merge, you can use rev^-2 to get the set of commits which were made to the main branch while the topic branch was prepared. Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-27 10:59:28 -07:00
brian m. carlson	99d1a9861a	cache: convert struct cache_entry to use struct object_id Convert struct cache_entry to use struct object_id by applying the following semantic patch and the object_id transforms from contrib, plus the actual change to the struct: @@ struct cache_entry E1; @@ - E1.sha1 + E1.oid.hash @@ struct cache_entry *E1; @@ - E1->sha1 + E1->oid.hash Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-07 12:59:42 -07:00
Junio C Hamano	dd610aeda6	Merge branch 'kw/patch-ids-optim' When "git rebase" tries to compare set of changes on the updated upstream and our own branch, it computes patch-id for all of these changes and attempts to find matches. This has been optimized by lazily computing the full patch-id (which is expensive) to be compared only for changes that touch the same set of paths. * kw/patch-ids-optim: rebase: avoid computing unnecessary patch IDs patch-ids: add flag to create the diff patch id using header only data patch-ids: replace the seen indicator with a commit pointer patch-ids: stop using a hand-rolled hashmap implementation	2016-08-12 09:47:39 -07:00
Junio C Hamano	b422d99658	Merge branch 'jc/grep-commandline-vs-configuration' "git -c grep.patternType=extended log --basic-regexp" misbehaved because the internal API to access the grep machinery was not designed well. * jc/grep-commandline-vs-configuration: grep: further simplify setting the pattern type	2016-08-04 14:39:18 -07:00
Kevin Willford	683f17ec44	patch-ids: replace the seen indicator with a commit pointer The cherry_pick_list was looping through the original side checking the seen indicator and setting the cherry_flag on the commit. If we save off the commit in the patch_id we can set the cherry_flag on the correct commit when running through the other side when a patch_id match is found. Signed-off-by: Kevin Willford <kcwillford@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-29 13:23:03 -07:00
Junio C Hamano	8465541e8c	grep: further simplify setting the pattern type When `c5c31d33` (grep: move pattern-type bits support to top-level grep.[ch], 2012-10-03) introduced grep_commit_pattern_type() helper function, the intention was to allow the users of grep API to having to fiddle only with .pattern_type_option (which can be set to "fixed", "basic", "extended", and "pcre"), and then immediately before compiling the pattern strings for use, call grep_commit_pattern_type() to have it prepare various bits in the grep_opt structure (like .fixed, .regflags, etc.). However, grep_set_pattern_type_option() helper function the grep API internally uses were left as an external function by mistake. This function shouldn't have been made callable by the users of the API. Later when the grep API was used in revision traversal machinery, the caller then mistakenly started calling the function around `34a4ae55` (log --grep: use the same helper to set -E/-F options as "git grep", 2012-10-03), instead of setting the .pattern_type_option field and letting the grep_commit_pattern_type() to take care of the details. This caused an unnecessary bug that made a configured grep.patternType take precedence over the command line options (e.g. --basic-regexp, --fixed-strings) in "git log" family of commands. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-25 09:16:18 -07:00
Junio C Hamano	369dc4081c	Merge branch 'mj/log-show-signature-conf' "git log" learns log.showSignature configuration variable, and a command line option "--no-show-signature" to countermand it. * mj/log-show-signature-conf: log: add log.showSignature configuration variable log: add "--no-show-signature" command line option t4202: refactor test	2016-07-11 10:31:08 -07:00
Mehul Jain	aa3799996c	log: add "--no-show-signature" command line option If an user creates an alias with "--show-signature" early in command line, e.g. [alias] logss = log --show-signature then there is no way to countermand it through command line. Teach git-log and related commands about "--no-show-signature" command line option. This will make "git logss --no-show-signature" run without showing GPG signature. Signed-off-by: Mehul Jain <mehul.jain2029@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-24 13:01:13 -07:00
Junio C Hamano	ed6e8038f9	pathspec: rename free_pathspec() to clear_pathspec() The function takes a pointer to a pathspec structure, and releases the resources held by it, but does not free() the structure itself. Such a function should be called "clear", not "free". Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-02 14:09:22 -07:00
Junio C Hamano	8429f2b42d	Merge branch 'bc/object-id' Move from unsigned char[20] to struct object_id continues. * bc/object-id: match-trees: convert several leaf functions to use struct object_id tree-walk: convert tree_entry_extract() to use struct object_id struct name_entry: use struct object_id instead of unsigned char sha1[20] match-trees: convert shift_tree() and shift_tree_by() to use object_id test-match-trees: convert to use struct object_id sha1-name: introduce a get_oid() function	2016-05-06 14:45:44 -07:00
brian m. carlson	7d924c9139	struct name_entry: use struct object_id instead of unsigned char sha1[20] Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-25 14:23:42 -07:00
Junio C Hamano	cafef3d7ad	Merge branch 'lt/pretty-expand-tabs' When "git log" shows the log message indented by 4-spaces, the remainder of a line after a HT does not align in the way the author originally intended. The command now expands tabs by default in such a case, and allows the users to override it with a new option, '--no-expand-tabs'. * lt/pretty-expand-tabs: pretty: test --expand-tabs pretty: allow tweaking tabwidth in --expand-tabs pretty: enable --expand-tabs by default for selected pretty formats pretty: expand tabs in indented logs to make things line up properly	2016-04-13 14:12:36 -07:00
Junio C Hamano	fe37a9c586	pretty: allow tweaking tabwidth in --expand-tabs When the local convention of the project is to use tab width that is not 8, it may make sense to allow "git log --expand-tabs=<n>" to tweak the output to match it. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-03-30 12:52:26 -07:00
Junio C Hamano	0893eec85f	pretty: enable --expand-tabs by default for selected pretty formats "git log --pretty={medium,full,fuller}" and "git log" by default prepend 4 spaces to the log message, so it makes sense to enable the new "expand-tabs" facility by default for these formats. Add --no-expand-tabs option to override the new default. The change alone breaks a test in t4201 that runs "git shortlog" on the output from "git log", and expects that the output from "git log" does not do such a tab expansion. Adjust the test to explicitly disable expand-tabs with --no-expand-tabs. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-03-30 12:39:29 -07:00
Linus Torvalds	7cc13c717b	pretty: expand tabs in indented logs to make things line up properly A commit log message sometimes tries to line things up using tabs, assuming fixed-width font with the standard 8-place tab settings. Viewing such a commit however does not work well in "git log", as we indent the lines by prefixing 4 spaces in front of them. This should all line up: Column 1 Column 2 -------- -------- A B ABCD EFGH SPACES Instead of Tabs Even with multi-byte UTF8 characters: Column 1 Column 2 -------- -------- Ä B åäö 100 A Møøse once bit my sister.. Tab-expand the lines in "git log --expand-tabs" output before prefixing 4 spaces. This is based on the patch by Linus Torvalds, but at this step, we require an explicit command line option to enable the behaviour. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-03-30 11:25:35 -07:00
Jeff King	2824e1841b	list-objects: pass full pathname to callbacks When we find a blob at "a/b/c", we currently pass this to our show_object_fn callbacks as two components: "a/b/" and "c". Callbacks which want the full value then call path_name(), which concatenates the two. But this is an inefficient interface; the path is a strbuf, and we could simply append "c" to it temporarily, then roll back the length, without creating a new copy. So we could improve this by teaching the callsites of path_name() this trick (and there are only 3). But we can also notice that no callback actually cares about the broken-down representation, and simply pass each callback the full path "a/b/c" as a string. The callback code becomes even simpler, then, as we do not have to worry about freeing an allocated buffer, nor rolling back our modification to the strbuf. This is theoretically less efficient, as some callbacks would not bother to format the final path component. But in practice this is not measurable. Since we use the same strbuf over and over, our work to grow it is amortized, and we really only pay to memcpy a few bytes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-03-16 10:41:04 -07:00
Jeff King	dc06dc8800	list-objects: drop name_path entirely In the previous commit, we left name_path as a thin wrapper around a strbuf. This patch drops it entirely. As a result, every show_object_fn callback needs to be adjusted. However, none of their code needs to be changed at all, because the only use was to pass it to path_name(), which now handles the bare strbuf. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-03-16 10:41:03 -07:00
Jeff King	f3badaed51	list-objects: convert name_path to a strbuf The "struct name_path" data is examined in only two places: we generate it in process_tree(), and we convert it to a single string in path_name(). Everyone else just passes it through to those functions. We can further note that process_tree() already keeps a single strbuf with the leading tree path, for use with tree_entry_interesting(). Instead of building a separate name_path linked list, let's just use the one we already build in "base". This reduces the amount of code (especially tricky code in path_name() which did not check for integer overflows caused by deep or large pathnames). It is also more efficient in some instances. Any time we were using tree_entry_interesting, we were building up the strbuf anyway, so this is an immediate and obvious win there. In cases where we were not, we trade off storing "pathname/" in a strbuf on the heap for each level of the path, instead of two pointers and an int on the stack (with one pointer into the tree object). On a 64-bit system, the latter is 20 bytes; so if path components are less than that on average, this has lower peak memory usage. In practice it probably doesn't matter either way; we are already holding in memory all of the tree objects leading up to each pathname, and for normal-depth pathnames, we are only talking about hundreds of bytes. This patch leaves "struct name_path" as a thin wrapper around the strbuf, to avoid disrupting callbacks. We should fix them, but leaving it out makes this diff easier to view. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-03-16 10:41:03 -07:00
Jeff King	8eee9f9277	show_object_with_name: simplify by using path_name() When "git rev-list" shows an object with its associated path name, it does so by walking the name_path linked list and printing each component (stopping at any embedded NULs or newlines). We'd like to eventually get rid of name_path entirely in favor of a single buffer, and dropping this custom printing code is part of that. As a first step, let's use path_name() to format the list into a single buffer, and print that. This is strictly less efficient than the original, but it's a temporary step in the refactoring; our end game will be to get the fully formatted name in the first place. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-03-16 10:41:03 -07:00
Junio C Hamano	11529ecec9	Merge branch 'jk/tighten-alloc' Update various codepaths to avoid manually-counted malloc(). * jk/tighten-alloc: (22 commits) ewah: convert to REALLOC_ARRAY, etc convert ewah/bitmap code to use xmalloc diff_populate_gitlink: use a strbuf transport_anonymize_url: use xstrfmt git-compat-util: drop mempcpy compat code sequencer: simplify memory allocation of get_message test-path-utils: fix normalize_path_copy output buffer size fetch-pack: simplify add_sought_entry fast-import: simplify allocation in start_packfile write_untracked_extension: use FLEX_ALLOC helper prepare_{git,shell}_cmd: use argv_array use st_add and st_mult for allocation size computation convert trivial cases to FLEX_ARRAY macros use xmallocz to avoid size arithmetic convert trivial cases to ALLOC_ARRAY convert manual allocations to argv_array argv-array: add detach function add helpers for allocating flex-array structs harden REALLOC_ARRAY and xcalloc against size_t overflow tree-diff: catch integer overflow in combine_diff_path allocation ...	2016-02-26 13:37:16 -08:00
Jeff King	50a6c8efa2	use st_add and st_mult for allocation size computation If our size computation overflows size_t, we may allocate a much smaller buffer than we expected and overflow it. It's probably impossible to trigger an overflow in most of these sites in practice, but it is easy enough convert their additions and multiplications into overflow-checking variants. This may be fixing real bugs, and it makes auditing the code easier. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-22 14:51:09 -08:00
Jeff King	de1e67d070	list-objects: pass full pathname to callbacks When we find a blob at "a/b/c", we currently pass this to our show_object_fn callbacks as two components: "a/b/" and "c". Callbacks which want the full value then call path_name(), which concatenates the two. But this is an inefficient interface; the path is a strbuf, and we could simply append "c" to it temporarily, then roll back the length, without creating a new copy. So we could improve this by teaching the callsites of path_name() this trick (and there are only 3). But we can also notice that no callback actually cares about the broken-down representation, and simply pass each callback the full path "a/b/c" as a string. The callback code becomes even simpler, then, as we do not have to worry about freeing an allocated buffer, nor rolling back our modification to the strbuf. This is theoretically less efficient, as some callbacks would not bother to format the final path component. But in practice this is not measurable. Since we use the same strbuf over and over, our work to grow it is amortized, and we really only pay to memcpy a few bytes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-12 12:51:17 -08:00
Jeff King	bd64516aca	list-objects: drop name_path entirely In the previous commit, we left name_path as a thin wrapper around a strbuf. This patch drops it entirely. As a result, every show_object_fn callback needs to be adjusted. However, none of their code needs to be changed at all, because the only use was to pass it to path_name(), which now handles the bare strbuf. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-12 12:51:15 -08:00
Jeff King	13528ab37c	list-objects: convert name_path to a strbuf The "struct name_path" data is examined in only two places: we generate it in process_tree(), and we convert it to a single string in path_name(). Everyone else just passes it through to those functions. We can further note that process_tree() already keeps a single strbuf with the leading tree path, for use with tree_entry_interesting(). Instead of building a separate name_path linked list, let's just use the one we already build in "base". This reduces the amount of code (especially tricky code in path_name() which did not check for integer overflows caused by deep or large pathnames). It is also more efficient in some instances. Any time we were using tree_entry_interesting, we were building up the strbuf anyway, so this is an immediate and obvious win there. In cases where we were not, we trade off storing "pathname/" in a strbuf on the heap for each level of the path, instead of two pointers and an int on the stack (with one pointer into the tree object). On a 64-bit system, the latter is 20 bytes; so if path components are less than that on average, this has lower peak memory usage. In practice it probably doesn't matter either way; we are already holding in memory all of the tree objects leading up to each pathname, and for normal-depth pathnames, we are only talking about hundreds of bytes. This patch leaves "struct name_path" as a thin wrapper around the strbuf, to avoid disrupting callbacks. We should fix them, but leaving it out makes this diff easier to view. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-12 12:51:10 -08:00
Jeff King	f9fb9d0e3c	show_object_with_name: simplify by using path_name() When "git rev-list" shows an object with its associated path name, it does so by walking the name_path linked list and printing each component (stopping at any embedded NULs or newlines). We'd like to eventually get rid of name_path entirely in favor of a single buffer, and dropping this custom printing code is part of that. As a first step, let's use path_name() to format the list into a single buffer, and print that. This is strictly less efficient than the original, but it's a temporary step in the refactoring; our end game will be to get the fully formatted name in the first place. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-12 12:51:08 -08:00
Junio C Hamano	02dab5d399	Merge branch 'nd/diff-with-path-params' into maint A few options of "git diff" did not work well when the command was run from a subdirectory. * nd/diff-with-path-params: diff: make -O and --output work in subdirectory diff-no-index: do not take a redundant prefix argument	2016-02-05 14:54:15 -08:00
Junio C Hamano	c167a96e68	Merge branch 'nd/diff-with-path-params' A few options of "git diff" did not work well when the command was run from a subdirectory. * nd/diff-with-path-params: diff: make -O and --output work in subdirectory diff-no-index: do not take a redundant prefix argument	2016-02-03 14:16:04 -08:00
Duy Nguyen	a97262c62f	diff: make -O and --output work in subdirectory Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-21 10:45:13 -08:00
Junio C Hamano	6e8d46f9d4	revision: read --stdin with strbuf_getline() Reading with getwholeline() and manually stripping the terminating '\n' would leave CR at the end of the line if the input comes from a DOS editor. Constrasting this with the other changes around "--stdin" in this series, one may realize that the way "log" family of commands read the paths with "--stdin" looks inconsistent and sloppy. It does not allow us to C-quote a textual input, neither does it accept records that are NUL-terminated. These are unfortunately way too late to fix X-<. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-15 10:33:28 -08:00
Junio C Hamano	e3073cf895	Merge branch 'jk/pending-keep-tag-name' into maint History traversal with "git log --source" that starts with an annotated tag failed to report the tag as "source", due to an old regression in the command line parser back in v2.2 days. * jk/pending-keep-tag-name: revision.c: propagate tag names from pending array	2016-01-04 14:03:08 -08:00
Junio C Hamano	71957339da	Merge branch 'jk/pending-keep-tag-name' History traversal with "git log --source" that starts with an annotated tag failed to report the tag as "source", due to an old regression in the command line parser back in v2.2 days. * jk/pending-keep-tag-name: revision.c: propagate tag names from pending array	2015-12-28 13:58:04 -08:00
Jeff King	728350b76a	revision.c: propagate tag names from pending array When we unwrap a tag to find its commit for a traversal, we do not propagate the "name" field of the tag in the pending array (i.e., the ref name the user gave us in the first place) to the commit (instead, we use an empty string). This means that "git log --source" will never show the tag-name for commits we reach through it. This was broken in `2073949` (traverse_commit_list: support pending blobs/trees with paths, 2014-10-15). That commit tried to be careful and avoid propagating the path information for a tag (which would be nonsensical) to trees and blobs. But it should not have cut off the "name" field, which should carry forward to children. Note that this does mean that the "name" field will carry forward to blobs and trees, too. Whereas prior to `2073949`, we always gave them an empty string. This is the right thing to do, but in practice no callers probably use it (since now we have an explicit separate "path" field, which was the point of `2073949`). We add tests here not only for the broken case, but also a basic sanity test of "log --source" in general, which did not have any coverage in the test suite. Reported-by: Raymundo <gypark@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-12-17 10:47:56 -08:00
Junio C Hamano	58e3dd21f6	Merge branch 'sn/null-pointer-arith-in-mark-tree-uninteresting' into maint mark_tree_uninteresting() has code to handle the case where it gets passed a NULL pointer in its 'tree' parameter, but the function had 'object = &tree->object' assignment before checking if tree is NULL. This gives a compiler an excuse to declare that tree will never be NULL and apply a wrong optimization. Avoid it. * sn/null-pointer-arith-in-mark-tree-uninteresting: revision.c: fix possible null pointer arithmetic	2015-12-11 11:14:38 -08:00
Junio C Hamano	0af22d6fff	Merge branch 'rs/pop-commit' into maint Code simplification. * rs/pop-commit: use pop_commit() for consuming the first entry of a struct commit_list	2015-12-11 11:14:13 -08:00
Junio C Hamano	782ca8c44e	Merge branch 'sn/null-pointer-arith-in-mark-tree-uninteresting' mark_tree_uninteresting() has code to handle the case where it gets passed a NULL pointer in its 'tree' parameter, but the function had 'object = &tree->object' assignment before checking if tree is NULL. This gives a compiler an excuse to declare that tree will never be NULL and apply a wrong optimization. Avoid it. * sn/null-pointer-arith-in-mark-tree-uninteresting: revision.c: fix possible null pointer arithmetic	2015-12-11 10:41:01 -08:00
Stefan Naewe	a2678df335	revision.c: fix possible null pointer arithmetic mark_tree_uninteresting() dereferences a tree pointer before checking if the pointer is valid. Fix that by doing the check first. Signed-off-by: Stefan Naewe <stefan.naewe@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-12-07 12:32:02 -08:00
brian m. carlson	ed1c9977cb	Remove get_object_hash. Convert all instances of get_object_hash to use an appropriate reference to the hash member of the oid member of struct object. This provides no functional change, as it is essentially a macro substitution. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
brian m. carlson	f2fd0760f6	Convert struct object to object_id struct object is one of the major data structures dealing with object IDs. Convert it to use struct object_id instead of an unsigned char array. Convert get_object_hash to refer to the new member as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
brian m. carlson	7999b2cf77	Add several uses of get_object_hash. Convert most instances where the sha1 member of struct object is dereferenced to use get_object_hash. Most instances that are passed to functions that have versions taking struct object_id, such as get_sha1_hex/get_oid_hex, or instances that can be trivially converted to use struct object_id instead, are not converted. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
Junio C Hamano	0692a6c22c	Merge branch 'rs/pop-commit' Code simplification. * rs/pop-commit: use pop_commit() for consuming the first entry of a struct commit_list	2015-10-30 13:07:03 -07:00
René Scharfe	e510ab8988	use pop_commit() for consuming the first entry of a struct commit_list Instead of open-coding the function pop_commit() just call it. This makes the intent clearer and reduces code size. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-10-26 14:06:46 -07:00
Jeff King	34fa79a6cd	prefer memcpy to strcpy When we already know the length of a string (e.g., because we just malloc'd to fit it), it's nicer to use memcpy than strcpy, as it makes it more obvious that we are not going to overflow the buffer (because the size we pass matches the size in the allocation). This also eliminates calls to strcpy, which make auditing the code base harder. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-10-05 11:08:05 -07:00
Junio C Hamano	5af77d1352	Merge branch 'jk/log-missing-default-HEAD' into maint "git init empty && git -C empty log" said "bad default revision 'HEAD'", which was found to be a bit confusing to new users. * jk/log-missing-default-HEAD: log: diagnose empty HEAD more clearly	2015-09-03 19:18:04 -07:00
Junio C Hamano	699a0f3748	Merge branch 'jk/log-missing-default-HEAD' "git init empty && git -C empty log" said "bad default revision 'HEAD'", which was found to be a bit confusing to new users. * jk/log-missing-default-HEAD: log: diagnose empty HEAD more clearly	2015-09-02 12:50:10 -07:00
Jeff King	ce11360467	log: diagnose empty HEAD more clearly If you init or clone an empty repository, the initial message from running "git log" is not very friendly: $ git init Initialized empty Git repository in /home/peff/foo/.git/ $ git log fatal: bad default revision 'HEAD' Let's detect this situation and write a more friendly message: $ git log fatal: your current branch 'master' does not have any commits yet We also detect the case that 'HEAD' points to a broken ref; this should be even less common, but is easy to see. Note that we do not diagnose all possible cases. We rely on resolve_ref, which means we do not get information about complex cases. E.g., "--default master" would use dwim_ref to find "refs/heads/master", but we notice only that "master" does not exist. Similarly, a complex sha1 expression like "--default HEAD^2" will not resolve as a ref. But that's OK. We fall back to a generic error message in those cases, and they are unlikely to be used anyway. Catching an empty or broken "HEAD" improves the common case, and the other cases are not regressed. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-08-31 09:34:20 -07:00
Junio C Hamano	71cc60070f	Merge branch 'ad/bisect-cleanup' Code and documentation clean-up to "git bisect". * ad/bisect-cleanup: bisect: don't mix option parsing and non-trivial code bisect: simplify the addition of new bisect terms bisect: replace hardcoded "bad\|good" by variables Documentation/bisect: revise overall content Documentation/bisect: move getting help section to the end bisect: correction of typo	2015-08-12 14:09:53 -07:00
Antoine Delaite	cb46d630ba	bisect: simplify the addition of new bisect terms We create a file BISECT_TERMS in the repository .git to be read during a bisection. There's no user-interface yet, but "git bisect" works if terms other than old/new or bad/good are set in .git/BISECT_TERMS. The fonctions to be changed if we add new terms are quite few. In git-bisect.sh: check_and_set_terms bisect_voc Co-authored-by: Louis Stuber <stuberl@ensimag.grenoble-inp.fr> Tweaked-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Antoine Delaite <antoine.delaite@ensimag.grenoble-inp.fr> Signed-off-by: Louis Stuber <stuberl@ensimag.grenoble-inp.fr> Signed-off-by: Valentin Duperray <Valentin.Duperray@ensimag.imag.fr> Signed-off-by: Franck Jonas <Franck.Jonas@ensimag.imag.fr> Signed-off-by: Lucien Kong <Lucien.Kong@ensimag.imag.fr> Signed-off-by: Thomas Nguy <Thomas.Nguy@ensimag.imag.fr> Signed-off-by: Huynh Khoi Nguyen Nguyen <Huynh-Khoi-Nguyen.Nguyen@ensimag.imag.fr> Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-08-03 11:42:41 -07:00
Junio C Hamano	d939af12bd	Merge branch 'jk/date-mode-format' Teach "git log" and friends a new "--date=format:..." option to format timestamps using system's strftime(3). * jk/date-mode-format: strbuf: make strbuf_addftime more robust introduce "format" date-mode convert "enum date_mode" into a struct show-branch: use DATE_RELATIVE instead of magic number	2015-08-03 11:01:27 -07:00
Junio C Hamano	fbdeabf1f0	Merge branch 'jk/still-interesting' Code clean-up. * jk/still-interesting: revision.c: remove unneeded check for NULL	2015-07-17 10:44:56 -07:00
Jeff King	a5481a6c94	convert "enum date_mode" into a struct In preparation for adding date modes that may carry extra information beyond the mode itself, this patch converts the date_mode enum into a struct. Most of the conversion is fairly straightforward; we pass the struct as a pointer and dereference the type field where necessary. Locations that declare a date_mode can use a "{}" constructor. However, the tricky case is where we use the enum labels as constants, like: show_date(t, tz, DATE_NORMAL); Ideally we could say: show_date(t, tz, &{ DATE_NORMAL }); but of course C does not allow that. Likewise, we cannot cast the constant to a struct, because we need to pass an actual address. Our options are basically: 1. Manually add a "struct date_mode d = { DATE_NORMAL }" definition to each caller, and pass "&d". This makes the callers uglier, because they sometimes do not even have their own scope (e.g., they are inside a switch statement). 2. Provide a pre-made global "date_normal" struct that can be passed by address. We'd also need "date_rfc2822", "date_iso8601", and so forth. But at least the ugliness is defined in one place. 3. Provide a wrapper that generates the correct struct on the fly. The big downside is that we end up pointing to a single global, which makes our wrapper non-reentrant. But show_date is already not reentrant, so it does not matter. This patch implements 3, along with a minor macro to keep the size of the callers sane. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-06-29 11:39:07 -07:00
Stefan Beller	ae40ebda9b	revision.c: remove unneeded check for NULL The function is called only from one place, which makes sure to have `interesting_cache` not NULL. Additionally the variable is a dereferenced a few lines before unconditionally, which would have resulted in a segmentation fault before hitting this check. Signed-off-by: Stefan Beller <sbeller@google.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-06-29 09:54:18 -07:00
Junio C Hamano	c53312583b	Merge branch 'jk/squelch-missing-link-warning-for-unreachable' into maint Recent "git prune" traverses young unreachable objects to safekeep old objects in the reachability chain from them, which sometimes caused error messages that are unnecessarily alarming. * jk/squelch-missing-link-warning-for-unreachable: suppress errors on missing UNINTERESTING links silence broken link warnings with revs->ignore_missing_links add quieter versions of parse_{tree,commit}	2015-06-25 11:02:10 -07:00
Junio C Hamano	43262d8d65	Merge branch 'jk/squelch-missing-link-warning-for-unreachable' Recent "git prune" traverses young unreachable objects to safekeep old objects in the reachability chain from them, which sometimes caused error messages that are unnecessarily alarming. * jk/squelch-missing-link-warning-for-unreachable: suppress errors on missing UNINTERESTING links silence broken link warnings with revs->ignore_missing_links add quieter versions of parse_{tree,commit}	2015-06-11 09:29:59 -07:00
Jeff King	ce4e7b2ac3	suppress errors on missing UNINTERESTING links When we are traversing commit parents along the UNINTERESTING side of a revision walk, we do not care if the parent turns out to be missing. That lets us limit traversals using unreachable and possibly incomplete sections of history. However, we do still print error messages about the missing commits; this patch suppresses the error, as well. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-06-01 09:29:51 -07:00
Jeff King	daf7d86783	silence broken link warnings with revs->ignore_missing_links We set revs->ignore_missing_links to instruct the revision-walking machinery that we know the history graph may be incomplete. For example, we use it when walking unreachable but recent objects; we want to add what we can, but it's OK if the history is incomplete. However, we still print error messages for the missing objects, which can be confusing. This is not an error, but just a normal situation when transitioning from a repository last pruned by an older git (which can leave broken segments of history) to a more recent one (where we try to preserve whole reachable segments). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-06-01 09:29:50 -07:00
Junio C Hamano	df08eb357d	Merge branch 'jk/still-interesting' into maint "git rev-list --objects $old --not --all" to see if everything that is reachable from $old is already connected to the existing refs was very inefficient. * jk/still-interesting: limit_list: avoid quadratic behavior from still_interesting	2015-05-26 13:49:26 -07:00
Michael Haggerty	a89caf4bd4	handle_one_reflog(): rewrite to take an object_id argument Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-25 12:19:35 -07:00
Michael Haggerty	a217dcbd1e	handle_one_ref(): rewrite to take an object_id argument Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-25 12:19:28 -07:00
Michael Haggerty	2b2a5be394	each_ref_fn: change to take an object_id parameter Change typedef each_ref_fn to take a "const struct object_id oid" parameter instead of "const unsigned char sha1". To aid this transition, implement an adapter that can be used to wrap old-style functions matching the old typedef, which is now called "each_ref_sha1_fn"), and make such functions callable via the new interface. This requires the old function and its cb_data to be wrapped in a "struct each_ref_fn_sha1_adapter", and that object to be used as the cb_data for an adapter function, each_ref_fn_adapter(). This is an enormous diff, but most of it consists of simple, mechanical changes to the sites that call any of the "for_each_ref" family of functions. Subsequent to this change, the call sites can be rewritten one by one to use the new interface. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-25 12:19:27 -07:00
Junio C Hamano	84e55dcb34	Merge branch 'jk/still-interesting' "git rev-list --objects $old --not --all" to see if everything that is reachable from $old is already connected to the existing refs was very inefficient. * jk/still-interesting: limit_list: avoid quadratic behavior from still_interesting	2015-05-11 14:23:43 -07:00
Jeff King	b6e8a3b540	limit_list: avoid quadratic behavior from still_interesting When we are limiting a rev-list traversal due to UNINTERESTING refs, we have to walk down the tips (both interesting and uninteresting) to find where they intersect. We keep a queue of commits to examine, pop commits off the queue one by one, and potentially add their parents. The size of the queue will naturally fluctuate based on the "width" of the history graph; i.e., the number of simultaneous lines of development. But for the most part it will stay in the same ballpark as the initial number of tips we fed, shrinking over time (as we hit common ancestors of the tips). So roughly speaking, if we start with `N` tips, we'll spend much of the time with a queue around `N` items. For each UNINTERESTING commit we pop, we call still_interesting to check whether marking its parents as UNINTERESTING has made the whole queue uninteresting (in which case we can quit early). Because the queue is stored as a linked list, this is `O(N)`, where `N` is the number of items in the queue. So processing a queue with `N` commits marked UNINTERESTING (and one or more interesting commits) will take `O(N^2)`. If you feed a lot of positive tips, this isn't a problem. They aren't UNINTERESTING, so they don't incur the still_interesting check. It also isn't a problem if you traverse from an interesting tip to some UNINTERESTING bases. We order the queue by recency, so the interesting commits stay at the front of the queue as we walk down them. The linear check can exit early as soon as it sees one interesting commit left in the queue. But if you want to know whether an older commit is reachable from a set of newer tips, we end up processing in the opposite direction: from the UNINTERESTING ones down to the interesting one. This may happen when we call: git rev-list $commits --not --all in check_everything_connected after a fetch. If we fetched something much older than most of our refs, and if we have a large number of refs, the traversal cost is dominated by the quadratic behavior. These commands simulate the connectivity check of such a fetch, when you have `$n` distinct refs in the receiver: # positive ref is 100,000 commits deep git rev-list --all \| head -100000 \| tail -1 >input # huge number of more recent negative refs git rev-list --all \| head -$n \| sed s/^/^/ >>input time git rev-list --stdin <input Here are timings for various `n` on the linux.git repository. The `n=1` case provides a baseline for just walking the commits, which lets us see the still_interesting overhead. The times marked with `+` subtract that baseline to show just the extra time growth due to the large number of refs. The `x` numbers show the slowdown of the adjusted time versus the prior trial. n \| before \| after -------------------------------------------------------- 1 \| 0.991s \| 0.848s 10000 \| 1.120s (+0.129s) \| 0.885s (+0.037s) 20000 \| 1.451s (+0.460s, 3.5x) \| 0.923s (+0.075s, 2.0x) 40000 \| 2.731s (+1.740s, 3.8x) \| 0.994s (+0.146s, 1.9x) 80000 \| 8.235s (+7.244s, 4.2x) \| 1.123s (+0.275s, 1.9x) Each trial doubles `n`, so you can see the quadratic (`4x`) behavior before this patch. Afterwards, we have a roughly linear relationship. The implementation is fairly straightforward. Whenever we do the linear search, we cache the interesting commit we find, and next time check it before doing another linear search. If that commit is removed from the list or becomes UNINTERESTING itself, then we fall back to the linear search. This is very similar to the trick used by `fce87ae` (Fix quadratic performance in rewrite_one., 2008-07-12). I considered and rejected several possible alternatives: 1. Keep a count of UNINTERESTING commits in the queue. This requires managing the count not only when removing an item from the queue, but also when marking an item as UNINTERESTING. That requires touching the other functions which mark commits, and would require knowing quickly which commits are in the queue (lookup in the queue is linear, so we would need an auxiliary structure or to also maintain an IN_QUEUE flag in each commit object). 2. Keep a separate list of interesting commits. Drop items from it when they are dropped from the queue, or if they become UNINTERESTING. This again suffers from extra complexity to maintain the list, not to mention CPU and memory. 3. Use a better data structure for the queue. This is something that could help the fix in `fce87ae`, because we order the queue by recency, and it is about inserting quickly in recency order. So a normal priority queue would help there. But here, we cannot disturb the order of the queue, which makes things harder. We really do need an auxiliary index to track the flag we care about, which is basically option (2) above. The "cache" trick is simple, and the numbers above show that it works well in practice. This is because the length of time it takes to find an interesting commit is proportional to the length of time it will remain cached (i.e., if we have to walk a long way to find it, it also means we have to pop a lot of elements in the queue until we get rid of it and have to find another interesting commit). The worst case is still quadratic, though. We could have `N` uninteresting commits at the front of the queue, followed by `N` interesting commits, where commit `i` has parent `i+N`. When we pop commit `i`, we will notice that the parent of the next commit, `i+1+N` is still interesting and cache it. But then handling commit `i+1`, we will mark its parent `i+1+N` uninteresting, and immediately invalidate our cache. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-04-17 15:22:05 -07:00
Junio C Hamano	dbd04eba01	Merge branch 'dj/log-graph-with-no-walk' "git log --graph --no-walk A B..." is a otcnflicting request that asks nonsense; no-walk tells us show discrete points in the history, while graph asks to draw connections between these discrete points. Forbid the combination. * dj/log-graph-with-no-walk: revision: forbid combining --graph and --no-walk	2015-03-25 12:54:22 -07:00
Junio C Hamano	257b204f25	Merge branch 'kd/rev-list-bisect-first-parent' "git rev-list --bisect --first-parent" does not work (yet) and can even cause SEGV; forbid it. "git log --bisect --first-parent" would not be useful until "git bisect --first-parent" materializes, so it is also forbidden for now. * kd/rev-list-bisect-first-parent: rev-list: refuse --first-parent combined with --bisect	2015-03-25 12:54:21 -07:00
Kevin Daudt	f88851c637	rev-list: refuse --first-parent combined with --bisect rev-list --bisect is used by git bisect, but never together with --first-parent. Because rev-list --bisect together with --first-parent is not handled currently, and even leads to segfaults, refuse to use both options together. Because this is not supported, it makes little sense to use git log --bisect --first parent either, because refs/heads/bad is not limited to the first parent chain. Helped-by: Junio C. Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Kevin Daudt <me@ikke.info> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-19 15:26:21 -07:00
Dongcan Jiang	695985f483	revision: forbid combining --graph and --no-walk Because "--graph" is about connected history while --no-walk is about discrete points, it does not make sense to allow these two options at the same time. [1] This change makes a few calls to "show --graph" fail in t4052, but asking to show one commit with graph is a nonsensical thing to do. Thus, tests on "show --graph" in t4052 have been removed [2,3]. Same tests on "show" without --graph option have already been tested in 4052. 3 testcases have been added to test this patch. [1]: http://article.gmane.org/gmane.comp.version-control.git/216083 [2]: http://article.gmane.org/gmane.comp.version-control.git/264950 [3]: http://article.gmane.org/gmane.comp.version-control.git/265107 Helped-By: Eric Sunshine <sunshine@sunshineco.com> Helped-By: René Scharfe <l.s.r@web.de> Helped-By: Junio C Hamano <gitster@pobox.com> Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-19 11:07:51 -07:00
Junio C Hamano	c985aaf879	Merge branch 'jc/unused-symbols' Mark file-local symbols as "static", and drop functions that nobody uses. * jc/unused-symbols: shallow.c: make check_shallow_file_for_update() static remote.c: make clear_cas_option() static urlmatch.c: make match_urls() static revision.c: make save_parents() and free_saved_parents() static line-log.c: make line_log_data_init() static pack-bitmap.c: make pack_bitmap_filename() static prompt.c: remove git_getpass() nobody uses http.c: make finish_active_slot() and handle_curl_result() static	2015-02-11 13:44:07 -08:00
Junio C Hamano	1ba6e860b9	Merge branch 'cj/log-invert-grep' "git log --invert-grep --grep=WIP" will show only commits that do not have the string "WIP" in their messages. * cj/log-invert-grep: log: teach --invert-grep option	2015-02-11 13:42:39 -08:00
Junio C Hamano	0131c49096	revision.c: make save_parents() and free_saved_parents() static No external callers exist. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-01-15 11:05:47 -08:00
Christoph Junghans	22dfa8a23d	log: teach --invert-grep option "git log --grep=<string>" shows only commits with messages that match the given string, but sometimes it is useful to be able to show only commits that do not have certain messages (e.g. "show me ones that are not FIXUP commits"). Originally, we had the invert-grep flag in grep_opt, but because "git grep --invert-grep" does not make sense except in conjunction with "--files-with-matches", which is already covered by "--files-without-matches", it was moved it to revisions structure. To have the flag there expresses the function to the feature better. When the newly inserted two tests run, the history would have commits with messages "initial", "second", "third", "fourth", "fifth", "sixth" and "Second", committed in this order. The commits that does not match either "th" or "Sec" is "second" and "initial". For the case insensitive case only "initial" matches. Signed-off-by: Christoph Junghans <ottxor@gentoo.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-01-13 10:20:32 -08:00
Junio C Hamano	832258da96	Merge branch 'bc/fetch-thin-less-aggressive-in-normal-repository' Earlier we made "rev-list --object-edge" more aggressively list the objects at the edge commits, in order to reduce number of objects fetched into a shallow repository, but the change affected cases other than "fetching into a shallow repository" and made it unusably slow (e.g. fetching into a normal repository should not have to suffer the overhead from extra processing). Limit it to a more specific case by introducing --objects-edge-aggressive, a new option to rev-list. * bc/fetch-thin-less-aggressive-in-normal-repository: pack-objects: use --objects-edge-aggressive for shallow repos rev-list: add an option to mark fewer edges as uninteresting Documentation: add missing article in rev-list-options.txt	2015-01-12 11:38:57 -08:00
Junio C Hamano	098501527f	Merge branch 'jc/merge-bases' The get_merge_bases() API was easy to misuse by careless copy&paste coders, leaving object flags tainted in the commits that needed to be traversed. jc/merge-bases: get_merge_bases(): always clean-up object flags bisect: clean flags after checking merge bases	2015-01-07 12:55:05 -08:00
brian m. carlson	1684c1b219	rev-list: add an option to mark fewer edges as uninteresting In commit `fbd4a70` (list-objects: mark more commits as edges in mark_edges_uninteresting - 2013-08-16), we marked an increasing number of edges uninteresting. This change, and the subsequent change to make this conditional on --objects-edge, are used by --thin to make much smaller packs for shallow clones. Unfortunately, they cause a significant performance regression when pushing non-shallow clones with lots of refs (23.322 seconds vs. 4.785 seconds with 22400 refs). Add an option to git rev-list, --objects-edge-aggressive, that preserves this more aggressive behavior, while leaving --objects-edge to provide more performant behavior. Preserve the current behavior for the moment by using the aggressive option. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-12-29 09:57:55 -08:00
Junio C Hamano	2ce406ccb8	get_merge_bases(): always clean-up object flags The callers of get_merge_bases() can choose to leave object flags used during the merge-base traversal by passing cleanup=0 as a parameter, but in practice a very few callers can afford to do so (namely, "git merge-base"), as they need to compute merge base in preparation for other processing of their own and they need to see the object without contaminate flags. Change the function signature of get_merge_bases_many() and get_merge_bases() to drop the cleanup parameter, so that the majority of the callers do not have to say ", 1" at the end. Give a new get_merge_bases_many_dirty() API to support only a few callers that know they do not need to spend cycles cleaning up the object flags. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-30 12:51:10 -07:00
Ramsay Jones	d7702be1e1	revision: remove definition of unused 'add_object' function Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:27:24 -07:00
Jeff King	4fe10219bc	rev-list: add --indexed-objects option There is currently no easy way to ask the revision traversal machinery to include objects reachable from the index (e.g., blobs and trees that have not yet been committed). This patch adds an option to do so. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:07:07 -07:00
Jeff King	207394908e	traverse_commit_list: support pending blobs/trees with paths When we call traverse_commit_list, we may have trees and blobs in the pending array. As we process these, we pass the "name" field from the pending entry as the path of the object within the tree (which then becomes the root path if we recurse into subtrees). When we set up the traversal in prepare_revision_walk, though, the "name" field of any pending trees and blobs is likely to be the ref at which we found the object. We would not want to make this part of the path (e.g., doing so would make "git rev-list --objects v2.6.11-tree" in linux.git show paths like "v2.6.11-tree/Makefile", which is nonsensical). Therefore prepare_revision_walk sets the name field of each pending tree and blobs to the empty string. However, this leaves no room for a caller who does know the correct path of a pending object to propagate that information to the revision walker. We can fix this by making two related changes: 1. Use the "path" field as the path instead of the "name" field in traverse_commit_list. If the path is not set, default to "" (which is what we always ended up with in the current code, because of prepare_revision_walk). 2. In prepare_revision_walk, make a complete copy of the entry. This makes the path field available to the walker (if there is one), solving our problem. Leaving the name field intact is now OK, as we do not use it as a path due to point (1) above (and we can use it to make more meaningful error messages if we want). We also make the original "mode" field available to the walker, though it does not actually use it. Note that we still re-add the pending objects and free the old ones (so we may strdup the path and name only to free the old ones). This could be made more efficient by simply copying the object_array entries that we are keeping. However, that would require more restructuring of the code, and is not done here. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:06:31 -07:00
Jeff King	718ccc9731	reachable: reuse revision.c "add all reflogs" code We want to add all reflog entries as tips for finding reachable objects. The revision machinery can already do this (to support "rev-list --reflog"); we can reuse that code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 10:10:38 -07:00
Jeff King	1da1e07c83	clean up name allocation in prepare_revision_walk When we enter prepare_revision_walk, we have zero or more entries in our "pending" array. We disconnect that array from the rev_info, and then process each entry: 1. If the entry is a commit and the --source option is in effect, we keep a pointer to the object name. 2. Otherwise, we re-add the item to the pending list with a blank name. We then throw away the old array by freeing the array itself, but do not touch the "name" field of each entry. For any items of type (2), we leak the memory associated with the name. This commit fixes that by calling object_array_clear, which handles the cleanup for us. That breaks (1), though, because it depends on the memory pointed to by the name to last forever. We can solve that by making a copy of the name. This is slightly less efficient, but it shouldn't matter in practice, as we do it only for the tip commits of the traversal. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 10:10:37 -07:00
René Scharfe	2756ca4347	use REALLOC_ARRAY for changing the allocation size of arrays Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-09-18 09:13:42 -07:00
Junio C Hamano	1ebe6a825a	Merge branch 'jk/name-decoration-alloc' The API to allocate the structure to keep track of commit decoration was cumbersome to use, inviting lazy code to overallocate memory. * jk/name-decoration-alloc: log-tree: use FLEX_ARRAY in name_decoration log-tree: make name_decoration hash static log-tree: make add_name_decoration a public function	2014-09-11 10:33:36 -07:00
Jeff King	2608c24940	log-tree: make name_decoration hash static In the previous commit, we made add_name_decoration global so that adders would not have to access the hash directly. We now make the hash itself static so that callers _have_ to add through our function, making sure that all additions go through a single point. To do this, we have to add one more accessor function: a way to lookup entries in the hash. Since the only caller doesn't actually look at the returned value, but rather only asks whether there is a decoration or not, we could provide only a boolean "has_name_decoration". That would allow us to make "struct name_decoration" local to log-tree, as well. However, it's unlikely to cause any maintainability harm making the actual data public, and this interface is more flexible if we need to look at decorations from other parts of the code in the future. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-26 10:34:26 -07:00
Jeff King	ae18165fbb	revision: drop useless string offset when parsing "--pretty" Once upon a time, we parsed pretty options by looking for "--pretty" at the start of the string, and then feeding the rest (including an "=") to get_commit_format. Later, commit `48ded91` (log --pretty: do not accept bogus "--prettyshort", 2008-05-25) split this into a separate check for "--pretty" versus "--pretty=". However, when parsing "--pretty", we still passed "arg+8" to get_commit_format. This is useless, since it will always point to the NUL terminator at the end of the string. We can simply pass NULL instead; both parameters are treated the same by get_commit_format. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-30 12:30:02 -07:00
Junio C Hamano	5c18fde0d9	Merge branch 'jk/commit-buffer-length' into maint A handful of code paths had to read the commit object more than once when showing header fields that are usually not parsed. The internal data structure to keep track of the contents of the commit object has been updated to reduce the need for this double-reading, and to allow the caller find the length of the object. * jk/commit-buffer-length: reuse cached commit buffer when parsing signatures commit: record buffer length in cache commit: convert commit->buffer to a slab commit-slab: provide a static initializer use get_commit_buffer everywhere convert logmsg_reencode to get_commit_buffer use get_commit_buffer to avoid duplicate code use get_cached_commit_buffer where appropriate provide helpers to access the commit buffer provide a helper to set the commit buffer provide a helper to free commit buffer sequencer: use logmsg_reencode in get_message logmsg_reencode: return const buffer do not create "struct commit" with xcalloc commit: push commit_index update into alloc_commit_node alloc: include any-object allocations in alloc_report replace dangerous uses of strbuf_attach commit_tree: take a pointer/len pair rather than a const strbuf	2014-07-16 11:16:38 -07:00
Junio C Hamano	8061ae8b46	Merge branch 'jk/commit-buffer-length' Move "commit->buffer" out of the in-core commit object and keep track of their lengths. Use this to optimize the code paths to validate GPG signatures in commit objects. * jk/commit-buffer-length: reuse cached commit buffer when parsing signatures commit: record buffer length in cache commit: convert commit->buffer to a slab commit-slab: provide a static initializer use get_commit_buffer everywhere convert logmsg_reencode to get_commit_buffer use get_commit_buffer to avoid duplicate code use get_cached_commit_buffer where appropriate provide helpers to access the commit buffer provide a helper to set the commit buffer provide a helper to free commit buffer sequencer: use logmsg_reencode in get_message logmsg_reencode: return const buffer do not create "struct commit" with xcalloc commit: push commit_index update into alloc_commit_node alloc: include any-object allocations in alloc_report replace dangerous uses of strbuf_attach commit_tree: take a pointer/len pair rather than a const strbuf	2014-07-02 12:53:02 -07:00
Junio C Hamano	8675779454	Merge branch 'jc/shortlog-ref-exclude' into maint "git log --exclude=<glob> --all \| git shortlog" worked as expected, but "git shortlog --exclude=<glob> --all", which is supposed to be identical to the above pipeline, was not accepted at the command line argument parser level. * jc/shortlog-ref-exclude: shortlog: allow --exclude=<glob> to be passed	2014-06-25 11:49:11 -07:00
Junio C Hamano	91043fc95c	Merge branch 'jc/revision-dash-count-parsing' into maint "git log -2master" is a common typo that shows two commits starting from whichever random branch that is not 'master' that happens to be checked out currently. * jc/revision-dash-count-parsing: revision: parse "git log -<count>" more carefully	2014-06-25 11:44:53 -07:00
Junio C Hamano	7a3b4e3bd2	Merge branch 'jc/revision-dash-count-parsing' "git log -2master" is a common typo that shows two commits starting from whichever random branch that is not 'master' that happens to be checked out currently. * jc/revision-dash-count-parsing: revision: parse "git log -<count>" more carefully	2014-06-20 13:10:25 -07:00
Jeff King	b66103c3ba	convert logmsg_reencode to get_commit_buffer Like the callsites in the previous commit, logmsg_reencode already falls back to read_sha1_file when necessary. However, I split its conversion out into its own commit because it's a bit more complex. We return either: 1. The original commit->buffer 2. A newly allocated buffer from read_sha1_file 3. A reencoded buffer (based on either 1 or 2 above). while trying to do as few extra reads/allocations as possible. Callers currently free the result with logmsg_free, but we can simplify this by pointing them straight to unuse_commit_buffer. This is a slight layering violation, in that we may be passing a buffer from (3). However, since the end result is to free() anything except (1), which is unlikely to change, and because this makes the interface much simpler, it's a reasonable bending of the rules. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 12:08:17 -07:00
Jeff King	b000c59b0c	logmsg_reencode: return const buffer The return value from logmsg_reencode may be either a newly allocated buffer or a pointer to the existing commit->buffer. We would not want the caller to accidentally free() or modify the latter, so let's mark it as const. We can cast away the constness in logmsg_free, but only once we have determined that it is a free-able buffer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-12 10:29:43 -07:00
Junio C Hamano	e3fa568cb3	revision: parse "git log -<count>" more carefully This mistyped command line simply ignores "master" and ends up showing two commits from the current HEAD: $ git log -2master because we feed "2master" to atoi() without making sure that the whole string is parsed as an integer. Use the strtol_i() helper function instead. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-09 14:53:49 -07:00
Junio C Hamano	07768e03b5	Merge branch 'jc/shortlog-ref-exclude' "log --exclude=<glob> --all \| shortlog" worked as expected, but "shortlog --exclude=<glob> --all" was not accepted at the command line argument parser level. * jc/shortlog-ref-exclude: shortlog: allow --exclude=<glob> to be passed	2014-06-09 11:30:13 -07:00
Junio C Hamano	eb077745a4	shortlog: allow --exclude=<glob> to be passed These two commands are supposed to be equivalent: $ git log --exclude=refs/notes/\* --all --no-merges --since=2.days \| git shortlog $ git shortlog --exclude=refs/notes/\* --all --no-merges --since=2.days However, the latter does not understand the ref-exclusion command line option, even though other options understood by "log", such as "--all" and "--no-merges", are understood. This was because `e7b432c5` (revision: introduce --exclude=<glob> to tame wildcards, 2013-08-30) did not wire the new option fully to the machinery. A new option understood by handle_revision_pseudo_opt() must be told to handle_revision_opt() as well. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-04 13:41:33 -07:00
Junio C Hamano	967f8c9184	Merge branch 'jk/pack-bitmap' * jk/pack-bitmap: pack-objects: do not reuse packfiles without --delta-base-offset add `ignore_missing_links` mode to revwalk	2014-04-08 12:00:33 -07:00
Vicent Marti	2db1a43f41	add `ignore_missing_links` mode to revwalk When pack-objects is computing the reachability bitmap to serve a fetch request, it can erroneously die() if some of the UNINTERESTING objects are not present. Upload-pack throws away HAVE lines from the client for objects we do not have, but we may have a tip object without all of its ancestors (e.g., if the tip is no longer reachable and was new enough to survive a `git prune`, but some of its reachable objects did get pruned). In the non-bitmap case, we do a revision walk with the HAVE objects marked as UNINTERESTING. The revision walker explicitly ignores errors in accessing UNINTERESTING commits to handle this case (and we do not bother looking at UNINTERESTING trees or blobs at all). When we have bitmaps, however, the process is quite different. The bitmap index for a pack-objects run is calculated in two separate steps: First, we perform an extensive walk from all the HAVEs to find the full set of objects reachable from them. This walk is usually optimized away because we are expected to hit an object with a bitmap during the traversal, which allows us to terminate early. Secondly, we perform an extensive walk from all the WANTs, which usually also terminates early because we hit a commit with an existing bitmap. Once we have the resulting bitmaps from the two walks, we AND-NOT them together to obtain the resulting set of objects we need to pack. When we are walking the HAVE objects, the revision walker does not know that we are walking it only to mark the results as uninteresting. We strip out the UNINTERESTING flag, because those objects _are_ interesting to us during the first walk. We want to keep going to get a complete set of reachable objects if we can. We need some way to tell the revision walker that it's OK to silently truncate the HAVE walk, just like it does for the UNINTERESTING case. This patch introduces a new `ignore_missing_links` flag to the `rev_info` struct, which we set only for the HAVE walk. It also adds tests to cover UNINTERESTING objects missing from several positions: a missing blob, a missing tree, and a missing parent commit. The missing blob already worked (as we do not care about its contents at all), but the other two cases caused us to die(). Note that there are a few cases we do not need to test: 1. We do not need to test a missing tree, with the blob still present. Without the tree that refers to it, we would not know that the blob is relevant to our walk. 2. We do not need to test a tip commit that is missing. Upload-pack omits these for us (and in fact, we complain even in the non-bitmap case if it fails to do so). Reported-by: Siddharth Agarwal <sid0@fb.com> Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-04 13:31:38 -07:00
Junio C Hamano	b407d40933	Merge branch 'nd/log-show-linear-break' Attempts to show where a single-strand-of-pearls break in "git log" output. * nd/log-show-linear-break: log: add --show-linear-break to help see non-linear history object.h: centralize object flag allocation	2014-04-03 12:38:11 -07:00
Nguyễn Thái Ngọc Duy	1b32decefd	log: add --show-linear-break to help see non-linear history Option explanation is in rev-list-options.txt. The interaction with -z is left undecided. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-03-25 15:09:49 -07:00
Junio C Hamano	d4c6e9fb6f	Merge branch 'jk/warn-on-object-refname-ambiguity' * jk/warn-on-object-refname-ambiguity: rev-list: disable object/refname ambiguity check with --stdin cat-file: restore warn_on_object_refname_ambiguity flag cat-file: fix a minor memory leak in batch_objects cat-file: refactor error handling of batch_objects	2014-03-25 11:07:36 -07:00
Junio C Hamano	650c90a185	Merge branch 'nd/no-more-fnmatch' We started using wildmatch() in place of fnmatch(3); complete the process and stop using fnmatch(3). * nd/no-more-fnmatch: actually remove compat fnmatch source code stop using fnmatch (either native or compat) Revert "test-wildmatch: add "perf" command to compare wildmatch and fnmatch" use wildmatch() directly without fnmatch() wrapper	2014-03-14 14:25:31 -07:00
Jeff King	4c30d50402	rev-list: disable object/refname ambiguity check with --stdin This is the "rev-list" analogue to `25fba78` (cat-file: disable object/refname ambiguity check for batch mode, 2013-07-12). Like cat-file, "rev-list --stdin" may read a large number of sha1 object names, and the warning check introduces a significant slow-down. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-03-13 11:56:29 -07:00
Junio C Hamano	0f9e62e084	Merge branch 'jk/pack-bitmap' Borrow the bitmap index into packfiles from JGit to speed up enumeration of objects involved in a commit range without having to fully traverse the history. * jk/pack-bitmap: (26 commits) ewah: unconditionally ntohll ewah data ewah: support platforms that require aligned reads read-cache: use get_be32 instead of hand-rolled ntoh_l block-sha1: factor out get_be and put_be wrappers do not discard revindex when re-preparing packfiles pack-bitmap: implement optional name_hash cache t/perf: add tests for pack bitmaps t: add basic bitmap functionality tests count-objects: recognize .bitmap in garbage-checking repack: consider bitmaps when performing repacks repack: handle optional files created by pack-objects repack: turn exts array into array-of-struct repack: stop using magic number for ARRAY_SIZE(exts) pack-objects: implement bitmap writing rev-list: add bitmap mode to speed up object lists pack-objects: use bitmaps when packing objects pack-objects: split add_object_entry pack-bitmap: add support for bitmap indexes documentation: add documentation for the bitmap format ewah: compressed bitmap implementation ...	2014-02-27 14:01:48 -08:00
Junio C Hamano	795dd116bb	Merge branch 'ks/tree-diff-walk' * ks/tree-diff-walk: tree-walk: finally switch over tree descriptors to contain a pre-parsed entry revision: convert to using diff_tree_sha1() line-log: convert to using diff_tree_sha1() tree-diff: convert diff_root_tree_sha1() to just call diff_tree_sha1 with old=NULL tree-diff: allow diff_tree_sha1 to accept NULL sha1	2014-02-27 14:01:39 -08:00
Nguyễn Thái Ngọc Duy	429bb40abd	pathspec: convert some match_pathspec_depth() to ce_path_match() This helps reduce the number of match_pathspec_depth() call sites and show how match_pathspec_depth() is used. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-24 14:36:52 -08:00
Nguyễn Thái Ngọc Duy	eb07894fe0	use wildmatch() directly without fnmatch() wrapper Make it clear that we don't use fnmatch() anymore. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-20 14:15:46 -08:00
Junio C Hamano	5032098614	Merge branch 'jc/revision-range-unpeel' into maint "git log --left-right A...B" lost the "leftness" of commits reachable from A when A is a tag as a side effect of a recent bugfix. This is a regression in 1.8.4.x series. * jc/revision-range-unpeel: revision: propagate flag bits from tags to pointees revision: mark contents of an uninteresting tree uninteresting	2014-02-13 13:38:47 -08:00
Kirill Smelkov	6275c91c08	revision: convert to using diff_tree_sha1() Since diff_tree_sha1() can now accept empty trees via NULL sha1, we could just call it without manually reading trees into tree_desc and duplicating code. Besides, that if (!tree) return 0; looked suspect - we were saying an invalid tree != empty tree, but maybe it is better to just say the tree is invalid here, which is what diff_tree_sha1() does for such case. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-05 10:51:16 -08:00
Junio C Hamano	63763273de	Merge branch 'jc/revision-range-unpeel' "git log --left-right A...B" lost the "leftness" of commits reachable from A when A is a tag as a side effect of a recent bugfix. This is a regression in 1.8.4.x series. * jc/revision-range-unpeel: revision: propagate flag bits from tags to pointees revision: mark contents of an uninteresting tree uninteresting	2014-01-27 10:44:10 -08:00
Junio C Hamano	a74352867e	revision: propagate flag bits from tags to pointees With the previous fix `895c5ba3` (revision: do not peel tags used in range notation, 2013-09-19), handle_revision_arg() that processes command line arguments for the "git log" family of commands no longer directly places the object pointed by the tag in the pending object array when it sees a tag object. We used to place pointee there after copying the flag bits like UNINTERESTING and SYMMETRIC_LEFT. This change meant that any flag that is relevant to later history traversal must now be propagated to the pointed objects (most often these are commits) while starting the traversal, which is partly done by handle_commit() that is called from prepare_revision_walk(). We did propagate UNINTERESTING, but did not do so for others, most notably SYMMETRIC_LEFT. This caused "git log --left-right v1.0..." (where "v1.0" is a tag) to start losing the "leftness" from the commit the tag points at. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-15 15:53:51 -08:00
Junio C Hamano	2ac5e4470b	revision: mark contents of an uninteresting tree uninteresting "git rev-list --objects ^A^{tree} B^{tree}" ought to mean "I want a list of objects inside B's tree, but please exclude the objects that appear inside A's tree". we see the top-level tree marked as uninteresting (i.e. ^A^{tree} in the above example) and call mark_tree_uninteresting() on it; this unfortunately prevents us from recursing into the tree and marking the objects in the tree as uninteresting. The reason why "git log ^A A" yields an empty set of commits, i.e. we do not have a similar issue for commits, is because we call mark_parents_uninteresting() after seeing an uninteresting commit. The uninteresting-ness of the commit itself does not prevent its parents from being marked as uninteresting. Introduce mark_tree_contents_uninteresting() and structure the code in handle_commit() in such a way that it makes it the responsibility of the callchain leading to this function to mark commits, trees and blobs as uninteresting, and also make it the responsibility of the helpers called from this function to mark objects that are reachable from them. Note that this is a very old bug that probably dates back to the day when "rev-list --objects" was introduced. The line to clear tree->object.parsed at the end of mark_tree_contents_uninteresting() can be removed when this fix is merged to the codebase after `6e454b9a` (clear parsed flag when we free tree buffers, 2013-06-05). Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-15 15:48:58 -08:00
Junio C Hamano	ad70448576	Merge branch 'cc/starts-n-ends-with' Remove a few duplicate implementations of prefix/suffix comparison functions, and rename them to starts_with and ends_with. * cc/starts-n-ends-with: replace {pre,suf}fixcmp() with {starts,ends}_with() strbuf: introduce starts_with() and ends_with() builtin/remote: remove postfixcmp() and use suffixcmp() instead environment: normalize use of prefixcmp() by removing " != 0"	2013-12-17 12:02:44 -08:00
Christian Couder	5955654823	replace {pre,suf}fixcmp() with {starts,ends}_with() Leaving only the function definitions and declarations so that any new topic in flight can still make use of the old functions, replace existing uses of the prefixcmp() and suffixcmp() with new API functions. The change can be recreated by mechanically applying this: $ git grep -l -e prefixcmp -e suffixcmp -- \*.c \| grep -v strbuf\\.c \| xargs perl -pi -e ' s\|!prefixcmp\(\|starts_with\(\|g; s\|prefixcmp\(\|!starts_with\(\|g; s\|!suffixcmp\(\|ends_with\(\|g; s\|suffixcmp\(\|!ends_with\(\|g; ' on the result of preparatory changes in this series. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-05 14:13:21 -08:00
Junio C Hamano	10167eb251	Merge branch 'jc/ref-excludes' People often wished a way to tell "git log --branches" (and "git log --remotes --not --branches") to exclude some local branches from the expansion of "--branches" (similarly for "--tags", "--all" and "--glob=<pattern>"). Now they have one. * jc/ref-excludes: rev-parse: introduce --exclude=<glob> to tame wildcards rev-list --exclude: export add/clear-ref-exclusion and ref-excluded API rev-list --exclude: tests document --exclude option revision: introduce --exclude=<glob> to tame wildcards	2013-12-05 12:59:09 -08:00
Junio C Hamano	c6f1b920ac	Merge branch 'nd/literal-pathspecs' Fixes a regression on 'master' since v1.8.4. * nd/literal-pathspecs: pathspec: stop --*-pathspecs impact on internal parse_pathspec() uses	2013-11-18 14:31:29 -08:00
Junio C Hamano	ff32d3420a	rev-list --exclude: export add/clear-ref-exclusion and ref-excluded API ... while updating their function signature. To be squashed into the initial patch to rev-list. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-11-01 13:09:24 -07:00
Felipe Contreras	9e57ac55ce	revision: trivial style fixes Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-10-31 13:48:05 -07:00
Junio C Hamano	4cebbe6f55	Merge branch 'nd/magic-pathspec' All callers to parse_pathspec() must choose between getting no pathspec or one path that is limited to the current directory when there is no paths given on the command line, but there were two callers that violated this rule, triggering a BUG(). * nd/magic-pathspec: Fix calling parse_pathspec with no paths nor PATHSPEC_PREFER_* flags	2013-10-30 12:10:33 -07:00
Junio C Hamano	2d99baab2f	Merge branch 'jc/revision-range-unpeel' "git rev-list --objects ^v1.0^ v1.0" gave v1.0 tag itself in the output, but "git rev-list --objects v1.0^..v1.0" did not. * jc/revision-range-unpeel: revision: do not peel tags used in range notation	2013-10-28 10:43:16 -07:00
Nguyễn Thái Ngọc Duy	4a2d5ae262	pathspec: stop ---pathspecs impact on internal parse_pathspec() uses Normally parse_pathspec() is used on command line arguments where it can do fancy thing like parsing magic on each argument or adding magic for all pathspecs based on ---pathspecs options. There's another use of parse_pathspec(), where pathspec is needed, but the input is known to be pure paths. In this case we usually don't want ---pathspecs to interfere. And we definitely do not want to parse magic in these paths, regardless of --literal-pathspecs. Add new flag PATHSPEC_LITERAL_PATH for this purpose. When it's set, ---pathspecs are ignored, no magic is parsed. And if the caller allows PATHSPEC_LITERAL (i.e. the next calls can take literal magic), then PATHSPEC_LITERAL will be set. This fixes cases where git chokes when GIT__PATHSPECS are set because parse_pathspec() indicates it won't take any magic. But GIT__PATHSPECS add them anyway. These are export GIT_LITERAL_PATHSPECS=1 git blame -- something git log --follow something git log --merge "git ls-files --with-tree=path" (aka parse_pathspec() in overlay_tree_on_cache()) is safe because the input is empty, and producing one pathspec due to PATHSPEC_PREFER_CWD does not take any magic into account. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-10-28 09:57:36 -07:00
Vicent Marti	a330de31d1	revision: allow setting custom limiter function This commit enables users of `struct rev_info` to peform custom limiting during a revision walk (i.e. `get_revision`). If the field `include_check` has been set to a callback, this callback will be issued once for each commit before it is added to the "pending" list of the revwalk. If the include check returns 0, the commit will be marked as added but won't be pushed to the pending list, effectively limiting the walk. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-10-24 15:44:52 -07:00
Nguyễn Thái Ngọc Duy	c8556c6213	Fix calling parse_pathspec with no paths nor PATHSPEC_PREFER_* flags When parse_pathspec() is called with no paths, the behavior could be either return no paths, or return one path that is cwd. Some commands do the former, some the latter. parse_pathspec() itself does not make either the default and requires the caller to specify either flag if it may run into this situation. I've grep'd through all parse_pathspec() call sites. Some pass neither, but those are guaranteed never pass empty path to parse_pathspec(). There are two call sites that may pass empty path and are fixed with this patch. [jc: added a test from Antoine's bug report] Reported-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-10-22 10:49:43 -07:00
Junio C Hamano	895c5ba3c1	revision: do not peel tags used in range notation A range notation "A..B" means exactly the same thing as what "^A B" means, i.e. the set of commits that are reachable from B but not from A. But the internal representation after the revision parser parsed these two notations are subtly different. - "rev-list ^A B" leaves A and B in the revs->pending.objects[] array, with the former marked as UNINTERESTING and the revision traversal machinery propagates the mark to underlying commit objects A^0 and B^0. - "rev-list A..B" peels tags and leaves A^0 (marked as UNINTERESTING) and B^0 in revs->pending.objects[] array before the traversal machinery kicks in. This difference usually does not matter, but starts to matter when the --objects option is used. For example, we see this: $ git rev-list --objects v1.8.4^1..v1.8.4 \| grep $(git rev-parse v1.8.4) $ git rev-list --objects v1.8.4 ^v1.8.4^1 \| grep $(git rev-parse v1.8.4) 04f013dc38d7512eadb915eba22efc414f18b869 v1.8.4 With the former invocation, the revision traversal machinery never hears about the tag v1.8.4 (it only sees the result of peeling it, i.e. the commit v1.8.4^0), and the tag itself does not appear in the output. The latter does send the tag object itself to the output. Make the range notation keep the unpeeled objects and feed them to the traversal machinery to fix this inconsistency. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-10-15 16:17:09 -07:00
Junio C Hamano	f406140baa	Merge branch 'fc/at-head' Instead of typing four capital letters "HEAD", you can say "@" now, e.g. "git log @". * fc/at-head: Add new @ shortcut for HEAD sha1-name: pass len argument to interpret_branch_name()	2013-09-20 12:38:10 -07:00
Junio C Hamano	b8f23112f0	Merge branch 'jk/free-tree-buffer' * jk/free-tree-buffer: clear parsed flag when we free tree buffers	2013-09-17 11:37:33 -07:00
Junio C Hamano	b02f5aeda6	Merge branch 'jl/submodule-mv' "git mv A B" when moving a submodule A does "the right thing", inclusing relocating its working tree and adjusting the paths in the .gitmodules file. * jl/submodule-mv: (53 commits) rm: delete .gitmodules entry of submodules removed from the work tree mv: update the path entry in .gitmodules for moved submodules submodule.c: add .gitmodules staging helper functions mv: move submodules using a gitfile mv: move submodules together with their work trees rm: do not set a variable twice without intermediate reading. t6131 - skip tests if on case-insensitive file system parse_pathspec: accept :(icase)path syntax pathspec: support :(glob) syntax pathspec: make --literal-pathspecs disable pathspec magic pathspec: support :(literal) syntax for noglob pathspec kill limit_pathspec_to_literal() as it's only used by parse_pathspec() parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN parse_pathspec: make sure the prefix part is wildcard-free rename field "raw" to "_raw" in struct pathspec tree-diff: remove the use of pathspec's raw[] in follow-rename codepath remove match_pathspec() in favor of match_pathspec_depth() remove init_pathspec() in favor of parse_pathspec() remove diff_tree_{setup,release}_paths convert common_prefix() to use struct pathspec ...	2013-09-09 14:36:15 -07:00
Felipe Contreras	cf99a761d3	sha1-name: pass len argument to interpret_branch_name() This is useful to make sure we don't step outside the boundaries of what we are interpreting at the moment. For example while interpreting foobar@{u}~1, the job of interpret_branch_name() ends right before ~1, but there's no way to figure that out inside the function, unless the len argument is passed. So let's do that. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-09-03 11:33:00 -07:00
Junio C Hamano	e7b432c521	revision: introduce --exclude=<glob> to tame wildcards People often find "git log --branches" etc. that includes _all_ branches is cumbersome to use when they want to grab most but except some. The same applies to --tags, --all and --glob. Teach the revision machinery to remember patterns, and then upon the next such a globbing option, exclude those that match the pattern. With this, I can view only my integration branches (e.g. maint, master, etc.) without topic branches, which are named after two letters from primary authors' names, slash and topic name. git rev-list --no-walk --exclude=??/* --branches \| git name-rev --refs refs/heads/* --stdin This one shows things reachable from local and remote branches that have not been merged to the integration branches. git log --remotes --branches --not --exclude=??/* --branches It may be a bit rough around the edges, in that the pattern to give the exclude option depends on what globbing option follows. In these examples, the pattern "??/" is used, not "refs/heads/??/", because the globbing option that follows the -"-exclude=<pattern>" is "--branches". As each use of globbing option resets previously set "--exclude", this may not be such a bad thing, though. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-08-30 16:37:55 -07:00
Thomas Rast	838f9a1566	log: use true parents for diff when walking reflogs The reflog walking logic (git log -g) replaces the true parent list with the preceding commit in the reflog. This results in bogus commit diffs when combined with options such as -p; the diff is against the reflog predecessor, not the parent of the commit. Save the true parents on the side, extending the functions from the previous commit. The diff logic picks them up and uses them to show the correct diffs. We do have to be somewhat careful about repeated calling of save_parents(), since the reflog may list a commit more than once. We now store (commit_list*)-1 to distinguish the "not saved yet" and "root commit" cases. This lets us preserve an empty parent list even if save_parents() is repeatedly called. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-08-05 08:27:00 -07:00
Thomas Rast	53d00b39ce	log: use true parents for diff even when rewriting When using pathspec filtering in combination with diff-based log output, parent simplification happens before the diff is computed. The diff is therefore against the simplified parents. This works okay, arguably by accident, in the normal case: simplification reduces to one parent as long as the commit is TREESAME to it. So the simplified parent of any given commit must have the same tree contents on the filtered paths as its true (unfiltered) parent. However, --full-diff breaks this guarantee, and indeed gives pretty spectacular results when comparing the output of git log --graph --stat ... git log --graph --full-diff --stat ... (--graph internally kicks in parent simplification, much like --parents). To fix it, store a copy of the parent list before simplification (in a slab) whenever --full-diff is in effect. Then use the stored parents instead of the simplified ones in the commit display code paths. The latter do not actually check for --full-diff to avoid duplicated code; they just grab the original parents if save_parents() has not been called for this revision walk. For ordinary commits it should be obvious that this is the right thing to do. Merge commits are a bit subtle. Observe that with default simplification, merge simplification is an all-or-nothing decision: either the merge is TREESAME to one parent and disappears, or it is different from all parents and the parent list remains intact. Redundant parents are not pruned, so the existing code also shows them as a merge. So if we do show a merge commit, the parent list just consists of the rewrite result on each parent. Running, e.g., --cc on this in --full-diff mode is not very useful: if any commits were skipped, some hunks will disagree with all sides of the merge (with one side, because commits were skipped; with the others, because they didn't have those changes in the first place). This triggers --cc showing these hunks spuriously. Therefore I believe that even for merge commits it is better to show the diffs wrt. the original parents. Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-08-01 10:25:48 -07:00
Nguyễn Thái Ngọc Duy	9a08727443	remove init_pathspec() in favor of parse_pathspec() While at there, move free_pathspec() to pathspec.c Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy	bd1928df1d	remove diff_tree_{setup,release}_paths Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy	0fdc2ae512	convert some get_pathspec() calls to parse_pathspec() These call sites follow the pattern: paths = get_pathspec(prefix, argv); init_pathspec(&pathspec, paths); which can be converted into a single parse_pathspec() call. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:06 -07:00
Nguyễn Thái Ngọc Duy	9c5e6c802c	Convert "struct cache_entry " to "const ..." wherever possible I attempted to make index_state->cache[] a "const struct cache_entry " to find out how existing entries in index are modified and where. The question I have is what do we do if we really need to keep track of on-disk changes in the index. The result is - diff-lib.c: setting CE_UPTODATE - name-hash.c: setting CE_HASHED - preload-index.c, read-cache.c, unpack-trees.c and builtin/update-index: obvious - entry.c: write_entry() may refresh the checked out entry via fill_stat_cache_info(). This causes "non-const struct cache_entry " in builtin/apply.c, builtin/checkout-index.c and builtin/checkout.c - builtin/ls-files.c: --with-tree changes stagemask and may set CE_UPDATE Of these, write_entry() and its call sites are probably most interesting because it modifies on-disk info. But this is stat info and can be retrieved via refresh, at least for porcelain commands. Other just uses ce_flags for local purposes. So, keeping track of "dirty" entries is just a matter of setting a flag in index modification functions exposed by read-cache.c. Except unpack-trees, the rest of the code base does not do anything funny behind read-cache's back. The actual patch is less valueable than the summary above. But if anyone wants to re-identify the above sites. Applying this patch, then this: diff --git a/cache.h b/cache.h index 430d021..1692891 100644 --- a/cache.h +++ b/cache.h @@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode) #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1) struct index_state { - struct cache_entry cache; + const struct cache_entry cache; unsigned int version; unsigned int cache_nr, cache_alloc, cache_changed; struct string_list *resolve_undo; will help quickly identify them without bogus warnings. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-09 09:12:48 -07:00
Junio C Hamano	534f0e0996	Merge branch 'jc/topo-author-date-sort' "git log" learned the "--author-date-order" option, with which the output is topologically sorted and commits in parallel histories are shown intermixed together based on the author timestamp. * jc/topo-author-date-sort: t6003: add --author-date-order test topology tests: teach a helper to set author dates as well t6003: add --date-order test topology tests: teach a helper to take abbreviated timestamps t/lib-t6000: style fixes log: --author-date-order sort-in-topological-order: use prio-queue prio-queue: priority queue of pointers to structs toposort: rename "lifo" field	2013-07-01 12:41:23 -07:00
Junio C Hamano	ede63a195c	Merge branch 'mh/reflife' Define memory ownership and lifetime rules for what for-each-ref feeds to its callbacks (in short, "you do not own it, so make a copy if you want to keep it"). * mh/reflife: (25 commits) refs: document the lifetime of the args passed to each_ref_fn register_ref(): make a copy of the bad reference SHA-1 exclude_existing(): set existing_refs.strdup_strings string_list_add_refs_by_glob(): add a comment about memory management string_list_add_one_ref(): rename first parameter to "refname" show_head_ref(): rename first parameter to "refname" show_head_ref(): do not shadow name of argument add_existing(): do not retain a reference to sha1 do_fetch(): clean up existing_refs before exiting do_fetch(): reduce scope of peer_item object_array_entry: fix memory handling of the name field find_first_merges(): remove unnecessary code find_first_merges(): initialize merges variable using initializer fsck: don't put a void-shaped peg in a char-shaped hole object_array_remove_duplicates(): rewrite to reduce copying revision: use object_array_filter() in implementation of gc_boundary() object_array: add function object_array_filter() revision: split some overly-long lines cmd_diff(): make it obvious which cases are exclusive of each other cmd_diff(): rename local variable "list" -> "entry" ...	2013-06-14 08:46:14 -07:00
Junio C Hamano	b27a79d16b	Merge branch 'kb/full-history-compute-treesame-carefully-2' Major update to the revision traversal logic to improve culling of irrelevant parents while traversing a mergy history. * kb/full-history-compute-treesame-carefully-2: revision.c: make default history consider bottom commits revision.c: don't show all merges for --parents revision.c: discount side branches when computing TREESAME revision.c: add BOTTOM flag for commits simplify-merges: drop merge from irrelevant side branch simplify-merges: never remove all TREESAME parents t6012: update test for tweaked full-history traversal revision.c: Make --full-history consider more merges Documentation: avoid "uninteresting" rev-list-options.txt: correct TREESAME for P t6111: add parents to tests t6111: allow checking the parents as well t6111: new TREESAME test set t6019: test file dropped in -s ours merge decorate.c: compact table when growing	2013-06-14 08:45:59 -07:00
Junio C Hamano	81c6b38b67	log: --author-date-order Sometimes people would want to view the commits in parallel histories in the order of author dates, not committer dates. Teach "topo-order" sort machinery to do so, using a commit-info slab to record the author dates of each commit, and prio-queue to sort them. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-06-11 15:15:21 -07:00
Junio C Hamano	08f704f294	toposort: rename "lifo" field The primary invariant of sort_in_topological_order() is that a parent commit is not emitted until all children of it are. When traversing a forked history like this with "git log C E": A----B----C \ D----E we ensure that A is emitted after all of B, C, D, and E are done, B has to wait until C is done, and D has to wait until E is done. In some applications, however, we would further want to control how these child commits B, C, D and E on two parallel ancestry chains are shown. Most of the time, we would want to see C and B emitted together, and then E and D, and finally A (i.e. the --topo-order output). The "lifo" parameter of the sort_in_topological_order() function is used to control this behaviour. We start the traversal by knowing two commits, C and E. While keeping in mind that we also need to inspect E later, we pick C first to inspect, and we notice and record that B needs to be inspected. By structuring the "work to be done" set as a LIFO stack, we ensure that B is inspected next, before other in-flight commits we had known that we will need to inspect, e.g. E. When showing in --date-order, we would want to see commits ordered by timestamps, i.e. show C, E, B and D in this order before showing A, possibly mixing commits from two parallel histories together. When "lifo" parameter is set to false, the function keeps the "work to be done" set sorted in the date order to realize this semantics. After inspecting C, we add B to the "work to be done" set, but the next commit we inspect from the set is E which is newer than B. The name "lifo", however, is too strongly tied to the way how the function implements its behaviour, and does not describe what the behaviour _means_. Replace this field with an enum rev_sort_order, with two possible values: REV_SORT_IN_GRAPH_ORDER and REV_SORT_BY_COMMIT_DATE, and update the existing code. The mechanical replacement rule is: "lifo == 0" is equivalent to "sort_order == REV_SORT_BY_COMMIT_DATE" "lifo == 1" is equivalent to "sort_order == REV_SORT_IN_GRAPH_ORDER" Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-06-11 15:15:21 -07:00
Jeff King	6e454b9a31	clear parsed flag when we free tree buffers Many code paths will free a tree object's buffer and set it to NULL after finishing with it in order to keep memory usage down during a traversal. However, out of 8 sites that do this, only one actually unsets the "parsed" flag back. Those sites that don't are setting a trap for later users of the tree object; even after calling parse_tree, the buffer will remain NULL, causing potential segfaults. It is not known whether this is triggerable in the current code. Most commands do not do an in-memory traversal followed by actually using the objects again. However, it does not hurt to be safe for future callers. In most cases, we can abstract this out to a "free_tree_buffer" helper. However, there are two exceptions: 1. The fsck code relies on the parsed flag to know that we were able to parse the object at one point. We can switch this to using a flag in the "flags" field. 2. The index-pack code sets the buffer to NULL but does not free it (it is freed by a caller). We should still unset the parsed flag here, but we cannot use our helper, as we do not want to free the buffer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-06-06 10:29:12 -07:00
Junio C Hamano	ed73fe5642	Merge branch 'tr/line-log' * tr/line-log: git-log(1): remove --full-line-diff description line-log: fix documentation formatting log -L: improve comments in process_all_files() log -L: store the path instead of a diff_filespec log -L: test merge of parallel modify/rename t4211: pass -M to 'git log -M -L...' test log -L: fix overlapping input ranges log -L: check range set invariants when we look it up Speed up log -L... -M log -L: :pattern:file syntax to find by funcname Implement line-history search (git log -L) Export rewrite_parents() for 'log -L' Refactor parse_loc	2013-06-02 16:00:44 -07:00
Michael Haggerty	31faeb2088	object_array_entry: fix memory handling of the name field Previously, the memory management of the object_array_entry::name field was inconsistent and undocumented. object_array_entries are ultimately created by a single function, add_object_array_with_mode(), which has an argument "const char name". This function used to simply set the name field to reference the string pointed to by the name parameter, and nobody on the object_array side ever freed the memory. Thus, it assumed that the memory for the name field would be managed by the caller, and that the lifetime of that string would be at least as long as the lifetime of the object_array_entry. But callers were inconsistent: Some passed pointers to constant strings or argv entries, which was OK. * Some passed pointers to newly-allocated memory, but didn't arrange for the memory ever to be freed. * Some passed the return value of sha1_to_hex(), which is a pointer to a statically-allocated buffer that can be overwritten at any time. * Some passed pointers to refnames that they received from a for_each_ref()-type iteration, but the lifetimes of such refnames is not guaranteed by the refs API. Bring consistency to this mess by changing object_array to make its own copy for the object_array_entry::name field and free this memory when an object_array_entry is deleted from the array. Many callers were passing the empty string as the name parameter, so as a performance optimization, treat the empty string specially. Instead of making a copy, store a pointer to a statically-allocated empty string to object_array_entry::name. When deleting such an entry, skip the free(). Change the callers that were already passing copies to add_object_array_with_mode() to either skip the copy, or (if the memory needed to be allocated anyway) freeing the memory itself. A part of this commit effectively reverts `70d26c6e76` read_revisions_from_stdin: make copies for handle_revision_arg because the copying introduced by that commit (which is still necessary) is now done at a deeper level. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-06-02 15:28:46 -07:00
Michael Haggerty	be6754c67f	revision: use object_array_filter() in implementation of gc_boundary() Use object_array_filter(), which will soon be made smarter about cleaning up discarded entries properly. Also add a function comment. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-05-28 09:25:01 -07:00
Michael Haggerty	ff5f5f268f	revision: split some overly-long lines Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-05-28 09:25:01 -07:00
Michael Haggerty	df835d3a0c	add_rev_cmdline(): make a copy of the name argument Instead of assuming that the memory pointed to by the name argument will live forever, make a local copy of it before storing it in the ref_cmdline_info. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-05-28 09:25:00 -07:00
Kevin Bracey	141efdba57	revision.c: make default history consider bottom commits Previously, the default history treated bottom commits the same as any other UNINTERESTING commit, which could force it down side branches. Consider the following history: A--B---D--F marks !TREESAME parent paths \ /* `-C-' When requesting "B..F", B is UNINTERESTING but TREESAME to D. C is !UNINTERESTING. So default following would go from D into the irrelevant side branch C to A, rather than to B. Note also that if there had been an extra !UNINTERESTING commit B1 between B and D, it wouldn't have gone down C. Change the default following to test relevant_commit() instead of !UNINTERESTING, so it can proceed straight from D to B, thus finishing the traversal of that path. Signed-off-by: Kevin Bracey <kevin@bracey.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-05-16 11:51:10 -07:00
Kevin Bracey	bf3418b08b	revision.c: don't show all merges for --parents When using --parents or --children, get_commit_action() previously showed all merges, even if TREESAME to both parents. This was intended to tie together the topology of the rewritten parents, but it was excessive - in fact we only need to show merges that have two or more relevant parents. Merges at the boundary do not necessarily need to be shown. Signed-off-by: Kevin Bracey <kevin@bracey.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-05-16 11:51:10 -07:00
Kevin Bracey	4d826608e9	revision.c: discount side branches when computing TREESAME Use the BOTTOM flag to define relevance for pruning. Relevant commits are those that are !UNINTERESTING or BOTTOM, and this allows us to identify irrelevant side branches (UNINTERESTING && !BOTTOM). If a merge has relevant parents, and it is TREESAME to them, then do not let irrelevant parents cause the merge to be treated as !TREESAME. When considering simplification, don't always include all merges - merges with exactly one relevant parent can be simplified, if TREESAME according to the above rule. These two changes greatly increase simplification in limited, pruned revision lists. Signed-off-by: Kevin Bracey <kevin@bracey.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-05-16 11:51:10 -07:00
Kevin Bracey	7f34a46ff5	revision.c: add BOTTOM flag for commits When performing edge-based operations on the revision graph, it can be useful to be able to identify the INTERESTING graph's connection(s) to the bottom commit(s) specified by the user. Conceptually when the user specifies "A..B" (== B ^A), they are asking for the history from A to B. The first connection from A onto the INTERESTING graph is part of that history, and should be considered. If we consider only INTERESTING nodes and their connections, then we're really only considering the history from A's immediate descendants to B. This patch does not change behaviour, but adds a new BOTTOM flag to indicate the bottom commits specified by the user, ready to be used by following patches. We immediately use the BOTTOM flag to return collect_bottom_commits() to its original approach of examining the pending commit list rather than the command line. This will ensure alignment of the definition of "bottom" with future patches. Signed-off-by: Kevin Bracey <kevin@bracey.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-05-16 11:51:10 -07:00
Kevin Bracey	143f1eafdb	simplify-merges: drop merge from irrelevant side branch Reimplement commit `4b7f53da` on top of the new simplify-merges infrastructure, tightening the condition to only consider root parents; the original version incorrectly dropped parents that were TREESAME to anything. Original log message follows. The merge simplification rule stated in `6546b59` (revision traversal: show full history with merge simplification, 2008-07-31) still treated merge commits too specially. Namely, in a history with this shape: ---o---o---M / x---x---x where three 'x' were on a history completely unrelated to the main history 'o' and do not touch any of the paths we are following, we still said that after simplifying all of the parents of M, 'x' (which is the leftmost 'x' that rightmost 'x simplifies down to) and 'o' (which would be the last commit on the main history that touches the paths we are following) are independent from each other, and both need to be kept. That is incorrect; when the side branch 'x' never touches the paths, it should be removed to allow M to simplify down to the last commit on the main history that touches the paths. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Kevin Bracey <kevin@bracey.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-05-16 11:51:09 -07:00
Kevin Bracey	9c129eab99	simplify-merges: never remove all TREESAME parents When simplifying an odd merge, such as one that used "-s ours", we may find ourselves TREESAME to apparently redundant parents. Prevent simplify_merges() from removing every TREESAME parent; if this would happen reinstate the first TREESAME parent - the one that the default log would have followed. This avoids producing a totally disjoint history from the default log when the default log is a better explanation of the end result, and aids visualisation of odd merges. Signed-off-by: Kevin Bracey <kevin@bracey.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-05-16 11:51:09 -07:00
Kevin Bracey	d0af663e42	revision.c: Make --full-history consider more merges History simplification previously always treated merges as TREESAME if they were TREESAME to any parent. While this was consistent with the default behaviour, this could be extremely unhelpful when searching detailed history, and could not be overridden. For example, if a merge had ignored a change, as if by "-s ours", then: git log -m -p --full-history -Schange file would successfully locate "change"'s addition but would not locate the merge that resolved against it. Futher, simplify_merges could drop the actual parent that a commit was TREESAME to, leaving it as a normal commit marked TREESAME that isn't actually TREESAME to its remaining parent. Now redefine a commit's TREESAME flag to be true only if a commit is TREESAME to _all_ of its parents. This doesn't affect either the default simplify_history behaviour (because partially TREESAME merges are turned into normal commits), or full-history with parent rewriting (because all merges are output). But it does affect other modes. The clearest difference is that --full-history will show more merges - sufficient to ensure that -m -p --full-history log searches can really explain every change to the file, including those changes' ultimate fate in merges. Also modify simplify_merges to recalculate TREESAME after removing a parent. This is achieved by storing per-parent TREESAME flags on the initial scan, so the combined flag can be easily recomputed. This fixes some t6111 failures, but creates a couple of new ones - we are now showing some merges that don't need to be shown. Signed-off-by: Kevin Bracey <kevin@bracey.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-05-16 11:51:09 -07:00
Kevin Bracey	a765499a08	revision.c: treat A...B merge bases as if manually specified The documentation assures users that "A...B" is defined as "A B --not $(git merge-base --all A B)". This wasn't in fact quite true, because the calculated merge bases were not sent to add_rev_cmdline(). The main effect of this was that although git rev-list --ancestry-path A B --not $(git merge-base --all A B) worked, the simpler form git rev-list --ancestry-path A...B failed with a "no bottom commits" error. Other potential users of bottom commits could also be affected by this problem, if they examine revs->cmdline_info; I came across the issue in my proposed history traversal refinements series. So ensure that the calculated merge bases are sent to add_rev_cmdline(), flagged with new 'whence' enum value REV_CMD_MERGE_BASE. Signed-off-by: Kevin Bracey <kevin@bracey.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-05-16 11:45:34 -07:00
Junio C Hamano	e52e6f79cc	Merge branch 'nd/pretty-formats' pretty-printing body of the commit that is stored in non UTF-8 encoding did not work well. The early part of this series fixes it. And then it adds %C(auto) specifier that turns the coloring on when we are emitting to the terminal, and adds column-aligning format directives. * nd/pretty-formats: pretty: support %>> that steal trailing spaces pretty: support truncating in %>, %< and %>< pretty: support padding placeholders, %< %> and %>< pretty: add %C(auto) for auto-coloring pretty: split color parsing into a separate function pretty: two phase conversion for non utf-8 commits utf8.c: add reencode_string_len() that can handle NULs in string utf8.c: add utf8_strnwidth() with the ability to skip ansi sequences utf8.c: move display_mode_esc_sequence_len() for use by other functions pretty: share code between format_decoration and show_decorations pretty-formats.txt: wrap long lines pretty: get the correct encoding for --pretty:format=%e pretty: save commit encoding from logmsg_reencode if the caller needs it	2013-04-23 11:22:48 -07:00
Junio C Hamano	8d41addacb	Merge branch 'tr/copy-revisions-from-stdin' A fix to a long-standing issue in the command line parser for revisions, which was triggered by mv/sequence-pick-error-diag topic. * tr/copy-revisions-from-stdin: read_revisions_from_stdin: make copies for handle_revision_arg	2013-04-19 13:40:13 -07:00
Nguyễn Thái Ngọc Duy	5a10d23658	pretty: save commit encoding from logmsg_reencode if the caller needs it The commit encoding is parsed by logmsg_reencode, there's no need for the caller to re-parse it again. The reencoded message now has the new encoding, not the original one. The caller would need to read commit object again before parsing. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-18 16:28:27 -07:00
Thomas Rast	70d26c6e76	read_revisions_from_stdin: make copies for handle_revision_arg read_revisions_from_stdin() has passed pointers to its read buffer down to handle_revision_arg() since its inception way back in `42cabc3` (Teach rev-list an option to read revs from the standard input., 2006-09-05). Even back then, this was a bug: through add_pending_object, the argument was recorded in the object_array's 'name' field. Fix it by making a copy whenever read_revisions_from_stdin() passes an argument down the callchain. The other caller runs handle_revision_arg() on argv[], where it would be redundant to make a copy. Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-16 11:17:48 -07:00
Junio C Hamano	0290bf1250	Revert `4b7f53da76` (simplify-merges: drop merge from irrelevant side branch, 2013-01-17) Kevin Bracey reports that the change regresses a case shown in the user manual. Trading one fix with another breakage is not worth it. Just keep the test to document the existing breakage, and revert the change for now. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-08 13:10:27 -07:00
Junio C Hamano	92e0d91632	Sync with 1.8.1 maintenance track * maint-1.8.1: Start preparing for 1.8.1.6 git-tag(1): we tag HEAD by default Fix revision walk for commits with the same dates t2003: work around path mangling issue on Windows pack-refs: add fully-peeled trait pack-refs: write peeled entry for non-tags use parse_object_or_die instead of die("bad object") avoid segfaults on parse_object failure entry: fix filter lookup t2003: modernize style name-hash.c: fix endless loop with core.ignorecase=true	2013-04-03 09:18:01 -07:00
Junio C Hamano	64379806a9	Merge branch 'kk/revwalk-slop-too-many-commit-within-a-second' into maint-1.8.1 * kk/revwalk-slop-too-many-commit-within-a-second: Fix revision walk for commits with the same dates	2013-04-03 08:44:02 -07:00
Junio C Hamano	74bd52681d	Merge branch 'kk/revwalk-slop-too-many-commit-within-a-second' Allow the revision "slop" code to look deeper while commits with exactly the same timestamps come next to each other (which can often happen after a large "am" and "rebase" session). * kk/revwalk-slop-too-many-commit-within-a-second: Fix revision walk for commits with the same dates	2013-03-28 14:38:25 -07:00
Junio C Hamano	436b60ce7a	Merge branch 'jc/remove-treesame-parent-in-simplify-merges' The --simplify-merges logic did not cull irrelevant parents from a merge that is otherwise not interesting with respect to the paths we are following. This touches a fairly core part of the revision traversal infrastructure; even though I think this change is correct, please report immediately if you find any unintended side effect. * jc/remove-treesame-parent-in-simplify-merges: simplify-merges: drop merge from irrelevant side branch	2013-03-28 14:37:53 -07:00
Thomas Rast	12da1d1f6f	Implement line-history search (git log -L) This is a rewrite of much of Bo's work, mainly in an effort to split it into smaller, easier to understand routines. The algorithm is built around the struct range_set, which encodes a series of line ranges as intervals [a,b). This is used in two contexts: * A set of lines we are tracking (which will change as we dig through history). * To encode diffs, as pairs of ranges. The main routine is range_set_map_across_diff(). It processes the diff between a commit C and some parent P. It determines which diff hunks are relevant to the ranges tracked in C, and computes the new ranges for P. The algorithm is then simply to process history in topological order from newest to oldest, computing ranges and (partial) diffs. At branch points, we need to merge the ranges we are watching. We will find that many commits do not affect the chosen ranges, and mark them TREESAME (in addition to those already filtered by pathspec limiting). Another pass of history simplification then gets rid of such commits. This is wired as an extra filtering pass in the log machinery. This currently only reduces code duplication, but should allow for other simplifications and options to be used. Finally, we hook a diff printer into the output chain. Ideally we would wire directly into the diff logic, to optionally use features like word diff. However, that will require some major reworking of the diff chain, so we completely replace the output with our own diff for now. As this was a GSoC project, and has quite some history by now, many people have helped. In no particular order, thanks go to Jakub Narebski <jnareb@gmail.com> Jens Lehmann <Jens.Lehmann@web.de> Jonathan Nieder <jrnieder@gmail.com> Junio C Hamano <gitster@pobox.com> Ramsay Jones <ramsay@ramsay1.demon.co.uk> Will Palmer <wmpalmer@gmail.com> Apologies to everyone I forgot. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-03-28 10:29:22 -07:00
Bo Yang	c7edcae06e	Export rewrite_parents() for 'log -L' The function rewrite_one is used to rewrite a single parent of the current commit, and is used by rewrite_parents to rewrite all the parents. Decouple the dependence between them by making rewrite_one a callback function that is passed to rewrite_parents. Then export rewrite_parents for reuse by the line history browser. We will use this function in line-log.c. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-03-28 10:29:10 -07:00
Kacper Kornet	c19d1b4e84	Fix revision walk for commits with the same dates Logic in still_interesting function allows to stop the commits traversing if the oldest processed commit is not older then the youngest commit on the list to process and the list contains only commits marked as not interesting ones. It can be premature when dealing with a set of coequal commits. For example git rev-list A^! --not B provides wrong answer if all commits in the range A..B had the same commit time and there are more then 7 of them. To fix this problem the relevant part of the logic in still_interesting is changed to: the walk can be stopped if the oldest processed commit is younger then the youngest commit on the list to processed. Signed-off-by: Kacper Kornet <draenog@pld-linux.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-03-22 16:15:48 -07:00
Jeff King	04deccda11	log: re-encode commit messages before grepping If you run "git log --grep=foo", we will run your regex on the literal bytes of the commit message. This can provide confusing results if the commit message is not in the same encoding as your grep expression (or worse, you have commits in multiple encodings, in which case your regex would need to be written to match either encoding). On top of this, we might also be grepping in the commit's notes, which are already re-encoded, potentially leading to grepping in a buffer with mixed encodings concatenated. This is insanity, but most people never noticed, because their terminal and their commit encodings all match. Instead, let's massage the to-be-grepped commit into a standardized encoding. There is not much point in adding a flag for "this is the encoding I expect my grep pattern to match"; the only sane choice is for it to use the log output encoding. That is presumably what the user's terminal is using, and it means that the patterns found by the grep will match the output produced by git. As a bonus, this fixes a potential segfault in commit_match when commit->buffer is NULL, as we now build on logmsg_reencode, which handles reading the commit buffer from disk if necessary. The segfault can be triggered with: git commit -m 'text1' --allow-empty git commit -m 'text2' --allow-empty git log --graph --no-walk --grep 'text2' which arguably does not make any sense (--graph inherently wants a connected history, and by --no-walk the command line is telling us to show discrete points in history without connectivity), and we probably should forbid the combination, but that is a separate issue. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-02-11 13:11:45 -08:00
Junio C Hamano	4b7f53da76	simplify-merges: drop merge from irrelevant side branch The merge simplification rule stated in `6546b59` (revision traversal: show full history with merge simplification, 2008-07-31) still treated merge commits too specially. Namely, in a history with this shape: ---o---o---M / x---x---x where three 'x' were on a history completely unrelated to the main history 'o' and do not touch any of the paths we are following, we still said that after simplifying all of the parents of M, 'x' (which is the leftmost 'x' that rightmost 'x simplifies down to) and 'o' (which would be the last commit on the main history that touches the paths we are following) are independent from each other, and both need to be kept. That is incorrect; when the side branch 'x' never touches the paths, it should be removed to allow M to simplify down to the last commit on the main history that touches the paths. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-01-17 15:22:48 -08:00
Junio C Hamano	df874fa82e	log --use-mailmap: optimize for cases without --author/--committer search When we taught the commit_match() mechanism to pay attention to the new --use-mailmap option, we started to unconditionally copy the commit object to a temporary buffer, just in case we need the author and committer lines updated via the mailmap mechanism, and rewrite author and committer using the mailmap. It turns out that this has a rather unpleasant performance implications. In the linux kernel repository, running $ git log --author='Junio C Hamano' --pretty=short >/dev/null under /usr/bin/time, with and without --use-mailmap (the .mailmap file is 118 entries long, the particular author does not appear in it), cost (with warm cache): [without --use-mailmap] 5.42user 0.26system 0:05.70elapsed 99%CPU (0avgtext+0avgdata 2005936maxresident)k 0inputs+0outputs (0major+137669minor)pagefaults 0swaps [with --use-mailmap] 6.47user 0.30system 0:06.78elapsed 99%CPU (0avgtext+0avgdata 2006288maxresident)k 0inputs+0outputs (0major+137692minor)pagefaults 0swaps which incurs about 20% overhead. The command is doing extra work, so the extra cost may be justified. But it is inexcusable to pay the cost when we do not need author/committer match. In the same repository, $ git log --grep='fix menuconfig on debian lenny' --pretty=short >/dev/null shows very similar numbers as the above: [without --use-mailmap] 5.32user 0.30system 0:05.63elapsed 99%CPU (0avgtext+0avgdata 2005984maxresident)k 0inputs+0outputs (0major+137672minor)pagefaults 0swaps [with --use-mailmap] 6.64user 0.24system 0:06.89elapsed 99%CPU (0avgtext+0avgdata 2006320maxresident)k 0inputs+0outputs (0major+137694minor)pagefaults 0swaps The latter case is an unnecessary performance regression. We may want to _show_ the result with mailmap applied, but we do not have to copy and rewrite the author/committer of all commits we try to match if we do not query for these fields. Trivially optimize this performace regression by limiting the rewrites for only when we are matching with author/committer fields. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-01-10 12:33:09 -08:00
Antoine Pelisse	d72fbe8111	log: grep author/committer using mailmap Currently you can use mailmap to display log authors and committers but you can't use the mailmap to find commits with mapped values. This commit allows you to run: git log --use-mailmap --author mapped_name_or_email git log --use-mailmap --committer mapped_name_or_email Of course it only works if the --use-mailmap option is used. The new name and email are copied only when necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-01-10 12:33:08 -08:00
Junio C Hamano	4ad4fce63a	Merge branch 'jc/prettier-pretty-note' Emit the notes attached to the commit in "format-patch --notes" output after three-dashes. * jc/prettier-pretty-note: format-patch: add a blank line between notes and diffstat Doc User-Manual: Patch cover letter, three dashes, and --notes Doc format-patch: clarify --notes use case Doc notes: Include the format-patch --notes option Doc SubmittingPatches: Mention --notes option after "cover letter" Documentation: decribe format-patch --notes format-patch --notes: show notes after three-dashes format-patch: append --signature after notes pretty_print_commit(): do not append notes message pretty: prepare notes message at a centralized place format_note(): simplify API pretty: remove reencode_commit_message()	2012-11-15 10:25:05 -08:00
Junio C Hamano	76141e2e62	format_note(): simplify API We either stuff the notes message without modification for %N userformat, or format it for human consumption. Using two bits is an overkill that does not benefit anybody. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-10-17 22:42:40 -07:00
Junio C Hamano	727b6fc3ed	log --grep: accept --basic-regexp and --perl-regexp When we added the "--perl-regexp" option (or "-P") to "git grep", we should have done the same for the commands in the "git log" family, but somehow we forgot to do so. This corrects it, but we will reserve the short-and-sweet "-P" option for something else for now. Also introduce the "--basic-regexp" option for completeness, so that the "last one wins" principle can be used to defeat an earlier -E option, e.g. "git log -E --basic-regexp --grep='<bre>'". Note that it cannot have the short "-G" option as the option is to grep in the patch text in the context of "log" family. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-10-09 23:21:30 -07:00
Junio C Hamano	34a4ae55b2	log --grep: use the same helper to set -E/-F options as "git grep" The command line option parser for "git log -F -E --grep='<ere>'" did not flip the "fixed" bit, violating the general "last option wins" principle among conflicting options. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-10-09 23:21:29 -07:00
Junio C Hamano	918d4e1c90	revisions: initialize revs->grep_filter using grep_init() Instead of using the hand-rolled initialization sequence, use grep_init() to populate the necessary bits. This opens the door to allow the calling commands to optionally read grep.* configuration variables via git_config() if they want to. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-10-09 23:21:29 -07:00
Junio C Hamano	31d69db340	Merge branch 'jc/maint-log-grep-all-match-1' into maint * jc/maint-log-grep-all-match-1: grep.c: make two symbols really file-scope static this time t7810-grep: test --all-match with multiple --grep and --author options t7810-grep: test interaction of multiple --grep and --author options t7810-grep: test multiple --author with --all-match t7810-grep: test multiple --grep with and without --all-match t7810-grep: bring log --grep tests in common form grep.c: mark private file-scope symbols as static log: document use of multiple commit limiting options log --grep/--author: honor --all-match honored for multiple --grep patterns grep: show --debug output only once grep: teach --debug option to dump the parse tree	2012-09-29 22:30:56 -07:00
Nguyễn Thái Ngọc Duy	38cfe915bf	revision: make --grep search in notes too if shown Notes are shown after commit body. From user perspective it looks pretty much like commit body and they may assume --grep would search in that part too. Make it so. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-09-29 12:15:05 -07:00
Junio C Hamano	baa6378ff2	log --grep-reflog: reject the option without -g Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-09-29 12:07:04 -07:00
Nguyễn Thái Ngọc Duy	72fd13f71c	revision: add --grep-reflog to filter commits by reflog messages Similar to --author/--committer which filters commits by author and committer header fields. --grep-reflog adds a fake "reflog" header to commit and a grep filter to search on that line. All rules to --author/--committer apply except no timestamp stripping. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-09-29 11:41:14 -07:00
Junio C Hamano	3d7535e424	Merge branch 'jc/maint-log-grep-all-match' Fix a long-standing bug in "git log --grep" when multiple "--grep" are used together with "--all-match" and "--author" or "--committer". * jc/maint-log-grep-all-match: t7810-grep: test --all-match with multiple --grep and --author options t7810-grep: test interaction of multiple --grep and --author options t7810-grep: test multiple --author with --all-match t7810-grep: test multiple --grep with and without --all-match t7810-grep: bring log --grep tests in common form grep.c: mark private file-scope symbols as static log: document use of multiple commit limiting options log --grep/--author: honor --all-match honored for multiple --grep patterns grep: show --debug output only once grep: teach --debug option to dump the parse tree	2012-09-18 14:37:54 -07:00
Junio C Hamano	78ed88d80a	Merge branch 'mz/cherry-pick-cmdline-order' into maint * mz/cherry-pick-cmdline-order: cherry-pick/revert: respect order of revisions to pick demonstrate broken 'git cherry-pick three one two' teach log --no-walk=unsorted, which avoids sorting	2012-09-14 21:24:18 -07:00
Junio C Hamano	17bf35a3c7	grep: teach --debug option to dump the parse tree Our "grep" allows complex boolean expressions to be formed to match each individual line with operators like --and, '(', ')' and --not. Introduce the "--debug" option to show the parse tree to help people who want to debug and enhance it. Also "log" learns "--grep-debug" option to do the same. The command line parser to the log family is a lot more limited than the general "git grep" parser, but it has special handling for header matching (e.g. "--author"), and a parse tree is valuable when working on it. Note that "--all-match" is not any individual node in the parse tree. It is an instruction to the evaluator to check all the nodes in the top-level backbone have matched and reject a document as non-matching otherwise. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-09-14 10:10:35 -07:00
Junio C Hamano	3503e9ab32	Merge branch 'maint-1.7.11' into maint	2012-09-12 14:08:05 -07:00
Junio C Hamano	eaff724bbc	Merge branch 'jc/dotdot-is-parent-directory' into maint-1.7.11 "git log .." errored out saying it is both rev range and a path when there is no disambiguating "--" is on the command line. Update the command line parser to interpret ".." as a path in such a case. * jc/dotdot-is-parent-directory: specifying ranges: we did not mean to make ".." an empty set	2012-09-12 14:00:34 -07:00
Junio C Hamano	1c88a6d174	Sync with 1.7.11.6 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-09-11 11:23:54 -07:00
Junio C Hamano	738c218760	Merge branch 'tr/void-diff-setup-done' into maint-1.7.11 * tr/void-diff-setup-done: diff_setup_done(): return void	2012-09-11 10:53:40 -07:00
Junio C Hamano	c2b927932d	Merge branch 'mz/cherry-pick-cmdline-order' "git cherry-pick A C B" used to replay changes in A and then B and then C if these three commits had committer timestamps in that order, which is not what the user who said "A C B" naturally expects. * mz/cherry-pick-cmdline-order: cherry-pick/revert: respect order of revisions to pick demonstrate broken 'git cherry-pick three one two' teach log --no-walk=unsorted, which avoids sorting	2012-09-10 15:42:55 -07:00
Junio C Hamano	e3f26752b5	Merge branch 'maint-1.7.11' into maint * maint-1.7.11: Almost 1.7.11.6 gitweb: URL-decode $my_url/$my_uri when stripping PATH_INFO rebase -i: use full onto sha1 in reflog sh-setup: protect from exported IFS receive-pack: do not leak output from auto-gc to standard output t/t5400: demonstrate breakage caused by informational message from prune setup: clarify error messages for file/revisions ambiguity send-email: improve RFC2047 quote parsing fsck: detect null sha1 in tree entries do not write null sha1s to on-disk index diff: do not use null sha1 as a sentinel value	2012-09-10 15:31:06 -07:00
Junio C Hamano	03adeeaad6	Merge branch 'jk/maint-null-in-trees' into maint-1.7.11 "git diff" had a confusion between taking data from a path in the working tree and taking data from an object that happens to have name 0{40} recorded in a tree. * jk/maint-null-in-trees: fsck: detect null sha1 in tree entries do not write null sha1s to on-disk index diff: do not use null sha1 as a sentinel value	2012-09-10 15:24:54 -07:00
Junio C Hamano	7764a3b35c	Merge branch 'jc/dotdot-is-parent-directory' "git log .." errored out saying it is both rev range and a path when there is no disambiguating "--" is on the command line. Update the command line parser to interpret ".." as a path in such a case. * jc/dotdot-is-parent-directory: specifying ranges: we did not mean to make ".." an empty set	2012-09-07 11:09:18 -07:00
Martin von Zweigbergk	ca92e59e30	teach log --no-walk=unsorted, which avoids sorting When 'git log' is passed the --no-walk option, no revision walk takes place, naturally. Perhaps somewhat surprisingly, however, the provided revisions still get sorted by commit date. So e.g 'git log --no-walk HEAD HEAD~1' and 'git log --no-walk HEAD~1 HEAD' give the same result (unless the two revisions share the commit date, in which case they will retain the order given on the command line). As the commit that introduced --no-walk (`8e64006` (Teach revision machinery about --no-walk, 2007-07-24)) points out, the sorting is intentional, to allow things like git log --abbrev-commit --pretty=oneline --decorate --all --no-walk to show all refs in order by commit date. But there are also other cases where the sorting is not wanted, such as <command producing revisions in order> \| git log --oneline --no-walk --stdin To accomodate both cases, leave the decision of whether or not to sort up to the caller, by allowing --no-walk={sorted,unsorted}, defaulting to 'sorted' for backward-compatibility reasons. Signed-off-by: Martin von Zweigbergk <martinvonz@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-08-30 12:26:50 -07:00
Junio C Hamano	3b753148b6	Merge branch 'jk/maint-null-in-trees' We do not want a link to 0{40} object stored anywhere in our objects. * jk/maint-null-in-trees: fsck: detect null sha1 in tree entries do not write null sha1s to on-disk index diff: do not use null sha1 as a sentinel value	2012-08-27 11:54:28 -07:00
Junio C Hamano	003c84f6d2	specifying ranges: we did not mean to make ".." an empty set Either end of revision range operator can be omitted to default to HEAD, as in "origin.." (what did I do since I forked) or "..origin" (what did they do since I forked). But the current parser interprets ".." as an empty range "HEAD..HEAD", and worse yet, because ".." does exist on the filesystem, we get this annoying output: $ cd Documentation/howto $ git log .. ;# give me recent commits that touch Documentation/ area. fatal: ambiguous argument '..': both revision and filename Use '--' to separate filenames from revisions Surely we could say "git log ../" or even "git log -- .." to disambiguate, but we shouldn't have to. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-08-23 14:37:49 -07:00
Junio C Hamano	9cd33bbc52	Merge branch 'tr/void-diff-setup-done' Remove unnecessary code. * tr/void-diff-setup-done: diff_setup_done(): return void	2012-08-22 11:52:27 -07:00
Thomas Rast	28452655af	diff_setup_done(): return void diff_setup_done() has historically returned an error code, but lost the last nonzero return in `943d5b7` (allow diff.renamelimit to be set regardless of -M/-C, 2006-08-09). The callers were in a pretty confused state: some actually checked for the return code, and some did not. Let it return void, and patch all callers to take this into account. This conveniently also gets rid of a handful of different(!) error messages that could never be triggered anyway. Note that the function can still die(). Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-08-03 12:11:07 -07:00
Jeff King	e54501004a	diff: do not use null sha1 as a sentinel value The diff code represents paths using the diff_filespec struct. This struct has a sha1 to represent the sha1 of the content at that path, as well as a sha1_valid member which indicates whether its sha1 field is actually useful. If sha1_valid is not true, then the filespec represents a working tree file (e.g., for the no-index case, or for when the index is not up-to-date). The diff_filespec is only used internally, though. At the interfaces to the diff subsystem, callers feed the sha1 directly, and we create a diff_filespec from it. It's at that point that we look at the sha1 and decide whether it is valid or not; callers may pass the null sha1 as a sentinel value to indicate that it is not. We should not typically see the null sha1 coming from any other source (e.g., in the index itself, or from a tree). However, a corrupt tree might have a null sha1, which would cause "diff --patch" to accidentally diff the working tree version of a file instead of treating it as a blob. This patch extends the edges of the diff interface to accept a "sha1_valid" flag whenever we accept a sha1, and to use that flag when creating a filespec. In some cases, this means passing the flag through several layers, making the code change larger than would be desirable. One alternative would be to simply die() upon seeing corrupted trees with null sha1s. However, this fix more directly addresses the problem (while bogus sha1s in a tree are probably a bad thing, it is really the sentinel confusion sending us down the wrong code path that is what makes it devastating). And it means that git is more capable of examining and debugging these corrupted trees. For example, you can still "diff --raw" such a tree to find out when the bogus entry was introduced; you just cannot do a "--patch" diff (just as you could not with any other corrupted tree, as we do not have any content to diff). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-07-29 15:04:32 -07:00
Junio C Hamano	d05e56ea67	Merge branch 'jk/revision-walk-stop-at-max-count' "git log -n 1 -- rarely-touched-path" was spending unnecessary cycles after showing the first change to find the next one, only to discard it. * jk/revision-walk-stop-at-max-count: revision: avoid work after --max-count is reached	2012-07-22 12:56:30 -07:00
Junio C Hamano	0958a24d73	Merge branch 'jc/sha1-name-more' Teaches the object name parser things like a "git describe" output is always a commit object, "A" in "git log A" must be a committish, and "A" and "B" in "git log A...B" both must be committish, etc., to prolong the lifetime of abbreviated object names. * jc/sha1-name-more: (27 commits) t1512: match the "other" object names t1512: ignore whitespaces in wc -l output rev-parse --disambiguate=<prefix> rev-parse: A and B in "rev-parse A..B" refer to committish reset: the command takes committish commit-tree: the command wants a tree and commits apply: --build-fake-ancestor expects blobs sha1_name.c: add support for disambiguating other types revision.c: the "log" family, except for "show", takes committish revision.c: allow handle_revision_arg() to take other flags sha1_name.c: introduce get_sha1_committish() sha1_name.c: teach lookup context to get_sha1_with_context() sha1_name.c: many short names can only be committish sha1_name.c: get_sha1_1() takes lookup flags sha1_name.c: get_describe_name() by definition groks only commits sha1_name.c: teach get_short_sha1() a commit-only option sha1_name.c: allow get_short_sha1() to take other flags get_sha1(): fix error status regression sha1_name.c: restructure disambiguation of short names sha1_name.c: correct misnamed "canonical" and "res" ...	2012-07-22 12:55:07 -07:00
Jeff King	b72a1904ae	revision: avoid work after --max-count is reached During a revision traversal in which --max-count has been specified, we decrement a counter for each revision returned by get_revision. When it hits 0, we typically return NULL (the exception being if we still have boundary commits to show). However, before we check the counter, we call get_revision_1 to get the next commit. This might involve looking at a large number of commits if we have restricted the traversal (e.g., we might traverse until we find the next commit whose diff actually matches a pathspec). There's no need to make this get_revision_1 call when our counter runs out. If we are not in --boundary mode, we will just throw away the result and immediately return NULL. If we are in --boundary mode, then we will still throw away the result, and then start showing the boundary commits. However, as git_revision_1 does not impact the boundary list, it should not have an impact. In most cases, avoiding this work will not be especially noticeable. However, in some cases, it can make a big difference: [before] $ time git rev-list -1 origin Documentation/RelNotes/1.7.11.2.txt `8d141a1d56` real 0m0.301s user 0m0.280s sys 0m0.016s [after] $ time git rev-list -1 origin Documentation/RelNotes/1.7.11.2.txt `8d141a1d56` real 0m0.010s user 0m0.008s sys 0m0.000s Note that the output is produced almost instantaneously in the first case, and then git uselessly spends a long time looking for the next commit to touch that file (but there isn't one, and we traverse all the way down to the roots). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-07-13 13:51:29 -07:00
Junio C Hamano	c8382c1500	Merge branch 'jc/rev-list-simplify-merges-first-parent' into maint When "git log" gets "--simplify-merges/by-decoration" together with "--first-parent", the combination of these options makes the simplification logic to use in-core commit objects that haven't been examined for relevance, either producing incorrect result or taking too long to produce any output. Teach the simplification logic to ignore commits that the first-parent traversal logic ignored when both are in effect to work around the issue. * jc/rev-list-simplify-merges-first-parent: revision: ignore side parents while running simplify-merges revision: note the lack of free() in simplify_merges() revision: "simplify" options imply topo-order sort	2012-07-11 12:46:57 -07:00
Junio C Hamano	9ca724933a	Merge branch 'mm/verify-filename-fix' into maint "git diff COPYING HEAD:COPYING" gave a nonsense error message that claimed that the treeish HEAD did not have COPYING in it. * mm/verify-filename-fix: verify_filename(): ask the caller to chose the kind of diagnosis sha1_name: do not trigger detailed diagnosis for file arguments	2012-07-11 12:45:49 -07:00
Junio C Hamano	d5f6b1d756	revision.c: the "log" family, except for "show", takes committish Add a field to setup_revision_opt structure and allow these callers to tell the setup_revisions command parsing machinery that short SHA1 it encounters are meant to name committish. This step does not go all the way to connect the setup_revisions() to sha1_name.c yet. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-07-09 16:42:22 -07:00
Junio C Hamano	8e676e8ba5	revision.c: allow handle_revision_arg() to take other flags The existing "cant_be_filename" that tells the function that the caller knows the arg is not a path (hence it does not have to be checked for absense of the file whose name matches it) is made into a bit in the flag word. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-07-09 16:42:22 -07:00
Junio C Hamano	cd74e4733d	sha1_name.c: introduce get_sha1_committish() Many callers know that the user meant to name a committish by syntactical positions where the object name appears. Calling this function allows the machinery to disambiguate shorter-than-unique abbreviated object names between committish and others. Note that this does NOT error out when the named object is not a committish. It is merely to give a hint to the disambiguation machinery. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-07-09 16:42:22 -07:00
Junio C Hamano	33bd598c39	sha1_name.c: teach lookup context to get_sha1_with_context() The function takes user input string and returns the object name (binary SHA-1) with mode bits and path when the object was looked up in a tree. Additionally give hints to help disambiguation of abbreviated object names when the caller knows what it is looking for. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-07-09 16:42:22 -07:00
Junio C Hamano	249c8f4a16	sha1_name.c: get rid of get_sha1_with_mode() There are only two callers, and they will benefit from being able to pass disambiguation hints to underlying get_sha1_with_context() API once it happens. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-07-03 10:24:11 -07:00
Junio C Hamano	331512f988	Merge branch 'jc/rev-list-simplify-merges-first-parent' When "--simplify-merges/by-decoration" is given together with "--first-parent" to "git log", the combination of these options makes the simplification logic to use in-core commit objects that haven't been examined for relevance, either producing incorrect result or taking too long to produce any output. Teach the simplification logic to ignore commits that the first-parent traversal logic ignored when both are in effect to work around the issue.	2012-06-28 15:20:16 -07:00
Junio C Hamano	08080894b7	Merge branch 'mm/verify-filename-fix' "git diff COPYING HEAD:COPYING" gave a nonsense error message that claimed that the treeish HEAD did not have COPYING in it.	2012-06-28 15:19:32 -07:00
Matthieu Moy	023e37c377	verify_filename(): ask the caller to chose the kind of diagnosis verify_filename() can be called in two different contexts. Either we just tried to interpret a string as an object name, and it fails, so we try looking for a working tree file (i.e. we finished looking at revs that come earlier on the command line, and the next argument must be a pathname), or we _know_ that we are looking for a pathname, and shouldn't even try interpreting the string as an object name. For example, with this change, we get: $ git log COPYING HEAD:inexistant fatal: HEAD:inexistant: no such path in the working tree. Use '-- <path>...' to specify paths that do not exist locally. $ git log HEAD:inexistant fatal: Path 'inexistant' does not exist in 'HEAD' Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-06-18 15:21:42 -07:00
Junio C Hamano	6e513ba3a6	revision: ignore side parents while running simplify-merges The simplify_merges() function needs to look at all history chain to find the closest ancestor that is relevant after the simplification, but after --first-parent traversal, side parents haven't been marked for relevance (they are irrelevant by definition due to the nature of first-parent-only traversal) nor culled from the parents list of resulting commits. We cannot simply remove these side parents from the parents list, as the output phase still wants to see the parents. Instead, teach simplify_one() and its callees to ignore the later parents. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-06-13 14:04:33 -07:00
Junio C Hamano	ab9d75a8d7	revision: note the lack of free() in simplify_merges() Among the three similar-looking loops that walk singly linked commit_list, the first one is only peeking and the same list is later used for real work. Leave a comment not to mistakenly free its elements there. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-06-08 15:44:38 -07:00
Junio C Hamano	a52f007113	revision: "simplify" options imply topo-order sort The code internally runs sort_in_topo_order() already; it is more clear to spell it out in the option parsing phase, instead of adding a special case in simplify_merges() function.	2012-06-08 14:47:08 -07:00
Junio C Hamano	6a8989709e	Merge branch 'rs/commit-list-append' There is no need for "commit_list_reverse()" function that only invites inefficient code. By René Scharfe * rs/commit-list-append: commit: remove commit_list_reverse() revision: append to list instead of insert and reverse sequencer: export commit_list_append()	2012-04-29 17:51:30 -07:00
Junio C Hamano	0fe59d7686	Merge branch 'cb/cherry-pick-rev-path-confusion' The command line parser choked "git cherry-pick $name" when $name can be both revision name and a pathname, even though $name can never be a path in the context of the command. The issue the patch addresses is real, but the way it is implemented felt unnecessarily invasive a bit. It may be cleaner for this caller to add the "--" to the end of the argv_array it passes to setup_revisions(). By Clemens Buchacher * cb/cherry-pick-rev-path-confusion: cherry-pick: do not expect file arguments	2012-04-27 13:58:02 -07:00
René Scharfe	2e7da8e9f4	revision: append to list instead of insert and reverse By using commit_list_insert(), we added new items to the top of the list and, since this is not the order we want, reversed it afterwards. Simplify this process by adding new items at the bottom instead, getting rid of the reversal step. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-04-25 14:51:19 -07:00
Junio C Hamano	419f2ecf78	Merge branch 'hv/submodule-recurse-push' "git push --recurse-submodules" learns to optionally look into the histories of submodules bound to the superproject and push them out. By Heiko Voigt * hv/submodule-recurse-push: push: teach --recurse-submodules the on-demand option Refactor submodule push check to use string list instead of integer Teach revision walking machinery to walk multiple times sequencially	2012-04-24 14:40:20 -07:00
Junio C Hamano	ba8e6326f1	Merge branch 'rs/commit-list-sort-in-batch' Setting up a revision traversal with many starting points was inefficient as these were placed in a date-order priority queue one-by-one. By René Scharfe (3) and Junio C Hamano (1) * rs/commit-list-sort-in-batch: mergesort: rename it to llist_mergesort() revision: insert unsorted, then sort in prepare_revision_walk() commit: use mergesort() in commit_list_sort_by_date() add mergesort() for linked lists	2012-04-23 12:52:55 -07:00
Clemens Buchacher	6d5b93f29f	cherry-pick: do not expect file arguments If a commit-ish passed to cherry-pick or revert happens to have a file of the same name, git complains that the argument is ambiguous and advises to use '--'. To make things worse, the '--' argument is removed by parse_options, und so passing '--' has no effect. Instead, always interpret cherry-pick/revert arguments as revisions. Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-04-15 13:33:31 -07:00
René Scharfe	fbc08ea177	revision: insert unsorted, then sort in prepare_revision_walk() Speed up prepare_revision_walk() by adding commits without sorting to the commit_list and at the end sort the list in one go. Thanks to mergesort() working behind the scenes, this is a lot faster for large numbers of commits than the current insert sort. Also introduce and use commit_list_reverse(), to keep the ordering of commits sharing the same commit date unchanged. That's because commit_list_insert_by_date() sorts commits with descending date, but adds later entries with the same date entries last, while commit_list_insert() always inserts entries at the top. The following commit_list_sort_by_date() keeps the order of entries sharing the same date. Jeff's test case, in a repo with lots of refs, was to run: # make a new commit on top of HEAD, but not yet referenced sha1=`git commit-tree HEAD^{tree} -p HEAD </dev/null` # now do the same "connected" test that receive-pack would do git rev-list --objects $sha1 --not --all With a git.git with a ref for each revision, master needs (best of five): real 0m2.210s user 0m2.188s sys 0m0.016s And with this patch: real 0m0.480s user 0m0.456s sys 0m0.020s Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-04-11 08:50:54 -07:00
Heiko Voigt	bcc0a3ea38	Teach revision walking machinery to walk multiple times sequencially Previously it was not possible to iterate revisions twice using the revision walking api. We add a reset_revision_walk() which clears the used flags. This allows us to do multiple sequencial revision walks. We add the appropriate calls to the existing submodule machinery doing revision walks. This is done to avoid surprises if future code wants to call these functions more than once during the processes lifetime. Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-03-30 08:57:49 -07:00
Junio C Hamano	1e4d0875ac	Merge branch 'jc/pickaxe-ignore-case' By Junio C Hamano (2) and Ramsay Jones (1) * jc/pickaxe-ignore-case: ctype.c: Fix a sparse warning pickaxe: allow -i to search in patch case-insensitively grep: use static trans-case table	2012-03-07 12:12:59 -08:00
Junio C Hamano	accccde483	pickaxe: allow -i to search in patch case-insensitively "git log -S<string>" is a useful way to find the last commit in the codebase that touched the <string>. As it was designed to be used by a porcelain script to dig the history starting from a block of text that appear in the starting commit, it never had to look for anything but an exact match. When used by an end user who wants to look for the last commit that removed a string (e.g. name of a variable) that he vaguely remembers, however, it is useful to support case insensitive match. When given the "--regexp-ignore-case" (or "-i") option, which originally was designed to affect case sensitivity of the search done in the commit log part, e.g. "log --grep", the matches made with -S/-G pickaxe search is done case insensitively now. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-02-28 16:15:29 -08:00
Junio C Hamano	6f61eb2017	Merge branch 'jk/grep-binary-attribute' into maint * jk/grep-binary-attribute: grep: pre-load userdiff drivers when threaded grep: load file data after checking binary-ness grep: respect diff attributes for binary-ness grep: cache userdiff_driver in grep_source grep: drop grep_buffer's "name" parameter convert git-grep to use grep_source interface grep: refactor the concept of "grep source" into an object grep: move sha1-reading mutex into low-level code grep: make locking flag global	2012-02-21 14:57:05 -08:00
Junio C Hamano	10439fc0ef	Merge branch 'jk/grep-binary-attribute' * jk/grep-binary-attribute: grep: pre-load userdiff drivers when threaded grep: load file data after checking binary-ness grep: respect diff attributes for binary-ness grep: cache userdiff_driver in grep_source grep: drop grep_buffer's "name" parameter convert git-grep to use grep_source interface grep: refactor the concept of "grep source" into an object grep: move sha1-reading mutex into low-level code grep: make locking flag global	2012-02-14 12:57:18 -08:00
Junio C Hamano	e27d620e91	Merge branch 'jc/maint-log-first-parent-pathspec' into maint * jc/maint-log-first-parent-pathspec: Making pathspec limited log play nicer with --first-parent	2012-02-05 23:58:42 -08:00
Jeff King	c876d6da88	grep: drop grep_buffer's "name" parameter Before the grep_source interface existed, grep_buffer was used by two types of callers: 1. Ones which pulled a file into a buffer, and then wanted to supply the file's name for the output (i.e., git grep). 2. Ones which really just wanted to grep a buffer (i.e., git log --grep). Callers in set (1) should now be using grep_source. Callers in set (2) always pass NULL for the "name" parameter of grep_buffer. We can therefore get rid of this now-useless parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-02-02 10:36:08 -08:00
Junio C Hamano	5a304dd303	Merge branch 'nd/index-pack-no-recurse' * nd/index-pack-no-recurse: index-pack: eliminate unlimited recursion in get_base_data() index-pack: eliminate recursion in find_unresolved_deltas Eliminate recursion in setting/clearing marks in commit list	2012-01-29 13:18:56 -08:00
Junio C Hamano	731218c18f	Merge branch 'jc/maint-log-first-parent-pathspec' * jc/maint-log-first-parent-pathspec: Making pathspec limited log play nicer with --first-parent	2012-01-29 13:18:54 -08:00
Junio C Hamano	36ed1913e1	Making pathspec limited log play nicer with --first-parent In a topic branch workflow, you often want to find the latest commit that merged a side branch that touched a particular area of the system, so that a new topic branch to work on that area can be forked from that commit. For example, I wanted to find an appropriate fork-point to queue Luke's changes related to git-p4 in contrib/fast-import/. "git log --first-parent" traverses the first-parent chain, and "-m --stat" shows the list of paths touched by commits including merge commits. We could ask the question this way: # What is the latest commit that touched that path? $ git log --first-parent --oneline -m --stat master \| sed -e '/^ contrib\/fast-import\/git-p4 /q' \| tail The above finds that `8cbfc11` (Merge branch 'pw/p4-view-updates', 2012-01-06) was such a commit. But a more natural way to spell this question is this: $ git log --first-parent --oneline -m --stat -1 master -- \ contrib/fast-import/git-p4 Unfortunately, this does not work. It finds `ecb7cf9` (git-p4: rewrite view handling, 2012-01-02). This commit is a part of the merged topic branch and is _not_ on the first-parent path from the 'master': $ git show-branch `8cbfc11` `ecb7cf9` ! [`8cbfc11`] Merge branch 'pw/p4-view-updates' ! [`ecb7cf9`] git-p4: rewrite view handling -- - [`8cbfc11`] Merge branch 'pw/p4-view-updates' + [8cbfc11^2] git-p4: view spec documentation ++ [`ecb7cf9`] git-p4: rewrite view handling The problem is caused by the merge simplification logic when it inspects the merge commit `8cbfc11`. In this case, the history leading to the tip of 'master' did not touch git-p4 since 'pw/p4-view-updates' topic forked, and the result of the merge is simply a copy from the tip of the topic branch in the view limited by the given pathspec. The merge simplification logic discards the history on the mainline side of the merge, and pretends as if the sole parent of the merge is its second parent, i.e. the tip of the topic. While this simplification is correct in the general case, it is at least surprising if not outright wrong when the user explicitly asked to show the first-parent history. Here is an attempt to fix this issue, by not allowing us to compare the merge result with anything but the first parent when --first-parent is in effect, to avoid the history traversal veering off to the side branch. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-01-19 16:18:27 -08:00

... 3 4 5 6 7 ...

795 Commits