git-commit-vandalism

Author	SHA1	Message	Date
Jeff King	7d5c960bf6	strbuf_check_ref_format(): expand only local branches This function asks strbuf_branchname() to expand any @-marks in the branchname, and then we blindly stick refs/heads/ in front of the result. This is obviously nonsense if the expansion is "HEAD" or a ref in refs/remotes/. The most obvious end-user effect is that creating or renaming a branch with an expansion may have confusing results (e.g., creating refs/heads/origin/master from "@{upstream}" when the operation should be disallowed). We can fix this by telling strbuf_branchname() that we are only interested in local expansions. Any unexpanded bits are then fed to check_ref_format(), which either disallows them (in the case of "@{upstream}") or lets them through ("refs/heads/@" is technically valid, if a bit silly). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-02 11:05:04 -08:00
Jeff King	0e9f62dab9	interpret_branch_name: allow callers to restrict expansions The interpret_branch_name() function converts names like @{-1} and @{upstream} into branch names. The expanded ref names are not fully qualified, and may be outside of the refs/heads/ namespace (e.g., "@" expands to "HEAD", and "@{upstream}" is likely to be in "refs/remotes/"). This is OK for callers like dwim_ref() which are primarily interested in resolving the resulting name, no matter where it is. But callers like "git branch" treat the result as a branch name in refs/heads/. When we expand to a ref outside that namespace, the results are very confusing (e.g., "git branch @" tries to create refs/heads/HEAD, which is nonsense). Callers can't know from the returned string how the expansion happened (e.g., did the user really ask for a branch named "HEAD", or did we do a bogus expansion?). One fix would be to return some out-parameters describing the types of expansion that occurred. This has the benefit that the caller can generate precise error messages ("I understood @{upstream} to mean origin/master, but that is a remote tracking branch, so you cannot create it as a local name"). However, out-parameters make the function interface somewhat cumbersome. Instead, let's do the opposite: let the caller tell us which elements to expand. That's easier to pass in, and none of the callers give more precise error messages than "@{upstream} isn't a valid branch name" anyway (which should be sufficient). The strbuf_branchname() function needs a similar parameter, as most of the callers access interpret_branch_name() through it. We can break the callers down into two groups: 1. Callers that are happy with any kind of ref in the result. We pass "0" here, so they continue to work without restrictions. This includes merge_name(), the reflog handling in add_pending_object_with_path(), and substitute_branch_name(). This last is what powers dwim_ref(). 2. Callers that have funny corner cases (mostly in git-branch and git-checkout). These need to make use of the new parameter, but I've left them as "0" in this patch, and will address them individually in follow-on patches. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-02 11:05:04 -08:00
Jeff King	311fc74826	strbuf_branchname: drop return value The return value from strbuf_branchname() is confusing and useless: it's 0 if the whole name was consumed by an @-mark, but otherwise is the length of the original name we fed. No callers actually look at the return value, so let's just get rid of it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-02 11:05:04 -08:00
Jeff King	e322b60d65	interpret_branch_name: move docstring to header file We generally put docstrings with function declarations, because it's the callers who need to know how the function works. Let's do so for interpret_branch_name(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-02 11:05:03 -08:00
Jeff King	13228c30a6	interpret_branch_name(): handle auto-namelen for @{-1} The interpret_branch_name() function takes a ptr/len pair for the name, but you can pass "0" for "namelen", which will cause it to check the length with strlen(). However, before we do that auto-namelen magic, we call interpret_nth_prior_checkout(), which gets fed the bogus "0". This was broken by `8cd4249c4` (interpret_branch_name: always respect "namelen" parameter, 2014-01-15). Though to be fair to that commit, it was broken in the _opposite_ direction before, where we would always treat "name" as a string even if a length was passed. You can see the bug with "git log -g @{-1}". That code path always passes "0", and without this patch it cannot figure out which branch's reflog to show. We can fix it by a small reordering of the code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-02 11:04:57 -08:00
brian m. carlson	9461d27240	refs: convert each_reflog_ent_fn to struct object_id Make each_reflog_ent_fn take two struct object_id pointers instead of two pointers to unsigned char. Convert the various callbacks to use struct object_id as well. Also, rename fsck_handle_reflog_sha1 to fsck_handle_reflog_oid. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-02-22 10:12:15 -08:00
Junio C Hamano	e828d33316	Merge branch 'jk/no-looking-at-dotgit-outside-repo' A small code cleanup. * jk/no-looking-at-dotgit-outside-repo: sha1_name: make wraparound of the index into ring-buffer explicit	2016-11-01 12:58:49 -07:00
René Scharfe	3e98919a18	sha1_name: make wraparound of the index into ring-buffer explicit Overflow is defined for unsigned integers, but not for signed ones. Wrap around explicitly for the new ring-buffer in find_unique_abbrev() as we did in `bb84735c` for the ones in sha1_to_hex() and get_pathname(), thus avoiding signed overflows and getting rid of the magic number 3. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-11-01 10:56:39 -07:00
Junio C Hamano	0d9c527d59	Merge branch 'jk/no-looking-at-dotgit-outside-repo' Update "git diff --no-index" codepath not to try to peek into .git/ directory that happens to be under the current directory, when we know we are operating outside any repository. * jk/no-looking-at-dotgit-outside-repo: diff: handle sha1 abbreviations outside of repository diff_aligned_abbrev: use "struct oid" diff_unique_abbrev: rename to diff_aligned_abbrev find_unique_abbrev: use 4-buffer ring test-*-cache-tree: setup git dir read info/{attributes,exclude} only when in repository	2016-10-27 14:58:48 -07:00
Junio C Hamano	d7ae013a31	Merge branch 'jk/abbrev-auto' Updates the way approximate count of total objects is computed while attempting to come up with a unique abbreviated object name, which in turn needs to estimate how many hexdigits are necessary to ensure uniqueness. * jk/abbrev-auto: find_unique_abbrev: move logic out of get_short_sha1()	2016-10-27 14:58:47 -07:00
Junio C Hamano	580d820ece	Merge branch 'lt/abbrev-auto' Allow the default abbreviation length, which has historically been 7, to scale as the repository grows. The logic suggests to use 12 hexdigits for the Linux kernel, and 9 to 10 for Git itself. * lt/abbrev-auto: abbrev: auto size the default abbreviation abbrev: prepare for new world order abbrev: add FALLBACK_DEFAULT_ABBREV to prepare for auto sizing	2016-10-27 14:58:47 -07:00
Jeff King	ef2ed5013c	find_unique_abbrev: use 4-buffer ring Some code paths want to format multiple abbreviated sha1s in the same output line. Because we use a single static buffer for our return value, they have to either break their output into several calls or allocate their own arrays and use find_unique_abbrev_r(). Instead, let's mimic sha1_to_hex() and use a ring of several buffers, so that the return value stays valid through multiple calls. This shortens some of the callers, and makes it harder for them to make a silly mistake. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-26 13:30:51 -07:00
Junio C Hamano	dec040192f	Merge branch 'jk/alt-odb-cleanup' Codepaths involved in interacting with the alternate object store have been cleaned up. * jk/alt-odb-cleanup: alternates: use fspathcmp to detect duplicates sha1_file: always allow relative paths to alternates count-objects: report alternates via verbose mode fill_sha1_file: write into a strbuf alternates: store scratch buffer as strbuf fill_sha1_file: write "boring" characters alternates: use a separate scratch space alternates: encapsulate alt->base munging alternates: provide helper for allocating alternate alternates: provide helper for adding to alternates list link_alt_odb_entry: refactor string handling link_alt_odb_entry: handle normalize_path errors t5613: clarify "too deep" recursion tests t5613: do not chdir in main process t5613: whitespace/style cleanups t5613: use test_must_fail t5613: drop test_valid_repo function t5613: drop reachable_via function	2016-10-17 13:25:20 -07:00
Jeff King	38dbe5f078	alternates: store scratch buffer as strbuf We pre-size the scratch buffer to hold a loose object filename of the form "xx/yyyy...", which leads to allocation code that is hard to verify. We have to use some magic numbers during the initial allocation, and then writers must blindly assume that the buffer is big enough. Using a strbuf makes it more clear that we cannot overflow. Unfortunately, we do still need some magic numbers to grow our strbuf before calling fill_sha1_path(), but the strbuf growth is much closer to the point of use. This makes it easier to see that it's correct, and opens the possibility of pushing it even further down if fill_sha1_path() learns to work on strbufs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-10 13:52:36 -07:00
Jeff King	597f9134de	alternates: use a separate scratch space The alternate_object_database struct uses a single buffer both for storing the path to the alternate, and as a scratch buffer for forming object names. This is efficient (since otherwise we'd end up storing the path twice), but it makes life hard for callers who just want to know the path to the alternate. They have to remember to stop reading after "alt->name - alt->base" bytes, and to subtract one for the trailing '/'. It would be much simpler if they could simply access a NUL-terminated path string. We could encapsulate this in a function which puts a NUL in the scratch buffer and returns the string, but that opens up questions about the lifetime of the result. The first time another caller uses the alternate, the scratch buffer may get other data tacked onto it. Let's instead just store the root path separately from the scratch buffer. There aren't enough alternates being stored for the duplicated data to matter for performance, and this keeps things simple and safe for the callers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-10 13:52:36 -07:00
Jeff King	7f0fa2c02a	alternates: provide helper for allocating alternate Allocating a struct alternate_object_database is tricky, as we must over-allocate the buffer to provide scratch space, and then put in particular '/' and NUL markers. Let's encapsulate this in a function so that the complexity doesn't leak into callers (and so that we can modify it later). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-10 13:52:36 -07:00
Jeff King	8e3f52d778	find_unique_abbrev: move logic out of get_short_sha1() The get_short_sha1() is only about reading short sha1s; we do call it in a loop to check "is this long enough" for each object, but otherwise it should not need to know about things like our default_abbrev setting. So instead of asking it to set default_automatic_abbrev as a side-effect, let's just have find_unique_abbrev() pick the right place to start its loop. This requires a separate approximate_object_count() function, but that naturally belongs with the rest of sha1_file.c. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-03 21:03:14 -07:00
Linus Torvalds	e6c587c733	abbrev: auto size the default abbreviation In fairly early days we somehow decided to abbreviate object names down to 7-hexdigits, but as projects grow, it is becoming more and more likely that such short object names, made in earlier days and recorded in the log messages, are no longer unique. Currently the Linux kernel project needs 11 to 12 hexdigits, while Git itself needs 10 hexdigits to uniquely identify the objects they have, while many smaller projects may still be fine with the original 7-hexdigit default. One size does not fit all projects. Introduce a mechanism where we estimate the number of objects in the repository upon the first request to abbreviate an object name with the default setting and come up with a sane default for the repository. Based on the expectation that we would see collision in a repository with 2^(2N) objects when using object names shortened to first N bits, use sufficient number of hexdigits to cover the number of objects in the repository. Each hexdigit (4-bits) we add to the shortened name allows us to have four times (2-bits) as many objects in the repository. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-03 12:54:29 -07:00
Jeff King	5b33cb1fd7	get_short_sha1: make default disambiguation configurable When we find ambiguous short sha1s, we may get a disambiguation rule from our caller's context. But if we don't, we fall back to treating all sha1s the same, even though most projects will tend to refer only to commits by their short sha1s. This patch introduces a configuration option that lets the user pick a different fallback (e.g., only commits). It's possible that we may want to make this the default, but it's a good idea to start as a config option for two reasons: 1. It lets people experiment with this and see if it's a good idea (i.e., the "tend to" above is an assumption; we don't really know if this will break some obscure cases). 2. Even if we do flip the default, it gives people an escape hatch if it causes problems (you can sometimes override it by asking for "1234^{tree}", but not all combinations are possible). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-27 10:29:56 -07:00
Jeff King	1ffa26c461	get_short_sha1: list ambiguous objects on error When the user gives us an ambiguous short sha1, we print an error and refuse to resolve it. In some cases, the next step is for them to feed us more characters (e.g., if they were retyping or cut-and-pasting from a full sha1). But in other cases, that might be all they have. For example, an old commit message may have used a 7-character hex that was unique at the time, but is now ambiguous. Git doesn't provide any information about the ambiguous objects it found, so it's hard for the user to find out which one they probably meant. This patch teaches get_short_sha1() to list the sha1s of the objects it found, along with a few bits of information that may help the user decide which one they meant. Here's what it looks like on git.git: $ git rev-parse b2e1 error: short SHA1 b2e1 is ambiguous hint: The candidates are: hint: b2e1196 tag v2.8.0-rc1 hint: b2e11d1 tree hint: b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options' hint: b2e1759 blob hint: b2e18954 blob hint: b2e1895c blob fatal: ambiguous argument 'b2e1': unknown revision or path not in the working tree. Use '--' to separate paths from revisions, like this: 'git <command> [<revision>...] -- [<file>...]' We show the tagname for tags, and the date and subject for commits. For trees and blobs, in theory we could dig in the history to find the paths at which they were present. But that's very expensive (on the order of 30s for the kernel), and it's not likely to be all that helpful. Most short references are to commits, so the useful information is typically going to be that the object in question _isn't_ a commit. So it's silly to spend a lot of CPU preemptively digging up the path; the user can do it themselves if they really need to. And of course it's somewhat ironic that we abbreviate the sha1s in the disambiguation hint. But full sha1s would cause annoying line wrapping for the commit lines, and presumably the user is going to just re-issue their command immediately with the corrected sha1. We also restrict the list to those that match any disambiguation hint. E.g.: $ git rev-parse b2e1:foo error: short SHA1 b2e1 is ambiguous hint: The candidates are: hint: b2e1196 tag v2.8.0-rc1 hint: b2e11d1 tree hint: b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options' fatal: Invalid object name 'b2e1'. does not bother reporting the blobs, because they cannot work as a treeish. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-26 11:55:31 -07:00
Jeff King	fad6b9e590	for_each_abbrev: drop duplicate objects If an object appears multiple times in the object database (e.g., in both loose and packed form, or in two separate packs), the disambiguation machinery may see it more than once. The get_short_sha1() function handles this already, but for_each_abbrev() blindly fires the callback for each instance it finds. We can fix this by collecting the output in a sha1 array and de-duplicating it. As a bonus, the sort done for the de-duplication means that our output will be stable, regardless of the order in which the objects are found. Note that the old code normalized the callback's output to 0/1 to store in the 1-bit ds->ambiguous flag (which both halted the iteration and was returned from the for_each_abbrev function). Now that we are using sha1_array, we can return the real value. In practice, it doesn't matter as the sole caller only ever returns 0. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-26 11:46:41 -07:00
Jeff King	0c99171ad2	get_short_sha1: mark ambiguity error for translation This is a human-readable message, and there's no reason it should not be translated. While we're at it, let's drop the period from the end, which is not our usual style. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-26 11:46:41 -07:00
Jeff King	59e4e34f69	get_short_sha1: NUL-terminate hex prefix We store the hex prefix in a 40-byte buffer with the prefix itself followed by 40-minus-len "x" characters. These x's serve no purpose, and the lack of NUL termination makes the prefix string annoying to use. Let's just terminate it. Note that this is in contrast to the binary prefix, which _must_ be zero-padded, because we look at the whole thing during a binary search to find the first potential match in each pack index. The loose-object hex search cannot use the same trick because it has to do a linear walk through the unsorted results of readdir() (and even if it could, you'd want zeroes instead of x's). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-26 11:46:41 -07:00
Jeff King	0016043bf4	get_short_sha1: refactor init of disambiguation code The disambiguation machinery has two callers: get_short_sha1 and for_each_abbrev. Both need to repeat much of the same setup: declaring buffers, sanity-checking lengths, preparing the prefixes, etc. Let's pull that into a single init function so we can avoid repeating ourselves. Pulling the buffers into the "struct disambiguate_state" isn't strictly necessary, but it does make things simpler for the callers, who no longer have to worry about sizing them correctly (i.e., it's an implicit requirement that the caller provide 20- and 40-byte buffers). And while we're touching this code, we can convert any magic-number sizes to the more modern GIT_SHA1_* constants. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-26 11:46:39 -07:00
Jeff King	5d5def2aa5	get_short_sha1: parse tags when looking for treeish The treeish disambiguation function tries to peel tags, but it does so by calling: deref_tag(lookup_object(sha1), ...); This will only work if we have previously looked at the tag and created a "struct tag" for it. Since parsing revision arguments typically happens before anything else, this is usually not the case, and we would fail to peel the tag (we are lucky that deref_tag() gracefully handles the NULL and does not segfault). Instead, we can use parse_object(). Note that this is the same fix done by `94d75d1` (get_short_sha1(): correctly disambiguate type-limited abbreviation, 2013-07-01), but that commit fixed only the committish disambiguator, and left the bug in the treeish one. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-26 11:46:30 -07:00
Jeff King	8a10fea49b	get_sha1: propagate flags to child functions The get_sha1() function is actually implemented by many sub-functions, but we do not always pass our flags around to all of those functions. As a result, we may forget that our caller asked us to resolve with GET_SHA1_QUIETLY and output messages. The two triggerable cases are: 1. Resolving treeish:path will resolve the "treeish" portion using GET_SHA1_TREEISH, dropping all other flags. 2. The peel_onion() function did not take flags at all but recurses to get_sha1_1(), which does. The solution for both is to bitwise-OR their new flags with the existing ones (after dropping any mutually exclusive disambiguation flags). This bug can trigger with "git rev-parse --quiet", which asks for quiet resolution. But it can also happen in a more vanilla code path when we do a follow-up ONLY_TO_DIE invocation of get_sha1(), and that's what the tests check. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-26 11:46:30 -07:00
Jeff King	7243ffdd78	get_sha1: avoid repeating ourselves via ONLY_TO_DIE When the revision code cannot parse an argument like "HEAD:foo", it will call maybe_die_on_misspelt_object_name(), which re-runs get_sha1() with an extra ONLY_TO_DIE flag. We then spend more effort to generate a better error message. Unfortunately, a side effect is that our second call may repeat the same error messages from the original get_sha1() call. You can see this with: $ git show 0017 error: short SHA1 0017 is ambiguous. error: short SHA1 0017 is ambiguous. fatal: ambiguous argument '0017': unknown revision or path not in the working tree. Use '--' to separate paths from revisions, like this: 'git <command> [<revision>...] -- [<file>...]' where the second "error:" line comes from the ONLY_TO_DIE call. To fix this, we can make ONLY_TO_DIE imply QUIETLY. This is a little odd, because the whole point of ONLY_TO_DIE is to output error messages. But what we want to do is tell the rest of the get_sha1() code (particularly get_sha1_1()) that the _regular_ messages should be quiet, but the only-to-die ones should not. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-26 11:46:30 -07:00
Jeff King	259942f549	get_sha1: detect buggy calls with multiple disambiguators The get_sha1() family of functions takes a flags field, but some of the flags are mutually exclusive. In particular, we can only handle one disambiguating function, and the flags quietly override each other. Let's instead detect these as programming bugs. Technically some of the flags are supersets of the others, so treating COMMITTISH|TREEISH as just COMMITTISH is not wrong, but it's a good sign the caller is confused. And certainly asking for BLOB|TREE does not work. We can do the check easily with some bit-twiddling, and as a bonus, the bit-mask of disambiguators will come in handy in a future patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-26 11:21:28 -07:00
brian m. carlson	151b2911c1	sha1_name: convert get_sha1_mb to struct object_id All of the callers of this function use struct object_id, so rename it to get_oid_mb and make it take struct object_id instead of unsigned char *. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-07 12:59:43 -07:00
brian m. carlson	99d1a9861a	cache: convert struct cache_entry to use struct object_id Convert struct cache_entry to use struct object_id by applying the following semantic patch and the object_id transforms from contrib, plus the actual change to the struct: @@ struct cache_entry E1; @@ - E1.sha1 + E1.oid.hash @@ struct cache_entry *E1; @@ - E1->sha1 + E1->oid.hash Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-07 12:59:42 -07:00
Junio C Hamano	8429f2b42d	Merge branch 'bc/object-id' Move from unsigned char[20] to struct object_id continues. * bc/object-id: match-trees: convert several leaf functions to use struct object_id tree-walk: convert tree_entry_extract() to use struct object_id struct name_entry: use struct object_id instead of unsigned char sha1[20] match-trees: convert shift_tree() and shift_tree_by() to use object_id test-match-trees: convert to use struct object_id sha1-name: introduce a get_oid() function	2016-05-06 14:45:44 -07:00
brian m. carlson	2764fd93ad	sha1-name: introduce a get_oid() function The get_oid() function is equivalent to the get_sha1() function, but uses a struct object_id instead. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-19 15:28:58 -07:00
Jeff King	46c3cd44d7	setup: make startup_info available everywhere Commit `a60645f` (setup: remember whether repository was found, 2010-08-05) introduced the startup_info structure, which records some parts of the setup_git_directory() process (notably, whether we actually found a repository or not). One of the uses of this data is for functions to behave appropriately based on whether we are in a repo. But the startup_info struct is just a pointer to storage provided by the main program, and the only program that sets it up is the git.c wrapper. Thus builtins have access to startup_info, but externally linked programs do not. Worse, library code which is accessible from both has to be careful about accessing startup_info. This can be used to trigger a die("BUG") via get_sha1(): $ git fast-import <<-\EOF tag foo from HEAD:./whatever EOF fatal: BUG: startup_info struct is not initialized. Obviously that's fairly nonsensical input to feed to fast-import, but we should never hit a die("BUG"). And there may be other ways to trigger it if other non-builtins resolve sha1s. So let's point the storage for startup_info to a static variable in setup.c, making it available to all users of the library code. We _could_ turn startup_info into a regular extern struct, but doing so would mean tweaking all of the existing use sites. So let's leave the pointer indirection in place. We can, however, drop any checks for NULL, as they will always be false (and likewise, we can drop the test covering this case, which was a rather artificial situation using one of the test-* programs). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-03-06 17:17:37 -08:00
Junio C Hamano	11529ecec9	Merge branch 'jk/tighten-alloc' Update various codepaths to avoid manually-counted malloc(). * jk/tighten-alloc: (22 commits) ewah: convert to REALLOC_ARRAY, etc convert ewah/bitmap code to use xmalloc diff_populate_gitlink: use a strbuf transport_anonymize_url: use xstrfmt git-compat-util: drop mempcpy compat code sequencer: simplify memory allocation of get_message test-path-utils: fix normalize_path_copy output buffer size fetch-pack: simplify add_sought_entry fast-import: simplify allocation in start_packfile write_untracked_extension: use FLEX_ALLOC helper prepare_{git,shell}_cmd: use argv_array use st_add and st_mult for allocation size computation convert trivial cases to FLEX_ARRAY macros use xmallocz to avoid size arithmetic convert trivial cases to ALLOC_ARRAY convert manual allocations to argv_array argv-array: add detach function add helpers for allocating flex-array structs harden REALLOC_ARRAY and xcalloc against size_t overflow tree-diff: catch integer overflow in combine_diff_path allocation ...	2016-02-26 13:37:16 -08:00
Junio C Hamano	e6a6a768ca	Merge branch 'nd/dwim-wildcards-as-pathspecs' "git show 'HEAD:Foo[BAR]Baz'" did not interpret the argument as a rev, i.e. the object named by the pathname with wildcard characters in a tree object. * nd/dwim-wildcards-as-pathspecs: get_sha1: don't die() on bogus search strings check_filename: tighten dwim-wildcard ambiguity checkout: reorder check_filename conditional	2016-02-24 13:25:52 -08:00
Jeff King	50a6c8efa2	use st_add and st_mult for allocation size computation If our size computation overflows size_t, we may allocate a much smaller buffer than we expected and overflow it. It's probably impossible to trigger an overflow in most of these sites in practice, but it is easy enough to convert their additions and multiplications into overflow-checking variants. This may be fixing real bugs, and it makes auditing the code easier. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-22 14:51:09 -08:00
Junio C Hamano	fb795323ce	Merge branch 'wp/sha1-name-negative-match' A new "<branch>^{/!-<pattern>}" notation can be used to name a commit that is reachable from <branch> that does not match the given <pattern>. * wp/sha1-name-negative-match: object name: introduce '^{/!-<negative pattern>}' notation test for '!' handling in rev-parse's named commits	2016-02-10 14:20:10 -08:00
Jeff King	aac4fac168	get_sha1: don't die() on bogus search strings The get_sha1() function generally returns an error code rather than dying, and we sometimes speculatively call it with something that may be a revision or a pathspec, in order to see which one it might be. If it sees a bogus ":/" search string, though, it complains, without giving the caller the opportunity to recover. We can demonstrate this in t6133 by looking for ":/*.t", which should mean "*.t at the root of the tree", but instead dies because of the invalid regex (the "*" has nothing to operate on). We can fix this by returning an error rather than calling die(). Unfortunately, the tradeoff is that the error message is slightly worse in cases where we _do_ know we have a rev. E.g., running "git log ':/*.t' --" before yielded: fatal: Invalid search pattern: *.t and now we get only: fatal: bad revision ':/*.t' There's not a simple way to fix this short of passing a "quiet" flag all the way through the get_sha1() stack. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-10 13:53:21 -08:00
Will Palmer	0769854f3d	object name: introduce '^{/!-<negative pattern>}' notation To name a commit, you can now use the :/!-<negative pattern> regex style, and consequently, say $ git rev-parse HEAD^{/!-foo} and it will return the hash of the first commit reachable from HEAD whose commit message does not contain "foo". This is the opposite of the existing <rev>^{/<pattern>} syntax. The specific use-case this is intended for is to perform an operation, excluding the most-recent commits containing a particular marker. For example, if you tend to make "work in progress" commits, with messages beginning with "WIP", then it could be useful to diff against "the most recent commit which was not a WIP commit". That sort of thing is now possible, via commands such as: $ git diff @^{/!-^WIP} The leader '/!-', rather than simply '/!', to denote a negative match, is chosen to leave room for additional modifiers in the future. Signed-off-by: Will Palmer <wmpalmer@gmail.com> Signed-off-by: Stephen P. Smith <ischis2@cox.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-01 13:40:37 -08:00
brian m. carlson	ed1c9977cb	Remove get_object_hash. Convert all instances of get_object_hash to use an appropriate reference to the hash member of the oid member of struct object. This provides no functional change, as it is essentially a macro substitution. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
brian m. carlson	f2fd0760f6	Convert struct object to object_id struct object is one of the major data structures dealing with object IDs. Convert it to use struct object_id instead of an unsigned char array. Convert get_object_hash to refer to the new member as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
brian m. carlson	7999b2cf77	Add several uses of get_object_hash. Convert most instances where the sha1 member of struct object is dereferenced to use get_object_hash. Most instances that are passed to functions that have versions taking struct object_id, such as get_sha1_hex/get_oid_hex, or instances that can be trivially converted to use struct object_id instead, are not converted. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
Jeff King	43bb66ae0b	diagnose_invalid_index_path: use strbuf to avoid strcpy/strcat We dynamically allocate a buffer and then strcpy and strcat into it. This isn't buggy, but we'd prefer to avoid these suspicious functions. This would be a good candidate for conversion to xstrfmt, but we need to record the length for dealing with index entries. A strbuf handles that for us. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-10-05 11:08:05 -07:00
Jeff King	c3bb0ac796	find_short_object_filename: convert sprintf to xsnprintf We use sprintf() to format some hex data into a buffer. The buffer is clearly long enough, and using snprintf here is not necessary. And in fact, it does not really make anything easier to audit, as the size we feed to snprintf accounts for the magic extra 42 bytes found in each alt->name field of struct alternate_object_database (which is there exactly to do this formatting). Still, it is nice to remove an sprintf call and replace it with an xsnprintf and explanatory comment, which makes it easier to audit the code base for overflows. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-09-25 10:18:18 -07:00
Jeff King	af49c6d091	add reentrant variants of sha1_to_hex and find_unique_abbrev The sha1_to_hex and find_unique_abbrev functions always write into reusable static buffers. There are a few problems with this: - future calls overwrite our result. This is especially annoying with find_unique_abbrev, which does not have a ring of buffers, so you cannot even printf() a result that has two abbreviated sha1s. - if you want to put the result into another buffer, we often strcpy, which looks suspicious when auditing for overflows. This patch introduces sha1_to_hex_r and find_unique_abbrev_r, which write into a user-provided buffer. Of course this is just punting on the overflow-auditing, as the buffer obviously needs to be GIT_SHA1_HEXSZ + 1 bytes. But it is much easier to audit, since that is a well-known size. We retain the non-reentrant forms, which just become thin wrappers around the reentrant ones. This patch also adds a strbuf variant of find_unique_abbrev, which will be handy in later patches. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-09-25 10:18:18 -07:00
Jeff King	a5481a6c94	convert "enum date_mode" into a struct In preparation for adding date modes that may carry extra information beyond the mode itself, this patch converts the date_mode enum into a struct. Most of the conversion is fairly straightforward; we pass the struct as a pointer and dereference the type field where necessary. Locations that declare a date_mode can use a "{}" constructor. However, the tricky case is where we use the enum labels as constants, like: show_date(t, tz, DATE_NORMAL); Ideally we could say: show_date(t, tz, &{ DATE_NORMAL }); but of course C does not allow that. Likewise, we cannot cast the constant to a struct, because we need to pass an actual address. Our options are basically: 1. Manually add a "struct date_mode d = { DATE_NORMAL }" definition to each caller, and pass "&d". This makes the callers uglier, because they sometimes do not even have their own scope (e.g., they are inside a switch statement). 2. Provide a pre-made global "date_normal" struct that can be passed by address. We'd also need "date_rfc2822", "date_iso8601", and so forth. But at least the ugliness is defined in one place. 3. Provide a wrapper that generates the correct struct on the fly. The big downside is that we end up pointing to a single global, which makes our wrapper non-reentrant. But show_date is already not reentrant, so it does not matter. This patch implements 3, along with a minor macro to keep the size of the callers sane. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-06-29 11:39:07 -07:00
Junio C Hamano	5455ee0573	Merge branch 'bc/object-id' for_each_ref() callback functions were taught to name the objects not with "unsigned char sha1[20]" but with "struct object_id". * bc/object-id: (56 commits) struct ref_lock: convert old_sha1 member to object_id warn_if_dangling_symref(): convert local variable "junk" to object_id each_ref_fn_adapter(): remove adapter rev_list_insert_ref(): remove unneeded arguments rev_list_insert_ref_oid(): new function, taking an object_oid mark_complete(): remove unneeded arguments mark_complete_oid(): new function, taking an object_oid clear_marks(): rewrite to take an object_id argument mark_complete(): rewrite to take an object_id argument send_ref(): convert local variable "peeled" to object_id upload-pack: rewrite functions to take object_id arguments find_symref(): convert local variable "unused" to object_id find_symref(): rewrite to take an object_id argument write_one_ref(): rewrite to take an object_id argument write_refs_to_temp_dir(): convert local variable sha1 to object_id submodule: rewrite to take an object_id argument shallow: rewrite functions to take object_id arguments handle_one_ref(): rewrite to take an object_id argument add_info_ref(): rewrite to take an object_id argument handle_one_reflog(): rewrite to take an object_id argument ...	2015-06-05 12:17:37 -07:00
Junio C Hamano	c4a8354bc1	Merge branch 'jk/at-push-sha1' Introduce <branch>@{push} short-hand to denote the remote-tracking branch that tracks the branch at the remote the <branch> would be pushed to. * jk/at-push-sha1: for-each-ref: accept "%(push)" format for-each-ref: use skip_prefix instead of starts_with sha1_name: implement @{push} shorthand sha1_name: refactor interpret_upstream_mark sha1_name: refactor upstream_mark remote.c: add branch_get_push remote.c: return upstream name from stat_tracking_info remote.c: untangle error logic in branch_get_upstream remote.c: report specific errors from branch_get_upstream remote.c: introduce branch_get_upstream helper remote.c: hoist read_config into remote_get_1 remote.c: provide per-branch pushremote name remote.c: hoist branch.*.remote lookup out of remote_get_1 remote.c: drop "remote" pointer from "struct branch" remote.c: refactor setup of branch->merge list remote.c: drop default_remote_name variable	2015-06-05 12:17:36 -07:00
Junio C Hamano	67f0b6f3b2	Merge branch 'dt/cat-file-follow-symlinks' "git cat-file --batch(-check)" learned the "--follow-symlinks" option that follows an in-tree symbolic link when asked about an object via extended SHA-1 syntax, e.g. HEAD:RelNotes that points at Documentation/RelNotes/2.5.0.txt. With the new option, the command behaves as if HEAD:Documentation/RelNotes/2.5.0.txt was given as input instead. * dt/cat-file-follow-symlinks: cat-file: add --follow-symlinks to --batch sha1_name: get_sha1_with_context learns to follow symlinks tree-walk: learn get_tree_entry_follow_symlinks	2015-06-01 12:45:16 -07:00
Michael Haggerty	9c5fe0b846	handle_one_ref(): rewrite to take an object_id argument Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-25 12:19:35 -07:00
Michael Haggerty	2b2a5be394	each_ref_fn: change to take an object_id parameter Change typedef each_ref_fn to take a "const struct object_id *oid" parameter instead of "const unsigned char *sha1". To aid this transition, implement an adapter that can be used to wrap old-style functions matching the old typedef (which is now called "each_ref_sha1_fn"), and make such functions callable via the new interface. This requires the old function and its cb_data to be wrapped in a "struct each_ref_fn_sha1_adapter", and that object to be used as the cb_data for an adapter function, each_ref_fn_adapter(). This is an enormous diff, but most of it consists of simple, mechanical changes to the sites that call any of the "for_each_ref" family of functions. Subsequent to this change, the call sites can be rewritten one by one to use the new interface. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-25 12:19:27 -07:00
Jeff King	adfe5d0434	sha1_name: implement @{push} shorthand In a triangular workflow, each branch may have two distinct points of interest: the @{upstream} that you normally pull from, and the destination that you normally push to. There isn't a shorthand for the latter, but it's useful to have. For instance, you may want to know which commits you haven't pushed yet: git log @{push}.. Or as a more complicated example, imagine that you normally pull changes from origin/master (which you set as your @{upstream}), and push changes to your own personal fork (e.g., as myfork/topic). You may push to your fork from multiple machines, requiring you to integrate the changes from the push destination, rather than upstream. With this patch, you can just do: git rebase @{push} rather than typing out the full name. The heavy lifting is all done by branch_get_push; here we just wire it up to the "@{push}" syntax. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-22 09:33:08 -07:00
Jeff King	48c58471c2	sha1_name: refactor interpret_upstream_mark Now that most of the logic for our local get_upstream_branch has been pushed into the generic branch_get_upstream, we can fold the remainder into interpret_upstream_mark. Furthermore, what remains is generic to any branch-related "@{foo}" we might add in the future, and there's enough boilerplate that we'd like to reuse it. Let's parameterize the two operations (parsing the mark and computing its value) so that we can reuse this for "@{push}" in the near future. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-22 09:33:08 -07:00
Jeff King	a1ad0eb0cb	sha1_name: refactor upstream_mark We will be adding new mark types in the future, so separate the suffix data from the logic. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-22 09:33:08 -07:00
Jeff King	3a429d0af3	remote.c: report specific errors from branch_get_upstream When the previous commit introduced the branch_get_upstream helper, there was one call-site that could not be converted: the one in sha1_name.c, which gives detailed error messages for each possible failure. Let's teach the helper to optionally report these specific errors. This lets us convert another callsite, and means we can use the helper in other locations that want to give the same error messages. The logic and error messages come straight from sha1_name.c, with the exception that we start each error with a lowercase letter, as is our usual style (note that a few tests need to be updated as a result). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-21 11:07:46 -07:00
René Scharfe	dbe44faadb	use file_exists() to check if a file exists in the worktree Call file_exists() instead of open-coding it. That's shorter, simpler and the intent becomes clearer. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-20 13:49:10 -07:00
David Turner	c4ec96774b	sha1_name: get_sha1_with_context learns to follow symlinks Wire up get_sha1_with_context to call get_tree_entry_follow_symlinks when GET_SHA1_FOLLOW_SYMLINKS is passed in flags. G_S_FOLLOW_SYMLINKS is incompatible with G_S_ONLY_TO_DIE because the diagnosis that ONLY_TO_DIE triggers does not at present consider symlinks, and it would be a significant amount of additional code to allow it to do so. Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-20 13:46:13 -07:00
Junio C Hamano	daea6fca35	Merge branch 'rs/use-isxdigit' Code cleanup. * rs/use-isxdigit: use isxdigit() for checking if a character is a hexadecimal digit	2015-03-20 13:11:52 -07:00
René Scharfe	6f75d45b24	use isxdigit() for checking if a character is a hexadecimal digit Use the standard function isxdigit() to make the intent clearer and avoid using magic constants. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-10 15:44:41 -07:00
Junio C Hamano	8a6444d50e	Merge branch 'rs/simple-cleanups' Code cleanups. * rs/simple-cleanups: sha1_name: use strlcpy() to copy strings pretty: use starts_with() to check for a prefix for-each-ref: use skip_prefix() to avoid duplicate string comparison connect: use strcmp() for string comparison	2015-03-05 12:45:42 -08:00
René Scharfe	2ce63e9fac	sha1_name: use strlcpy() to copy strings Use strlcpy() instead of calling strncpy() and then setting the last byte of the target buffer to NUL explicitly. This shortens and simplifies the code a bit. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-02-22 12:01:38 -08:00
Junio C Hamano	098501527f	Merge branch 'jc/merge-bases' The get_merge_bases() API was easy to misuse by careless copy&paste coders, leaving object flags tainted in the commits that needed to be traversed. * jc/merge-bases: get_merge_bases(): always clean-up object flags bisect: clean flags after checking merge bases	2015-01-07 12:55:05 -08:00
Junio C Hamano	5109f2aaab	Merge branch 'mh/find-uniq-abbrev' The code to abbreviate an object name to its short unique prefix has been optimized when no abbreviation was requested. * mh/find-uniq-abbrev: sha1_name: avoid unnecessary sha1 lookup in find_unique_abbrev	2014-12-22 12:26:58 -08:00
Mike Hommey	61e704e38a	sha1_name: avoid unnecessary sha1 lookup in find_unique_abbrev An example where this happens is when doing an ls-tree on a tree that contains a commit link. In that case, find_unique_abbrev is called to get a non-abbreviated hex sha1, but still, a lookup is done as to whether the sha1 is in the repository (which ends up looking for a loose object in .git/objects), while the result of that lookup is not used when returning a non-abbreviated hex sha1. Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-11-26 10:51:05 -08:00
Junio C Hamano	2ce406ccb8	get_merge_bases(): always clean-up object flags The callers of get_merge_bases() can choose to leave object flags used during the merge-base traversal by passing cleanup=0 as a parameter, but in practice very few callers can afford to do so (namely, "git merge-base"), as most of them need to compute the merge base in preparation for other processing of their own and they need to see the objects without contaminated flags. Change the function signature of get_merge_bases_many() and get_merge_bases() to drop the cleanup parameter, so that the majority of the callers do not have to say ", 1" at the end. Give a new get_merge_bases_many_dirty() API to support only a few callers that know they do not need to spend cycles cleaning up the object flags. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-30 12:51:10 -07:00
David Aguilar	c41a87dd80	refs: make rev-parse --quiet actually quiet When a reflog is deleted, e.g. when "git stash" clears its stashes, "git rev-parse --verify --quiet" dies: fatal: Log for refs/stash is empty. The reason is that the get_sha1() code path does not allow us to suppress this message. Pass the flags bitfield through get_sha1_with_context() so that read_ref_at() can suppress the message. Use get_sha1_with_context1() instead of get_sha1() in rev-parse so that the --quiet flag is honored. Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-09-19 10:46:15 -07:00
Junio C Hamano	294792326a	Merge branch 'rs/list-optim' Fix a couple of "accumulate into a sorted list" to "accumulate and then sort the list". * rs/list-optim: walker: avoid quadratic list insertion in mark_complete sha1_name: avoid quadratic list insertion in handle_one_ref	2014-09-11 10:33:35 -07:00
René Scharfe	e8d1dfe639	sha1_name: avoid quadratic list insertion in handle_one_ref Similar to `16445242` (fetch-pack: avoid quadratic list insertion in mark_complete), sort only after all refs are collected instead of while inserting. The result is the same, but it's more efficient that way. The difference will only be measurable in repositories with a large number of refs. Signed-off-by: Rene Scharfe <l.s.r@web.de> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-25 10:27:52 -07:00
Junio C Hamano	ad524f834a	Merge branch 'jk/misc-fixes-maint' * jk/misc-fixes-maint: apply: avoid possible bogus pointer fix memory leak parsing core.commentchar transport: fix leaks in refs_from_alternate_cb free ref string returned by dwim_ref receive-pack: don't copy "dir" parameter	2014-07-28 11:30:41 -07:00
Jeff King	28b3563241	free ref string returned by dwim_ref A call to "dwim_ref(name, len, flags, &ref)" will allocate a new string in "ref" to return the exact ref we found. We do not consistently free it in all code paths, leading to small leaks. The worst is in get_sha1_basic, which may be called many times (e.g., by "cat-file --batch"), though it is relatively unlikely, as it only triggers on a bogus reflog specification. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-24 13:57:49 -07:00
René Scharfe	e992d1eb39	use strbuf_addbuf for adding strbufs Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-10 14:06:45 -07:00
Junio C Hamano	3b8e8af187	Merge branch 'jk/xstrfmt' * jk/xstrfmt: setup_git_env(): introduce git_path_from_env() helper unique_path: fix unlikely heap overflow walker_fetch: fix minor memory leak merge: use argv_array when spawning merge strategy sequencer: use argv_array_pushf setup_git_env: use git_pathdup instead of xmalloc + sprintf use xstrfmt to replace xmalloc + strcpy/strcat use xstrfmt to replace xmalloc + sprintf use xstrdup instead of xmalloc + strcpy use xstrfmt in favor of manual size calculations strbuf: add xstrfmt helper	2014-07-09 11:34:05 -07:00
Junio C Hamano	e91ae32a01	Merge branch 'jk/skip-prefix' * jk/skip-prefix: http-push: refactor parsing of remote object names imap-send: use skip_prefix instead of using magic numbers use skip_prefix to avoid repeated calculations git: avoid magic number with skip_prefix fetch-pack: refactor parsing in get_ack fast-import: refactor parsing of spaces stat_opt: check extra strlen call daemon: use skip_prefix to avoid magic numbers fast-import: use skip_prefix for parsing input use skip_prefix to avoid repeating strings use skip_prefix to avoid magic numbers transport-helper: avoid reading past end-of-string fast-import: fix read of uninitialized argv memory apply: use skip_prefix instead of raw addition refactor skip_prefix to return a boolean avoid using skip_prefix as a boolean daemon: mark some strings as const parse_diff_color_slot: drop ofs parameter	2014-07-09 11:33:28 -07:00
Jeff King	95b567c7c3	use skip_prefix to avoid repeating strings It's a common idiom to match a prefix and then skip past it with strlen, like: if (starts_with(foo, "bar")) foo += strlen("bar"); This avoids magic numbers, but means we have to repeat the string (and there is no compiler check that we didn't make a typo in one of the strings). We can use skip_prefix to handle this case without repeating ourselves. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-20 10:44:45 -07:00
Jeff King	b2724c8787	use xstrfmt to replace xmalloc + strcpy/strcat It's easy to get manual allocation calculations wrong, and the use of strcpy/strcat raise red flags for people looking for buffer overflows (though in this case each site was fine). It's also shorter to use xstrfmt, and the printf-format tends to be easier for a reader to see what the final string will look like. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-19 15:20:54 -07:00
Jeff King	8597ea3afe	commit: record buffer length in cache Most callsites which use the commit buffer try to use the cached version attached to the commit, rather than re-reading from disk. Unfortunately, that interface provides only a pointer to the NUL-terminated buffer, with no indication of the original length. For the most part, this doesn't matter. People do not put NULs in their commit messages, and the log code is happy to treat it all as a NUL-terminated string. However, some code paths do care. For example, when checking signatures, we want to be very careful that we verify all the bytes to avoid malicious trickery. This patch just adds an optional "size" out-pointer to get_commit_buffer and friends. The existing callers all pass NULL (there did not seem to be any obvious sites where we could avoid an immediate strlen() call, though perhaps with some further refactoring we could). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 12:09:38 -07:00
Jeff King	ba41c1c93f	use get_commit_buffer to avoid duplicate code For both of these sites, we already do the "fallback to read_sha1_file" trick. But we can shorten the code by just using get_commit_buffer. Note that the error cases are slightly different when read_sha1_file fails. get_commit_buffer will die() if the object cannot be loaded, or is a non-commit. For get_sha1_oneline, this will almost certainly never happen, as we will have just called parse_object (and if it does, it's probably worth complaining about). For record_author_date, the new behavior is probably better; we notify the user of the error instead of silently ignoring it. And because it's used only for sorting by author-date, somebody examining a corrupt repo can fallback to the regular traversal order. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 12:08:17 -07:00
Junio C Hamano	b407d40933	Merge branch 'nd/log-show-linear-break' Attempts to show where a single-strand-of-pearls breaks in "git log" output. * nd/log-show-linear-break: log: add --show-linear-break to help see non-linear history object.h: centralize object flag allocation	2014-04-03 12:38:11 -07:00
Nguyễn Thái Ngọc Duy	208acbfb82	object.h: centralize object flag allocation While the field "flags" is mainly used by the revision walker, it is also used in many other places. Centralize the whole flag allocation to one place for a better overview (and easier to move flags if we have to). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-03-25 15:09:24 -07:00
Junio C Hamano	4e9f9320e3	Merge branch 'jk/interpret-branch-name-fix' Fix a handful of bugs around interpreting $branch@{upstream} notation and its lookalike, when $branch part has interesting characters, e.g. "@", and ":". * jk/interpret-branch-name-fix: interpret_branch_name: find all possible @-marks interpret_branch_name: avoid @{upstream} past colon interpret_branch_name: always respect "namelen" parameter interpret_branch_name: rename "cp" variable to "at" interpret_branch_name: factor out upstream handling	2014-01-27 10:44:21 -08:00
Jeff King	9892d5d454	interpret_branch_name: find all possible @-marks When we parse a string like "foo@{upstream}", we look for the first "@"-sign, and check to see if it is an upstream mark. However, since branch names can contain an @, we may also see "@foo@{upstream}". In this case, we check only the first @, and ignore the second. As a result, we do not find the upstream. We can solve this by iterating through all @-marks in the string, and seeing if any is a legitimate upstream or empty-at mark. Another strategy would be to parse from the right-hand side of the string. However, that does not work for the "empty_at" case, which allows "@@{upstream}". We need to find the left-most one in this case (and we then recurse as "HEAD@{upstream}"). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-15 12:51:14 -08:00
Jeff King	3f6eb30f1d	interpret_branch_name: avoid @{upstream} past colon get_sha1() cannot currently parse a valid object name like "HEAD:@{upstream}" (assuming that such an oddly named file exists in the HEAD commit). It takes two passes to parse the string: 1. It first considers the whole thing as a ref, which results in looking for the upstream of "HEAD:". 2. It finds the colon, parses "HEAD" as a tree-ish, and then finds the path "@{upstream}" in the tree. For a path that looks like a normal reflog (e.g., "HEAD:@{yesterday}"), the first pass is a no-op. We try to dwim_ref("HEAD:"), that returns zero refs, and we proceed with colon-parsing. For "HEAD:@{upstream}", though, the first pass ends up in interpret_upstream_mark, which tries to find the branch "HEAD:". When it sees that the branch does not exist, it actually dies rather than returning an error to the caller. As a result, we never make it to the second pass. One obvious way of fixing this would be to teach interpret_upstream_mark to simply report "no, this isn't an upstream" in such a case. However, that would make the error-reporting for legitimate upstream cases significantly worse. Something like "bogus@{upstream}" would simply report "unknown revision: bogus@{upstream}", while the current code diagnoses a wide variety of possible misconfigurations (no such branch, branch exists but does not have upstream, etc). However, we can take advantage of the fact that a branch name cannot contain a colon. Therefore even if we find an upstream mark, any prefix with a colon must mean that the upstream mark we found is actually a pathname, and should be disregarded completely. This patch implements that logic. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-15 12:43:29 -08:00
Jeff King	8cd4249c4c	interpret_branch_name: always respect "namelen" parameter interpret_branch_name gets passed a "name" buffer to parse, along with a "namelen" parameter representing its length. If "namelen" is zero, we fallback to the NUL-terminated string-length of "name". However, it does not necessarily follow that if we have gotten a non-zero "namelen", it is the NUL-terminated string-length of "name". E.g., when get_sha1() is parsing "foo:bar", we will be asked to operate only on the first three characters. Yet in interpret_branch_name and its helpers, we use string functions like strchr() to operate on "name", looking past the length we were given. This can result in us mis-parsing object names. We should instead be limiting our search to "namelen" bytes. There are three distinct types of object names this patch addresses: - The interpret_empty_at helper uses strchr to find the next @-expression after our potential empty-at. In an expression like "@:foo@bar", it erroneously thinks that the second "@" is relevant, even if we were asked only to look at the first character. This case is easy to trigger (and we test it in this patch). - When finding the initial @-mark for @{upstream}, we use strchr. This means we might treat "foo:@{upstream}" as the upstream for "foo:", even though we were asked only to look at "foo". We cannot test this one in practice, because it is masked by another bug (which is fixed in the next patch). - The interpret_nth_prior_checkout helper did not receive the name length at all. This turns out not to be a problem in practice, though, because its parsing is so limited: it always starts from the far-left of the string, and will not tolerate a colon (which is currently the only way to get a smaller-than-strlen "namelen"). However, it's still worth fixing to make the code more obviously correct, and to future-proof us against callers with more exotic buffers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-15 12:41:03 -08:00
Jeff King	f278f40f09	interpret_branch_name: rename "cp" variable to "at" In the original version of this function, "cp" acted as a pointer to many different things. Since the refactoring in the last patch, it only marks the at-sign in the string. Let's use a more descriptive variable name. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-15 12:38:47 -08:00
Jeff King	a39c14af82	interpret_branch_name: factor out upstream handling This function checks a few different @{}-constructs. The early part checks for and dispatches us to helpers for each construct, but the code for handling @{upstream} is inline. Let's factor this out into its own function. This makes interpret_branch_name more readable, and will make it much simpler to further refactor the function in future patches. While we're at it, let's also break apart the refactored code into a few helper functions. These will be useful if we eventually implement similar @{upstream}-like constructs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-15 12:38:30 -08:00
Junio C Hamano	0a8cb03555	Merge branch 'br/sha1-name-40-hex-no-disambiguation' When parsing a 40-hex string into the object name, the string is checked to see if it can be interpreted as a ref so that a warning can be given for ambiguity. The code kicked in even when the core.warnambiguousrefs is set to false to squelch this warning, in which case the cycles spent to look at the ref namespace were an expensive no-op, as the result was discarded without being used. * br/sha1-name-40-hex-no-disambiguation: sha1_name: don't resolve refs when core.warnambiguousrefs is false	2014-01-13 11:33:29 -08:00
Brodie Rao	832cf74c07	sha1_name: don't resolve refs when core.warnambiguousrefs is false When seeing a full 40-hex object name, get_sha1_basic() unconditionally checks if the string can also be interpreted as a refname, but the result will not be used unless warn_ambiguous_refs is in effect. Omitting this unnecessary ref resolution provides a substantial performance improvement, especially when passing many hashes to a command (like "git rev-list --stdin") and core.warnambiguousrefs is set to false. The check incurs 6 stat()s for every hash supplied, which can be costly over NFS. Signed-off-by: Brodie Rao <brodie@sf.io> Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-07 09:51:56 -08:00
Junio C Hamano	ad70448576	Merge branch 'cc/starts-n-ends-with' Remove a few duplicate implementations of prefix/suffix comparison functions, and rename them to starts_with and ends_with. * cc/starts-n-ends-with: replace {pre,suf}fixcmp() with {starts,ends}_with() strbuf: introduce starts_with() and ends_with() builtin/remote: remove postfixcmp() and use suffixcmp() instead environment: normalize use of prefixcmp() by removing " != 0"	2013-12-17 12:02:44 -08:00
Christian Couder	5955654823	replace {pre,suf}fixcmp() with {starts,ends}_with() Leaving only the function definitions and declarations so that any new topic in flight can still make use of the old functions, replace existing uses of the prefixcmp() and suffixcmp() with new API functions. The change can be recreated by mechanically applying this: $ git grep -l -e prefixcmp -e suffixcmp -- \*.c | grep -v strbuf\\.c | xargs perl -pi -e ' s|!prefixcmp\(|starts_with\(|g; s|prefixcmp\(|!starts_with\(|g; s|!suffixcmp\(|ends_with\(|g; s|suffixcmp\(|!ends_with\(|g; ' on the result of preparatory changes in this series. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-05 14:13:21 -08:00
Junio C Hamano	5bb62059f2	Merge branch 'jk/robustify-parse-commit' * jk/robustify-parse-commit: checkout: do not die when leaving broken detached HEAD use parse_commit_or_die instead of custom message use parse_commit_or_die instead of segfaulting assume parse_commit checks for NULL commit assume parse_commit checks commit->object.parsed log_tree_diff: die when we fail to parse a commit	2013-12-05 12:54:01 -08:00
Felipe Contreras	57b15ead77	sha1-name: trivial style cleanup Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-10-31 13:47:19 -07:00
Jeff King	5e7d4d3e93	assume parse_commit checks for NULL commit The parse_commit function will check whether it was passed a NULL commit pointer, and if so, return an error. There is no need for callers to check this separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-10-24 15:43:50 -07:00
Junio C Hamano	f406140baa	Merge branch 'fc/at-head' Instead of typing four capital letters "HEAD", you can say "@" now, e.g. "git log @". * fc/at-head: Add new @ shortcut for HEAD sha1-name: pass len argument to interpret_branch_name()	2013-09-20 12:38:10 -07:00
Junio C Hamano	638924fec2	Merge branch 'rh/peeling-tag-to-tag' Make "foo^{tag}" to peel a tag to itself, i.e. no-op., and fail if "foo" is not a tag. "git rev-parse --verify v1.0^{tag}" would be a more convenient way to say "test $(git cat-file -t v1.0) = tag". * rh/peeling-tag-to-tag: peel_onion: do not assume length of x_type globals peel_onion(): add support for <rev>^{tag}	2013-09-20 12:27:18 -07:00
Felipe Contreras	9ba89f484e	Add new @ shortcut for HEAD Typing 'HEAD' is tedious, especially when we can use '@' instead. The reason for choosing '@' is that it follows naturally from the ref@op syntax (e.g. HEAD@{u}), except we have no ref, and no operation, and when we don't have those, it makes sense to assume 'HEAD'. So now we can use 'git show @~1', and all that goody goodness. Until now '@' was a valid name, but it conflicts with this idea, so let's make it invalid. Probably very few people, if any, used this name. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-09-12 14:39:34 -07:00
Richard Hansen	a8a5406ab3	use 'commit-ish' instead of 'committish' Replace 'committish' in documentation and comments with 'commit-ish' to match gitglossary(7) and to be consistent with 'tree-ish'. The only remaining instances of 'committish' are: * variable, function, and macro names * "(also committish)" in the definition of commit-ish in gitglossary[7] Signed-off-by: Richard Hansen <rhansen@bbn.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-09-04 15:03:03 -07:00
Jeff King	c969b6a18d	peel_onion: do not assume length of x_type globals When we are parsing "rev^{foo}", we check "foo" against the various global type strings, like "commit_type", "tree_type", etc. This is nicely abstracted, but then we destroy the abstraction completely by using magic numbers that must match the length of the type strings. We could avoid these magic numbers by using skip_prefix. But taking a step back, we can realize that using the "commit_type" global is not really buying us anything. It is not ever going to change from being "commit" without causing severe breakage to existing uses. And even if it did change for some crazy reason, we would want to evaluate its effects on the "rev^{}" syntax, anyway. Let's just switch these to using a custom string literal, as we do for "rev^{object}". The resulting code is more robust to changes in the type strings, and is more readable. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-09-03 13:45:38 -07:00
Richard Hansen	75aa26d34c	peel_onion(): add support for <rev>^{tag} Complete the <rev>^{<type>} family of object descriptors by having <rev>^{tag} dereference <rev> until a tag object is found (or fail if unable). At first glance this may not seem very useful, as commits, trees, and blobs cannot be peeled to a tag, and a tag would just peel to itself. However, this can be used to ensure that <rev> names a tag object: $ git rev-parse --verify v1.8.4^{tag} 04f013dc38d7512eadb915eba22efc414f18b869 $ git rev-parse --verify master^{tag} error: master^{tag}: expected tag type, but the object dereferences to tree type fatal: Needed a single revision Users can already ensure that <rev> is a tag object by checking the output of 'git cat-file -t <rev>', but: * users may expect <rev>^{tag} to exist given that <rev>^{commit}, <rev>^{tree}, and <rev>^{blob} all exist * this syntax is more convenient/natural in some circumstances Signed-off-by: Richard Hansen <rhansen@bbn.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-09-03 13:09:17 -07:00
Felipe Contreras	cf99a761d3	sha1-name: pass len argument to interpret_branch_name() This is useful to make sure we don't step outside the boundaries of what we are interpreting at the moment. For example while interpreting foobar@{u}~1, the job of interpret_branch_name() ends right before ~1, but there's no way to figure that out inside the function, unless the len argument is passed. So let's do that. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-09-03 11:33:00 -07:00
Junio C Hamano	2c2b6646c2	Revert "Add new @ shortcut for HEAD" This reverts commit `cdfd94837b`, as it does not just apply to "@" (and forms with modifiers like @{u} applied to it), but also affects e.g. "refs/heads/@/foo", which it shouldn't. The basic idea of giving a short-hand might be good, and the topic can be retried later, but let's revert to avoid affecting existing use cases for now for the upcoming release.	2013-08-14 15:04:24 -07:00

367 Commits