git-commit-vandalism

Author	SHA1	Message	Date
Jeff King	8e1c5fcf28	ref-filter: fix parsing of signatures with CRLF and no body This commit fixes a bug when parsing tags that have CRLF line endings, a signature, and no body, like this (the "^M" are marking the CRs): this is the subject^M -----BEGIN PGP SIGNATURE-----^M ^M ...some stuff...^M -----END PGP SIGNATURE-----^M When trying to find the start of the body, we look for a blank line separating the subject and body. In this case, there isn't one. But we search for it using strstr(), which will find the blank line in the signature. In the non-CRLF code path, we check whether the line we found is past the start of the signature, and if so, put the body pointer at the start of the signature (effectively making the body empty). But the CRLF code path doesn't catch the same case, and we end up with the body pointer in the middle of the signature field. This has two visible problems: - printing %(contents:subject) will show part of the signature, too, since the subject length is computed as (body - subject) - the length of the body is (sig - body), which makes it negative. Asking for %(contents:body) causes us to cast this to a very large size_t when we feed it to xmemdupz(), which then complains about trying to allocate too much memory. These are essentially the same bugs fixed in the previous commit, except that they happen when there is a CRLF blank line in the signature, rather than no blank line at all. Both are caused by the refactoring in `9f75ce3d8f` (ref-filter: handle CRLF at end-of-line more gracefully, 2020-10-29). We can fix this by doing the same "sigstart" check that we do in the non-CRLF case. And rather than repeat ourselves, we can just use short-circuiting OR to collapse both cases into a single conditional. I.e., rather than: if (strstr("\n\n")) ...found blank, check if it's in signature... else if (strstr("\r\n\r\n")) ...found blank, check if it's in signature... else ...no blank line found... we can collapse this to: if (strstr("\n\n")) \|\| strstr("\r\n\r\n"))) ...found blank, check if it's in signature... else ...no blank line found... The tests show the problem and the fix. Though it wasn't broken, I included contents:signature here to make sure it still behaves as expected, but note the shell hackery needed to make it work. A less-clever option would be to skip using test_atom and just "append_cr >expected" ourselves. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-11-02 21:36:04 -04:00
Jeff King	b01e1c7ef0	ref-filter: fix parsing of signatures without blank lines When ref-filter is asked to show %(content:subject), etc, we end up in find_subpos() to parse out the three major parts: the subject, the body, and the signature (if any). When searching for the blank line between the subject and body, if we don't find anything, we try to treat the whole message as the subject, with no body. But our idea of "the whole message" needs to take into account the signature, too. Since `9f75ce3d8f` (ref-filter: handle CRLF at end-of-line more gracefully, 2020-10-29), the code instead goes all the way to the end of the buffer, which produces confusing output. Here's an example. If we have a tag message like this: this is the subject -----BEGIN SSH SIGNATURE----- ...some stuff... -----END SSH SIGNATURE----- then the current parser will put the start of the body at the end of the whole buffer. This produces two buggy outcomes: - since the subject length is computed as (body - subject), showing %(contents:subject) will print both the subject and the signature, rather than just the single line - since the body length is computed as (sig - body), and the body now starts _after_ the signature, we end up with a negative length! Fortunately we never access out-of-bounds memory, because the negative length is fed to xmemdupz(), which casts it to a size_t, and xmalloc() bails trying to allocate an absurdly large value. In theory it would be possible for somebody making a malicious tag to wrap it around to a more reasonable value, but it would require a tag on the order of 2^63 bytes. And even if they did, all they get is an out of bounds string read. So the security implications are probably not interesting. We can fix both by correctly putting the start of the body at the same index as the start of the signature (effectively making the body empty). Note that this is a real issue with signatures generated with gpg.format set to "ssh", which would look like the example above. In the new tests here I use a hard-coded tag message, for a few reasons: - regardless of what the ssh-signing code produces now or in the future, we should be testing this particular case - skipping the actual signature makes the tests simpler to write (and allows them to run on more systems) - t6300 has helpers for working with gpg signatures; for the purposes of this bug, "BEGIN PGP" is just as good a demonstration, and this simplifies the tests Curiously, the same issue doesn't happen with real gpg signatures (and there are even existing tests in t6300 with cover this). Those have a blank line between the header and the content, like: this is the subject -----BEGIN PGP SIGNATURE----- ...some stuff... -----END PGP SIGNATURE----- Because we search for the subject/body separator line with a strstr(), we find the blank line in the signature, even though it's outside of what we'd consider the body. But that puts us unto a separate code path, which realizes that we're now in the signature and adjusts the line back to "sigstart". So this patch is basically just making the "no line found at all" case match that. And note that "sigstart" is always defined (if there is no signature, it points to the end of the buffer as you'd expect). Reported-by: Martin Englund <martin@englund.nu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-11-02 21:36:04 -04:00
Junio C Hamano	4f4b18497a	Merge branch 'es/test-chain-lint' Broken &&-chains in the test scripts have been corrected. * es/test-chain-lint: t6000-t9999: detect and signal failure within loop t5000-t5999: detect and signal failure within loop t4000-t4999: detect and signal failure within loop t0000-t3999: detect and signal failure within loop tests: simplify by dropping unnecessary `for` loops tests: apply modern idiom for exiting loop upon failure tests: apply modern idiom for signaling test failure tests: fix broken &&-chains in `{...}` groups tests: fix broken &&-chains in `$(...)` command substitutions tests: fix broken &&-chains in compound statements tests: use test_write_lines() to generate line-oriented output tests: simplify construction of large blocks of text t9107: use shell parameter expansion to avoid breaking &&-chain t6300: make `%(raw:size) --shell` test more robust t5516: drop unnecessary subshell and command invocation t4202: clarify intent by creating expected content less cleverly t1020: avoid aborting entire test script when one test fails t1010: fix unnoticed failure on Windows t/lib-pager: use sane_unset() to avoid breaking &&-chain	2022-01-03 16:24:15 -08:00
Eric Sunshine	0c51d6b4ae	t6000-t9999: detect and signal failure within loop Failures within `for` and `while` loops can go unnoticed if not detected and signaled manually since the loop itself does not abort when a contained command fails, nor will a failure necessarily be detected when the loop finishes since the loop returns the exit code of the last command it ran on the final iteration, which may not be the command which failed. Therefore, detect and signal failures manually within loops using the idiom `\|\| return 1` (or `\|\| exit 1` within subshells). Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-13 10:29:48 -08:00
Eric Sunshine	efe47c83b2	t6300: make `%(raw:size) --shell` test more robust This test populates its `expect` file solely by appending content but fails to ensure that the file starts out empty. The test succeeds only because no earlier test populated a file of the exact same name, however this is an accident waiting to happen. Make the test more robust by ensuring that it contains exactly the intended content. While at it, simplify the implementation via a straightforward `sed` application and by avoiding dropping out of the single-quote context within the test body (thus eliminating a hard-to-digest combination of apostrophes and backslashes). Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-13 10:29:48 -08:00
Johannes Schindelin	9f3547837e	tests: set GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME only when needed A couple of test scripts have actually been adapted to accommodate for a configurable default branch name, but they still overrode it via the `GIT_TEST_*` variable. Let's drop that override where possible. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-05 11:34:28 -08:00
Junio C Hamano	98e7ab6d42	for-each-ref: delay parsing of --sort=<atom> options The for-each-ref family of commands invoke parsers immediately when it sees each --sort=<atom> option, and die before even seeing the other options on the command line when the <atom> is unrecognised. Instead, accumulate them in a string list, and have them parsed into a ref_sorting structure after the command line parsing is done. As a consequence, "git branch --sort=bogus -h" used to fail to give the brief help, which arguably may have been a feature, now does so, which is more consistent with how other options work. The patch is smaller than the actual extent of the "damage" to the codebase, thanks to the fact that the original code consistently used OPT_REF_SORT() macro to handle command line options. We only needed to replace the variable used for the list, and implementation of the callback function used in the macro. The old rule was for the users of the API to: - Declare ref_sorting and ref_sorting_tail variables; - OPT_REF_SORT() macro will instantiate ref_sorting instance (which may barf and die) and append it to the tail; - Append to the tail each ref_sorting read from the configuration by parsing in the config callback (which may barf and die); - See if ref_sorting is null and use ref_sorting_default() instead. Now the rule is not all that different but is simpler: - Declare ref_sorting_options string list. - OPT_REF_SORT() macro will append it to the string list; - Append to the string list the sort key read from the configuration; - call ref_sorting_options() to turn the string list to ref_sorting structure (which also deals with the default value). As side effects, this change also cleans up a few issues: - `95be717c` (parse_opt_ref_sorting: always use with NONEG flag, 2019-03-20) muses that "git for-each-ref --no-sort" should simply clear the sort keys accumulated so far; it now does. - The implementation detail of "struct ref_sorting" and the helper function parse_ref_sorting() can now be private to the ref-filter API implementation. - If you set branch.sort to a bogus value, the any "git branch" invocation, not only the listing mode, would abort with the original code; now it doesn't Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-20 14:33:07 -07:00
Junio C Hamano	85246a7054	Merge branch 'dd/t6300-wo-gpg-fix' Test fix. * dd/t6300-wo-gpg-fix: t6300: check for cat-file exit status code t6300: don't run cat-file on non-existent object	2021-09-08 13:30:31 -07:00
Đoàn Trần Công Danh	1549577338	t6300: check for cat-file exit status code In test_atom(), we're piping the output of cat-file to tail(1), thus, losing its exit status. Let's use a temporary file to preserve git exit status code. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-25 11:45:54 -07:00
Đoàn Trần Công Danh	597fa8cb43	t6300: don't run cat-file on non-existent object In t6300, some tests are guarded behind some prerequisites. Thus, objects created by those tests ain't available if those prerequisites are unsatistified. Attempting to run "cat-file" on those objects will run into failure. In fact, running t6300 in an environment without gpg(1), we'll see those warnings: fatal: Not a valid object name refs/tags/signed-empty fatal: Not a valid object name refs/tags/signed-short fatal: Not a valid object name refs/tags/signed-long Let's put those commands into the real tests, in order to: * skip their execution if prerequisites aren't satistified. * check their exit status code The expected value for objects with type: commit needs to be computed outside the test because we can't rely on "$3" there. Furthermore, to prevent the accidental usage of that computed expected value, BUG out on unknown object's type. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-25 11:45:53 -07:00
ZheNing Hu	b9dee075eb	ref-filter: add %(rest) atom %(rest) is a atom used for cat-file batch mode, which can split the input lines at the first whitespace boundary, all characters before that whitespace are considered to be the object name; characters after that first run of whitespace (i.e., the "rest" of the line) are output in place of the %(rest) atom. In order to let "cat-file --batch=%(rest)" use the ref-filter interface, add %(rest) atom for ref-filter. Introduce the reject_atom() to reject the atom %(rest) for "git for-each-ref", "git branch", "git tag" and "git verify-tag". Reviewed-by: Jacob Keller <jacob.keller@gmail.com> Suggected-by: Jacob Keller <jacob.keller@gmail.com> Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-26 12:01:26 -07:00
ZheNing Hu	7121c4d4e2	ref-filter: --format=%(raw) support --perl Because the perl language can handle binary data correctly, add the function perl_quote_buf_with_len(), which can specify the length of the data and prevent the data from being truncated at '\0' to help `--format="%(raw)"` support `--perl`. Reviewed-by: Jacob Keller <jacob.keller@gmail.com> Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-26 12:01:25 -07:00
ZheNing Hu	bd0708c7eb	ref-filter: add %(raw) atom Add new formatting option `%(raw)`, which will print the raw object data without any changes. It will help further to migrate all cat-file formatting logic from cat-file to ref-filter. The raw data of blob, tree objects may contain '\0', but most of the logic in `ref-filter` depends on the output of the atom being text (specifically, no embedded NULs in it). E.g. `quote_formatting()` use `strbuf_addstr()` or `._quote_buf()` add the data to the buffer. The raw data of a tree object is `100644 one\0...`, only the `100644 one` will be added to the buffer, which is incorrect. Therefore, we need to find a way to record the length of the atom_value's member `s`. Although strbuf can already record the string and its length, if we want to replace the type of atom_value's member `s` with strbuf, many places in ref-filter that are filled with dynamically allocated mermory in `v->s` are not easy to replace. At the same time, we need to check if `v->s == NULL` in populate_value(), and strbuf cannot easily distinguish NULL and empty strings, but c-style "const char " can do it. So add a new member in `struct atom_value`: `s_size`, which can record raw object size, it can help us add raw object data to the buffer or compare two buffers which contain raw object data. Note that `--format=%(raw)` cannot be used with `--python`, `--shell`, `--tcl`, and `--perl` because if the binary raw data is passed to a variable in such languages, these may not support arbitrary binary data in their string variable type. Reviewed-by: Jacob Keller <jacob.keller@gmail.com> Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Helped-by: Bagas Sanjaya <bagasdotme@gmail.com> Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Felipe Contreras <felipe.contreras@gmail.com> Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Junio C Hamano <gitster@pobox.com> Based-on-patch-by: Olga Telezhnaya <olyatelezhnaya@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-26 12:01:25 -07:00
Junio C Hamano	522010b573	Merge branch 'ab/detox-gettext-tests' Test clean-up. * ab/detox-gettext-tests: tests: remove all uses of test_i18cmp	2021-04-20 17:23:36 -07:00
Junio C Hamano	f63add4aa8	Merge branch 'jk/ref-filter-segfault-fix' A NULL-dereference bug has been corrected in an error codepath in "git for-each-ref", "git branch --list" etc. * jk/ref-filter-segfault-fix: ref-filter: fix NULL check for parse object failure	2021-04-13 15:28:50 -07:00
Ævar Arnfjörð Bjarmason	feeb03bce6	tests: remove all uses of test_i18cmp Finish the removal I started in `1108cea7f8` (tests: remove most uses of test_i18ncmp, 2021-02-11). At that time the function wasn't removed due to disruption with in-flight changes, remove the occurrences that have landed since then. As of writing this there are no test_i18ncmp uses between "master" and "seen", so let's also remove the function to finally put it to rest. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-13 14:41:24 -07:00
Jeff King	c685450880	ref-filter: fix NULL check for parse object failure After we run parse_object_buffer() to get an object's contents, we try to check that the return value wasn't NULL. However, since our "struct object" is a pointer-to-pointer, and we assign like: *obj = parse_object_buffer(...); it's not correct to check: if (!obj) That will always be true, since our double pointer will continue to point to the single pointer (which is itself NULL). This is a regression that was introduced by `aa46a0da30` (ref-filter: use oid_object_info() to get object, 2018-07-17); since that commit we'll segfault on a parse failure, as we try to look at the NULL object pointer. There are many ways a parse could fail, but most of them are hard to set up in the tests (it's easy to make a bogus object, but update-ref will refuse to point to it). The test here uses a tag which points to a wrong object type. A parse of just the broken tag object will succeed, but seeing both tag objects in the same process will lead to a parse error (since we'll see the pointed-to object as both types). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-01 12:54:21 -07:00
Hariom Verma	ee82a487f6	ref-filter: use pretty.c logic for trailers Now, ref-filter is using pretty.c logic for setting trailer options. New to ref-filter: :key=<K> - only show trailers with specified key. :valueonly[=val] - only show the value part. :separator=<SEP> - inserted between trailer lines. :key_value_separator=<SEP> - inserted between key and value in trailer lines Enhancement to existing options(now can take value and its optional): :only[=val] :unfold[=val] 'val' can be: true, on, yes or false, off, no. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-15 16:48:38 -08:00
Hariom Verma	727331dce1	t6300: use function to test trailer options Add a function to test trailer options. This will make tests look cleaner, as well as will make it easier to add new tests for trailers in the future. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-15 16:48:38 -08:00
Junio C Hamano	27d7c8599b	Merge branch 'js/default-branch-name-tests-final-stretch' Prepare tests not to be affected by the name of the default branch "git init" creates. * js/default-branch-name-tests-final-stretch: (28 commits) tests: drop prereq `PREPARE_FOR_MAIN_BRANCH` where no longer needed t99: adjust the references to the default branch name "main" tests(git-p4): transition to the default branch name `main` t9[5-7]: adjust the references to the default branch name "main" t9[0-4]: adjust the references to the default branch name "main" t8: adjust the references to the default branch name "main" t7[5-9]: adjust the references to the default branch name "main" t7[0-4]: adjust the references to the default branch name "main" t6[4-9]: adjust the references to the default branch name "main" t64: preemptively adjust alignment to prepare for `master` -> `main` t6[0-3]: adjust the references to the default branch name "main" t5[6-9]: adjust the references to the default branch name "main" t55[4-9]: adjust the references to the default branch name "main" t55[23]: adjust the references to the default branch name "main" t551: adjust the references to the default branch name "main" t550: adjust the references to the default branch name "main" t5503: prepare aligned comment for replacing `master` with `main` t5[0-4]: adjust the references to the default branch name "main" t5323: prepare centered comment for `master` -> `main` t4: adjust the references to the default branch name "main" ...	2021-01-25 14:19:18 -08:00
Johannes Schindelin	8f19c9fd43	t6300: avoid using the default name of the initial branch Our test suite currently only passes when `git init` uses the name `master` for the initial branch. This would stop us from changing the default branch name. Let's adjust t6300 so that it does not rely on any specific default branch name. This trick is done by (force-)renaming the initial branch to the name `main` in the `setup` and the `:remotename and :remoteref` test cases, and then replacing all mentions of `master` and `MASTER` with `main` and `MAIN`, respectively. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-07 10:29:25 -08:00
Johannes Schindelin	334afbc76f	tests: mark tests relying on the current default for `init.defaultBranch` In addition to the manual adjustment to let the `linux-gcc` CI job run the test suite with `master` and then with `main`, this patch makes sure that GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME is set in all test scripts that currently rely on the initial branch name being `master by default. To determine which test scripts to mark up, the first step was to force-set the default branch name to `master` in - all test scripts that contain the keyword `master`, - t4211, which expects `t/t4211/history.export` with a hard-coded ref to initialize the default branch, - t5560 because it sources `t/t556x_common` which uses `master`, - t8002 and t8012 because both source `t/annotate-tests.sh` which also uses `master`) This trick was performed by this command: $ sed -i '/^ \. \.\/$test-lib\\|lib-\(bash\\|cvs\\|git-svn$\\|gitweb-lib\)\.sh$/i\ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\ ' $(git grep -l master t/t[0-9].sh) \ t/t4211.sh t/t5560.sh t/t8002.sh t/t8012.sh After that, careful, manual inspection revealed that some of the test scripts containing the needle `master` do not actually rely on a specific default branch name: either they mention `master` only in a comment, or they initialize that branch specificially, or they do not actually refer to the current default branch. Therefore, the aforementioned modification was undone in those test scripts thusly: $ git checkout HEAD -- \ t/t0027-auto-crlf.sh t/t0060-path-utils.sh \ t/t1011-read-tree-sparse-checkout.sh \ t/t1305-config-include.sh t/t1309-early-config.sh \ t/t1402-check-ref-format.sh t/t1450-fsck.sh \ t/t2024-checkout-dwim.sh \ t/t2106-update-index-assume-unchanged.sh \ t/t3040-subprojects-basic.sh t/t3301-notes.sh \ t/t3308-notes-merge.sh t/t3423-rebase-reword.sh \ t/t3436-rebase-more-options.sh \ t/t4015-diff-whitespace.sh t/t4257-am-interactive.sh \ t/t5323-pack-redundant.sh t/t5401-update-hooks.sh \ t/t5511-refspec.sh t/t5526-fetch-submodules.sh \ t/t5529-push-errors.sh t/t5530-upload-pack-error.sh \ t/t5548-push-porcelain.sh \ t/t5552-skipping-fetch-negotiator.sh \ t/t5572-pull-submodule.sh t/t5608-clone-2gb.sh \ t/t5614-clone-submodules-shallow.sh \ t/t7508-status.sh t/t7606-merge-custom.sh \ t/t9302-fast-import-unpack-limit.sh We excluded one set of test scripts in these commands, though: the range of `git p4` tests. The reason? `git p4` stores the (foreign) remote branch in the branch called `p4/master`, which is obviously not the default branch. Manual analysis revealed that only five of these tests actually require a specific default branch name to pass; They were modified thusly: $ sed -i '/^ \. \.\/lib-git-p4\.sh$/i\ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\ ' t/t980[0167].sh t/t9811*.sh Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:17 -08:00
Junio C Hamano	c25fba986b	Merge branch 'hv/ref-filter-misc' The "--format=" option to the "for-each-ref" command and friends learned a few more tricks, e.g. the ":short" suffix that applies to "objectname" now also can be used for "parent", "tree", etc. * hv/ref-filter-misc: ref-filter: add `sanitize` option for 'subject' atom pretty: refactor `format_sanitized_subject()` ref-filter: add `short` modifier to 'parent' atom ref-filter: add `short` modifier to 'tree' atom ref-filter: rename `objectname` related functions and fields ref-filter: modify error messages in `grab_objectname()` ref-filter: refactor `grab_objectname()` ref-filter: support different email formats	2020-09-09 13:53:07 -07:00
Junio C Hamano	e17723842b	Merge branch 'hv/ref-filter-trailers-atom-parsing-fix' The parser for "git for-each-ref --format=..." was too loose when parsing the "%(trailers...)" atom, and forgot that "trailers" and "trailers:<modifiers>" are the only two allowed forms, which has been corrected. * hv/ref-filter-trailers-atom-parsing-fix: ref-filter: 'contents:trailers' show error if `:` is missing t6300: unify %(trailers) and %(contents:trailers) tests	2020-08-31 15:49:47 -07:00
Hariom Verma	905f0a4e64	ref-filter: add `sanitize` option for 'subject' atom Currently, subject does not take any arguments. This commit introduce `sanitize` formatting option to 'subject' atom. `subject:sanitize` - print sanitized subject line, suitable for a filename. e.g. %(subject): "the subject line" %(subject:sanitize): "the-subject-line" Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-28 13:52:51 -07:00
Hariom Verma	26bc0aaf99	ref-filter: add `short` modifier to 'parent' atom Sometimes while using 'parent' atom, user might want to see abbrev hash instead of full 40 character hash. Just like 'objectname', it might be convenient for users to have the `:short` and `:short=<length>` option for printing 'parent' hash. Let's introduce `short` option to 'parent' atom. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-28 13:52:50 -07:00
Hariom Verma	837adb102f	ref-filter: add `short` modifier to 'tree' atom Sometimes while using 'tree' atom, user might want to see abbrev hash instead of full 40 character hash. Just like 'objectname', it might be convenient for users to have the `:short` and `:short=<length>` option for printing 'tree' hash. Let's introduce `short` option to 'tree' atom. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-28 13:52:50 -07:00
Hariom Verma	b82445dc27	ref-filter: support different email formats Currently, ref-filter only supports printing email with angle brackets. Let's add support for two more email options. - trim : for email without angle brackets. - localpart : for the part before the @ sign out of trimmed email Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-28 13:52:50 -07:00
Hariom Verma	2c22e102f8	ref-filter: 'contents:trailers' show error if `:` is missing The 'contents' atom does not show any error if used with 'trailers' atom and colon is missing before trailers arguments. e.g %(contents:trailersonly) works, while it shouldn't. It is definitely not an expected behavior. Let's fix this bug. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Heba Waly <heba.waly@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-21 14:46:22 -07:00
Hariom Verma	a8e0f50edc	t6300: unify %(trailers) and %(contents:trailers) tests Currently, there are different tests for testing %(trailers) and %(contents:trailers) causing redundant copy. Its time to get rid of duplicate code. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-21 12:13:26 -07:00
Alban Gruin	3db796c1c0	t6300: fix issues related to %(contents:size) `b6839fda68` (ref-filter: add support for %(contents:size), 2020-07-16) added a new format for ref-filter, and added a function to generate tests for this new feature in t6300. Unfortunately, it tries to run `test_expect_sucess' instead of `test_expect_success', and writes $expect to `expected', but tries to read `expect'. Those two issues were probably unnoticed because the script only printed errors, but did not crash. This fixes these issues. Signed-off-by: Alban Gruin <alban.gruin@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-31 13:26:19 -07:00
Christian Couder	b6839fda68	ref-filter: add support for %(contents:size) It's useful and efficient to be able to get the size of the contents directly without having to pipe through `wc -c`. Also the result of the following: `git for-each-ref --format='%(contents)' refs/heads/my-branch \| wc -c` is off by one as `git for-each-ref` appends a newline character after the contents, which can be seen by comparing its output with the output from `git cat-file`. As with %(contents), %(contents:size) is silently ignored, if a ref points to something other than a commit or a tag: ``` $ git update-ref refs/mytrees/first HEAD^{tree} $ git for-each-ref --format='%(contents)' refs/mytrees/first $ git for-each-ref --format='%(contents:size)' refs/mytrees/first ``` Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-16 10:46:55 -07:00
Christian Couder	6e2ef8eb06	t6300: test refs pointing to tree and blob Adding tests for refs pointing to tree and blob shows that we care about testing both positive ("see, my shiny new toy does work") and negative ("and it won't do nonsensical things when given an input it is not designed to work with") cases. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-10 13:15:44 -07:00
Junio C Hamano	6de1630898	Merge branch 'jk/for-each-ref-multi-key-sort-fix' "git branch" and other "for-each-ref" variants accepted multiple --sort=<key> options in the increasing order of precedence, but it had a few breakages around "--ignore-case" handling, and tie-breaking with the refname, which have been fixed. * jk/for-each-ref-multi-key-sort-fix: ref-filter: apply fallback refname sort only after all user sorts ref-filter: apply --ignore-case to all sorting keys	2020-05-08 14:25:04 -07:00
Jeff King	7c5045fc18	ref-filter: apply fallback refname sort only after all user sorts Commit `9e468334b4` (ref-filter: fallback on alphabetical comparison, 2015-10-30) taught ref-filter's sort to fallback to comparing refnames. But it did it at the wrong level, overriding the comparison result for a single "--sort" key from the user, rather than after all sort keys have been exhausted. This worked correctly for a single "--sort" option, but not for multiple ones. We'd break any ties in the first key with the refname and never evaluate the second key at all. To make matters even more interesting, we only applied this fallback sometimes! For a field like "taggeremail" which requires a string comparison, we'd truly return the result of strcmp(), even if it was 0. But for numerical "value" fields like "taggerdate", we did apply the fallback. And that's why our multiple-sort test missed this: it uses taggeremail as the main comparison. So let's start by adding a much more rigorous test. We'll have a set of commits expressing every combination of two tagger emails, dates, and refnames. Then we can confirm that our sort is applied with the correct precedence, and we'll be hitting both the string and value comparators. That does show the bug, and the fix is simple: moving the fallback to the outer compare_refs() function, after all ref_sorting keys have been exhausted. Note that in the outer function we don't have an "ignore_case" flag, as it's part of each individual ref_sorting element. It's debatable what such a fallback should do, since we didn't use the user's keys to match. But until now we have been trying to respect that flag, so the least-invasive thing is to try to continue to do so. Since all callers in the current code either set the flag for all keys or for none, we can just pull the flag from the first key. In a hypothetical world where the user really can flip the case-insensitivity of keys separately, we may want to extend the code to distinguish that case from a blanket "--ignore-case". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-04 13:44:46 -07:00
Jeff King	76f9e569ad	ref-filter: apply --ignore-case to all sorting keys All of the ref-filter users (for-each-ref, branch, and tag) take an --ignore-case option which makes filtering and sorting case-insensitive. However, this option was applied only to the first element of the ref_sorting list. So: git for-each-ref --ignore-case --sort=refname would do what you expect, but: git for-each-ref --ignore-case --sort=refname --sort=taggername would sort the primary key (taggername) case-insensitively, but sort the refname case-sensitively. We have two options here: - teach callers to set ignore_case on the whole list - replace the ref_sorting list with a struct that contains both the list of sorting keys, as well as options that apply to _all_ keys I went with the first one here, as it gives more flexibility if we later want to let the users set the flag per-key (presumably through some special syntax when defining the key; for now it's all or nothing through --ignore-case). The new test covers this by sorting on both tagger and subject case-insensitively, which should compare "a" and "A" identically, but still sort them before "b" and "B". We'll break ties by sorting on the refname to give ourselves a stable output (this is actually supposed to be done automatically, but there's another bug which will be fixed in the next commit). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-04 13:41:20 -07:00
brian m. carlson	8bd5a2906e	t6300: make hash algorithm independent One of the git for-each-ref tests asks to sort by object ID. However, when sorted, the order of the refs differs between SHA-1 and SHA-256. Sort the expected output so that the test passes. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-24 09:33:24 -08:00
brian m. carlson	1f5f8f3e85	t6300: abstract away SHA-1-specific constants Adjust the test so that it computes variables for object IDs instead of using hard-coded hashes. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-24 09:33:24 -08:00
Junio C Hamano	a477abe9e4	Merge branch 'mp/for-each-ref-missing-name-or-email' "for-each-ref" and friends that shows refs did not protect themselves against ancient tags that did not record tagger names when asked to show "%(taggername)", which have been corrected. * mp/for-each-ref-missing-name-or-email: ref-filter: initialize empty name or email fields	2019-09-09 12:26:39 -07:00
Mischa POSLAWSKY	8b3f33ef11	ref-filter: initialize empty name or email fields Formatting $(taggername) on headerless tags such as v0.99 in Git causes a SIGABRT with error "munmap_chunk(): invalid pointer", because of an oversight in commit `f0062d3b74` (ref-filter: free item->value and item->value->s, 2018-10-19). Signed-off-by: Mischa POSLAWSKY <git@shiar.nl> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-08-22 11:14:45 -07:00
Taylor Blau	b31e2680c4	ref-filter.c: find disjoint pattern prefixes Since `cfe004a5a9` (ref-filter: limit traversal to prefix, 2017-05-22), the ref-filter code has sought to limit the traversals to a prefix of the given patterns. That code stopped short of handling more than one pattern, because it means invoking 'for_each_ref_in' multiple times. If we're not careful about which patterns overlap, we will output the same refs multiple times. For instance, consider the set of patterns 'refs/heads/a/', 'refs/heads/a/b/c', and 'refs/tags/v1.0.0'. If we naÃ¯vely ran: for_each_ref_in("refs/heads/a/", ...); for_each_ref_in("refs/heads/a/b/c", ...); for_each_ref_in("refs/tags/v1.0.0", ...); we would see 'refs/heads/a/b/c' (and everything underneath it) twice. Instead, we want to partition the patterns into disjoint sets, where we know that no ref will be matched by any two patterns in different sets. In the above, these are: - {'refs/heads/a/', 'refs/heads/a/b/c'}, and - {'refs/tags/v1.0.0'} Given one of these disjoint sets, what is a suitable pattern to pass to 'for_each_ref_in'? One approach is to compute the longest common prefix over all elements in that disjoint set, and let the caller cull out the refs they didn't want. Computing the longest prefix means that in most cases, we won't match too many things the caller would like to ignore. The longest common prefixes of the above are: - {'refs/heads/a/', 'refs/heads/a/b/c'} -> refs/heads/a/* - {'refs/tags/v1.0.0'} -> refs/tags/v1.0.0 We instead invoke: for_each_ref_in("refs/heads/a/*", ...); for_each_ref_in("refs/tags/v1.0.0", ...); Which provides us with the refs we were looking for with a minimal amount of extra cruft, but never a duplicate of the ref we asked for. Implemented here is an algorithm which accomplishes the above, which works as follows: 1. Lexicographically sort the given list of patterns. 2. Initialize 'prefix' to the empty string, where our goal is to build each element in the above set of longest common prefixes. 3. Consider each pattern in the given set, and emit 'prefix' if it reaches the end of a pattern, or touches a wildcard character. The end of a string is treated as if it precedes a wildcard. (Note that there is some room for future work to detect that, e.g., 'a?b' and 'abc' are disjoint). 4. Otherwise, recurse on step (3) with the slice of the list corresponding to our current prefix (i.e., the subset of patterns that have our prefix as a literal string prefix.) This algorithm is 'O(kn + n log(n))', where 'k' is max(len(pattern)) for each pattern in the list, and 'n' is len(patterns). By discovering this set of interesting patterns, we reduce the runtime of multi-pattern 'git for-each-ref' (and other ref traversals) from O(N) to O(n log(N)), where 'N' is the total number of packed references. Running 'git for-each-ref refs/tags/a refs/tags/b' on a repository with 10,000,000 refs in 'refs/tags/huge-N', my best-of-five times drop from: real 0m5.805s user 0m5.188s sys 0m0.468s to: real 0m0.001s user 0m0.000s sys 0m0.000s On linux.git, the times to dig out two of the latest -rc tags drops from 0.002s to 0.001s, so the change on repositories with fewer tags is much less noticeable. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-06-27 13:14:06 -07:00
Damien Robert	c646d0934e	ref-filter: use correct branch for %(push:track) In ref-filter.c, when processing the atom %(push:track), the ahead/behind values are computed using `stat_tracking_info` which refers to the upstream branch. Fix that by introducing a new flag `for_push` in `stat_tracking_info` in remote.c, which does the same thing but for the push branch. Update the few callers of `stat_tracking_info` to handle this flag. This ensure that whenever we use this function in the future, we are careful to specify is this should apply to the upstream or the push branch. This bug was not detected in t/t6300-for-each-ref.sh because in the test for push:track, both the upstream and the push branches were behind by 1 from the local branch. Change the test so that the upstream branch is behind by 1 while the push branch is ahead by 1. This allows us to test that %(push:track) refers to the correct branch. This changes the expected value of some following tests (by introducing new references), so update them too. Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-18 09:17:41 +09:00
Olga Telezhnaya	5610d9ff0d	ref-filter: add tests for deltabase Test new formatting option deltabase. Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-12-28 10:08:11 -08:00
Olga Telezhnaya	f4ee22b526	ref-filter: add tests for objectsize:disk Test new formatting atom. Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-12-28 10:07:34 -08:00
Junio C Hamano	fba9654364	Merge branch 'jk/trailer-fixes' "git interpret-trailers" and its underlying machinery had a buggy code that attempted to ignore patch text after commit log message, which triggered in various codepaths that will always get the log message alone and never get such an input. * jk/trailer-fixes: append_signoff: use size_t for string offsets sequencer: ignore "---" divider when parsing trailers pretty, ref-filter: format %(trailers) with no_divider option interpret-trailers: allow suppressing "---" divider interpret-trailers: tighten check for "---" patch boundary trailer: pass process_trailer_opts to trailer_info_get() trailer: use size_t for iterating trailer list trailer: use size_t for string offsets	2018-09-17 13:53:54 -07:00
Jeff King	e5fba5d558	pretty, ref-filter: format %(trailers) with no_divider option In both of these cases we know that we are feeding the trailer-parsing code a pure commit message. We should tell it so, which avoids false positives for a commit message that contains a "---" line. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-23 10:08:51 -07:00
Ævar Arnfjörð Bjarmason	d3c6751b18	tests: make use of the test_must_be_empty function Change various tests that use an idiom of the form: >expect && test_cmp expect actual To instead use: test_must_be_empty actual The test_must_be_empty() wrapper was introduced in `ca8d148daf` ("test: test_must_be_empty helper", 2013-06-09). Many of these tests have been added after that time. This was mostly found with, and manually pruned from: git grep '^\s+>.expect. &&$' t Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-30 11:18:41 -07:00
Junio C Hamano	4301330588	Merge branch 'jk/for-each-ref-icase' The "--ignore-case" option of "git for-each-ref" (and its friends) did not work correctly, which has been fixed. * jk/for-each-ref-icase: ref-filter: avoid backend filtering with --ignore-case for-each-ref: consistently pass WM_IGNORECASE flag t6300: add a test for --ignore-case	2018-07-24 14:50:44 -07:00
Jeff King	e674eb2528	ref-filter: avoid backend filtering with --ignore-case When for-each-ref is used with --ignore-case, we expect match_name_as_path() to do a case-insensitive match. But there's an extra layer of filtering that happens before we even get there. Since commit `cfe004a5a9` (ref-filter: limit traversal to prefix, 2017-05-22), we feed the prefix to the ref backend so that it can optimize the ref iteration. There's no mechanism for us to tell the backend we're matching case-insensitively. Nor is there likely to be one anytime soon, since the packed backend relies on binary-searching the sorted list of refs. Let's just punt on this case. The extra filtering is an optimization that we simply can't do. We'll still give the correct answer via the filtering in match_name_as_path(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-03 14:49:37 -07:00
Jeff King	ee0f3e22c6	t6300: add a test for --ignore-case The --ignore-case option was added by `3bb16a8bf2` (tag, branch, for-each-ref: add --ignore-case for sorting and filtering, 2016-12-04), but it was never tested. And indeed, it does not work due to multiple bugs (which will be fixed in subsequent patches). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-03 14:49:13 -07:00

1 2 3

119 Commits