During the development of chainlint.pl, numerous new self-tests were
created to verify correct functioning beyond the checks already
represented by the existing self-tests. The new checks fall into several
categories:
* behavior of the lexical analyzer for complex cases, such as line
splicing, token pasting, entering and exiting string contexts inside
and outside of test script bodies; for instance:
test_expect_success 'title' '
x=$(echo "something" |
sed -e '\''s/\\/\\\\/g'\'' -e '\''s/[[/.*^$]/\\&/g'\''
'
* behavior of the parser for all compound grammatical constructs, such
as `if...fi`, `case...esac`, `while...done`, `{...}`, etc., and for
other legal shell grammatical constructs not covered by existing
chainlint.sed self-tests, as well as complex cases, such as:
OUT=$( ((large_git 1>&3) | :) 3>&1 ) &&
* detection of problems, such as &&-chain breakage, from top-level to
any depth since the existing self-tests do not cover any top-level
context and only cover subshells one level deep due to limitations of
chainlint.sed
* address blind spots in chainlint.sed (such as not detecting a broken
&&-chain on a one-line for-loop in a subshell[1]) which chainlint.pl
correctly detects
* real-world cases which tripped up chainlint.pl during its development
[1]: https://lore.kernel.org/git/dce35a47012fecc6edc11c68e91dbb485c5bc36f.1661663880.git.gitgitgadget@gmail.com/
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that chainlint.pl is functional, take advantage of the existing
chainlint self-tests to validate its operation. (While at it, stop
validating chainlint.sed against the self-tests since it will soon be
retired.)
Due to chainlint.sed implementation limitations leaking into the
self-test "expect" files, a few of them require minor adjustment to make
them compatible with chainlint.pl which does not share those
limitations.
First, because `sed` does not provide any sort of real recursion,
chainlint.sed only emulates recursion into subshells, and each level of
recursion leads to a multiplicative increase in complexity of the `sed`
rules. To avoid substantial complexity, chainlint.sed, therefore, only
emulates subshell recursion one level deep. Any subshell deeper than
that is passed through as-is, which means that &&-chains are not checked
in deeper subshells. chainlint.pl, on the other hand, employs a proper
recursive descent parser, thus checks subshells to any depth and
correctly flags broken &&-chains in deep subshells.
Second, due to sed's line-oriented nature, chainlint.sed, by necessity,
folds multi-line quoted strings into a single line. chainlint.pl, on the
other hand, employs a proper lexical analyzer which preserves quoted
strings as-is, including embedded newlines.
Furthermore, the output of chainlint.sed and chainlint.pl do not match
precisely in terms of whitespace. However, since the purpose of the
self-checks is to verify that the ?!AMP?! annotations are being
correctly added, minor whitespace differences are immaterial. For this
reason, rather than adjusting whitespace in all existing self-test
"expect" files to match the new linter's output, the `check-chainlint`
target ignores whitespace differences. Since `diff -w` is not POSIX,
`check-chainlint` attempts to employ `git diff -w`, and only falls back
to non-POSIX `diff -w` (and `-u`) if `git diff` is not available.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
chainlint.sed inserts a ">" annotation at the beginning of a line to
signal that its heuristics have identified an end-of-subshell. This was
useful as a debugging aid during development of the script, but it has
no value to test writers and might even confuse them into thinking that
the linter is misbehaving by inserting line-noise into the shell code it
is validating. Moreover, its presence also potentially makes it
difficult to reuse the chainlint self-test "expect" output should a more
capable linter ever be developed. Therefore, drop the ">" annotation.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When chainlint.sed detects a broken &&-chain, it places an ?!AMP?!
annotation at the beginning of the line. However, this is an unusual
location for programmers accustomed to error messages (from compilers,
for instance) indicating the exact point of the problem. Therefore,
relocate the ?!AMP?! annotation to the end of the line in order to
better direct the programmer's attention to the source of the problem.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --chain-lint option uses heuristics and knowledge of shell syntax to
detect broken &&-chains in subshells by pure textual inspection. The
heuristics handle a range of stylistic variations in existing tests
(evolved over the years), however, they are still best-guesses. As such,
it is possible for future changes to accidentally break assumptions upon
which the heuristics are based. Protect against this possibility by
adding tests which check the linter itself for correctness.
In addition to protecting against regressions, these tests help document
(for humans) expected behavior, which is important since the linter's
implementation language ('sed') does not necessarily lend itself to easy
comprehension.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>