Continue fleshing out chainlint.pl by adding TestParser, a parser with
special knowledge about how Git tests should be written; for instance,
it knows that commands within a test body should be chained together
with `&&`. An upcoming parser which plucks test definitions from test
scripts will invoke TestParser for each test body it encounters.
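
To illustrate why the `&&` rule matters, here is a minimal sketch in
plain shell. The names `broken_body`, `intact_body`, and `report` are
hypothetical and exist only for this illustration; they are not part of
chainlint.pl or test-lib.sh. A test body is modeled as a shell function
whose exit status decides whether the test passes.

```shell
# Hypothetical test bodies, modeled as shell functions for illustration.

# Broken chain: the failing middle command is followed by a newline
# rather than `&&`, so its exit status is discarded.
broken_body () {
	: first step &&
	false
	: last step
}

# Intact chain: the failure short-circuits the chain and becomes the
# exit status of the whole body.
intact_body () {
	: first step &&
	false &&
	: last step
}

# Run a body and report whether it would count as passing.
report () {
	if "$1"
	then
		echo "$1: passes"
	else
		echo "$1: fails"
	fi
}

report broken_body   # prints "broken_body: passes"
report intact_body   # prints "intact_body: fails"
```

The broken body "passes" even though `false` failed, which is the kind
of silently lost failure the &&-chain rule is meant to prevent.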
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continue fleshing out chainlint.pl by adding a general-purpose recursive
descent parser for the POSIX shell command language. Although never
invoked directly, upcoming parser subclasses will extend its
functionality for specific purposes, such as plucking test definitions
from input scripts and applying domain-specific knowledge to perform
test validation.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Begin fleshing out chainlint.pl by adding a lexical analyzer for the
POSIX shell command language. The sole entry point Lexer::scan_token()
returns the next token from the input. It will be called by the upcoming
shell language parser.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although chainlint.sed usefully identifies broken &&-chains in tests, it
has several shortcomings, which include:

* only detects &&-chain breakage in subshells (one level deep)
* does not check for broken top-level &&-chains; that task is left to
  the "magic exit code 117" checker built into test-lib.sh; however,
  that detection does not extend to `{...}` blocks, `$(...)`
  expressions, or compound statements such as `if...fi`,
  `while...done`, `case...esac`
* uses heuristics, which makes it (potentially) fallible and difficult
  to tweak to handle additional real-world cases
* is written in `sed` and employs advanced `sed` operators which are
  probably not well-known to many programmers, thus the pool of people
  who can maintain it is likely small
* manually simulates recursion into subshells, which makes it much more
  difficult to reason about than, say, a traditional top-down parser
* checks each test as the test is run, which can get expensive for
  tests which are run repeatedly by functions or loops since their
  bodies will be checked over and over (tens or hundreds of times)
  unnecessarily
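
The limitation of the "magic exit code 117" approach can be sketched in
plain shell. This is an approximation for illustration only: the helper
names `chain_intact` and `check` are hypothetical, and the real checker
in test-lib.sh differs in detail. The idea is to prefix the test body
with a command that fails with a distinctive exit code; if every
top-level command is joined by `&&`, nothing else runs and that code
survives.

```shell
# Hypothetical helper approximating the "magic exit code 117" idea.
chain_intact () {
	status=0
	# Evaluate in a subshell so the body cannot affect the caller. With
	# an intact top-level chain, the leading failure short-circuits the
	# entire body and 117 remains the exit status.
	( eval "(exit 117) && $1" ) >/dev/null 2>&1 || status=$?
	test "$status" -eq 117
}

# Report what the 117 check concludes about a labeled body.
check () {
	if chain_intact "$2"
	then
		echo "$1: no break detected"
	else
		echo "$1: break detected"
	fi
}

# Break at the top level: the last command runs and replaces the 117.
check top-level ': one &&
: two
: three'
# prints "top-level: break detected"

# Break inside a {...} block: the whole block is skipped as a unit, so
# the missing `&&` after ": one" goes unnoticed.
check braces '{
	: one
	: two
} &&
: three'
# prints "braces: no break detected"
```

Because the block is never entered, a check based on exit codes cannot
see inside it; finding such a break requires inspecting the structure of
the body itself.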

To address these shortcomings, begin implementing a more functional and
precise test linter which understands shell syntax and semantics rather
than employing heuristics, and is thus able to recognize structural
problems with tests beyond broken &&-chains.

The new linter is written in Perl, and should thus be more accessible to
a wider audience. It is structured as a traditional top-down parser,
which makes it much easier to reason about and allows it to inspect
compound statements within test bodies to any depth.

Furthermore, it can check all test definitions in the entire project in
a single invocation rather than having to be invoked once per test, and
each test definition is checked only once no matter how many times the
test is actually run.

At this stage, the new linter is just a skeleton containing boilerplate
which handles command-line options, collects and reports statistics, and
feeds its arguments -- paths of test scripts -- to a (presently)
do-nothing script parser for validation. Subsequent changes will flesh
out the functionality.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>