git-commit-vandalism/t/t0450-txt-doc-vs-help.sh

173 lines
3.6 KiB
Bash
Raw Normal View History

#!/bin/sh
tests: start asserting that *.txt SYNOPSIS matches -h output There's been a lot of incremental effort to make the SYNOPSIS output in our documentation consistent with the -h output, e.g. cbe485298bf (git reflog [expire|delete]: make -h output consistent with SYNOPSIS, 2022-03-17) is one recent example, but that effort has been an uphill battle due to the lack of regression testing. This adds such regression testing, we can parse out the SYNOPSIS output with "sed", and it turns out it's relatively easy to normalize it and the "-h" output to match on another. We now ensure that we won't have regressions when it comes to the list of commands in "expect_help_to_match_txt" below, and in subsequent commits we'll make more of them consistent. The naïve parser here gets quite a few things wrong, but it doesn't need to be perfect, just good enough that we can compare /some/ of this help output. There's no cases where the output would match except for the parser's stupidity, it's all cases of e.g. comparing the *.txt to non-parse_options() output. Since that output is wildly different than the *.txt anyway let's leave this for now, we can fix the parser some other time, or it won't become necessary as we'll e.g. convert more things to using parse_options(). Having a special-case for "merge-tree"'s 1f0c3a29da3 (merge-tree: implement real merges, 2022-06-18) is a bit ugly, but preferred to blessing that " (deprecated)" pattern for other commands. We'd probably want to add some other way of marking deprecated commands in the SYNOPSIS syntax. Syntactically 1f0c3a29da3's way of doing it is indistinguishable from the command taking an optional literal "deprecated" string as an argument. Some of the issues that are left: * "git show -h", "git whatchanged -h" and "git reflog --oneline -h" all showing "git log" and "git show" usage output. I.e. the "builtin_log_usage" in builtin/log.c doesn't take into account what command we're running. * Commands which implement subcommands such as like "multi-pack-index", "notes", "remote" etc. having their subcommands in a very different order in the *.txt and *.c. Fixing it would require some verbose diffs, so it's been left alone for now. * Commands such as "format-patch" have a very long argument list in the *.txt, but just "[<options>]" in the *.c. What to do about these has been left out of this series, except to the extent that preceding commits changed "[<options>]" (or equivalent) to the list of options in cases where that list of options was tiny, or we clearly meant to exhaustively list the options in both *.txt and *.c. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 17:39:27 +02:00
test_description='assert (unbuilt) Documentation/*.txt and -h output
Run this with --debug to see a summary of where we still fail to make
the two versions consistent with one another.'
TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
test_expect_success 'setup: list of builtins' '
git --list-cmds=builtins >builtins
'
tests: start asserting that *.txt SYNOPSIS matches -h output There's been a lot of incremental effort to make the SYNOPSIS output in our documentation consistent with the -h output, e.g. cbe485298bf (git reflog [expire|delete]: make -h output consistent with SYNOPSIS, 2022-03-17) is one recent example, but that effort has been an uphill battle due to the lack of regression testing. This adds such regression testing, we can parse out the SYNOPSIS output with "sed", and it turns out it's relatively easy to normalize it and the "-h" output to match on another. We now ensure that we won't have regressions when it comes to the list of commands in "expect_help_to_match_txt" below, and in subsequent commits we'll make more of them consistent. The naïve parser here gets quite a few things wrong, but it doesn't need to be perfect, just good enough that we can compare /some/ of this help output. There's no cases where the output would match except for the parser's stupidity, it's all cases of e.g. comparing the *.txt to non-parse_options() output. Since that output is wildly different than the *.txt anyway let's leave this for now, we can fix the parser some other time, or it won't become necessary as we'll e.g. convert more things to using parse_options(). Having a special-case for "merge-tree"'s 1f0c3a29da3 (merge-tree: implement real merges, 2022-06-18) is a bit ugly, but preferred to blessing that " (deprecated)" pattern for other commands. We'd probably want to add some other way of marking deprecated commands in the SYNOPSIS syntax. Syntactically 1f0c3a29da3's way of doing it is indistinguishable from the command taking an optional literal "deprecated" string as an argument. Some of the issues that are left: * "git show -h", "git whatchanged -h" and "git reflog --oneline -h" all showing "git log" and "git show" usage output. I.e. the "builtin_log_usage" in builtin/log.c doesn't take into account what command we're running. * Commands which implement subcommands such as like "multi-pack-index", "notes", "remote" etc. having their subcommands in a very different order in the *.txt and *.c. Fixing it would require some verbose diffs, so it's been left alone for now. * Commands such as "format-patch" have a very long argument list in the *.txt, but just "[<options>]" in the *.c. What to do about these has been left out of this series, except to the extent that preceding commits changed "[<options>]" (or equivalent) to the list of options in cases where that list of options was tiny, or we clearly meant to exhaustively list the options in both *.txt and *.c. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 17:39:27 +02:00
test_expect_success 'list of txt and help mismatches is sorted' '
sort -u "$TEST_DIRECTORY"/t0450/txt-help-mismatches >expect &&
if ! test_cmp expect "$TEST_DIRECTORY"/t0450/txt-help-mismatches
then
BUG "please keep the list of txt and help mismatches sorted"
fi
'
help_to_synopsis () {
builtin="$1" &&
out_dir="out/$builtin" &&
out="$out_dir/help.synopsis" &&
if test -f "$out"
then
echo "$out" &&
return 0
fi &&
mkdir -p "$out_dir" &&
test_expect_code 129 git $builtin -h >"$out.raw" 2>&1 &&
sed -n \
-e '1,/^$/ {
/^$/d;
s/^usage: //;
s/^ *or: //;
p;
}' <"$out.raw" >"$out" &&
echo "$out"
}
builtin_to_txt () {
echo "$GIT_BUILD_DIR/Documentation/git-$1.txt"
}
txt_to_synopsis () {
builtin="$1" &&
out_dir="out/$builtin" &&
out="$out_dir/txt.synopsis" &&
if test -f "$out"
then
echo "$out" &&
return 0
fi &&
b2t="$(builtin_to_txt "$builtin")" &&
sed -n \
-e '/^\[verse\]$/,/^$/ {
/^$/d;
/^\[verse\]$/d;
tests: start asserting that *.txt SYNOPSIS matches -h output There's been a lot of incremental effort to make the SYNOPSIS output in our documentation consistent with the -h output, e.g. cbe485298bf (git reflog [expire|delete]: make -h output consistent with SYNOPSIS, 2022-03-17) is one recent example, but that effort has been an uphill battle due to the lack of regression testing. This adds such regression testing, we can parse out the SYNOPSIS output with "sed", and it turns out it's relatively easy to normalize it and the "-h" output to match on another. We now ensure that we won't have regressions when it comes to the list of commands in "expect_help_to_match_txt" below, and in subsequent commits we'll make more of them consistent. The naïve parser here gets quite a few things wrong, but it doesn't need to be perfect, just good enough that we can compare /some/ of this help output. There's no cases where the output would match except for the parser's stupidity, it's all cases of e.g. comparing the *.txt to non-parse_options() output. Since that output is wildly different than the *.txt anyway let's leave this for now, we can fix the parser some other time, or it won't become necessary as we'll e.g. convert more things to using parse_options(). Having a special-case for "merge-tree"'s 1f0c3a29da3 (merge-tree: implement real merges, 2022-06-18) is a bit ugly, but preferred to blessing that " (deprecated)" pattern for other commands. We'd probably want to add some other way of marking deprecated commands in the SYNOPSIS syntax. Syntactically 1f0c3a29da3's way of doing it is indistinguishable from the command taking an optional literal "deprecated" string as an argument. Some of the issues that are left: * "git show -h", "git whatchanged -h" and "git reflog --oneline -h" all showing "git log" and "git show" usage output. I.e. the "builtin_log_usage" in builtin/log.c doesn't take into account what command we're running. * Commands which implement subcommands such as like "multi-pack-index", "notes", "remote" etc. having their subcommands in a very different order in the *.txt and *.c. Fixing it would require some verbose diffs, so it's been left alone for now. * Commands such as "format-patch" have a very long argument list in the *.txt, but just "[<options>]" in the *.c. What to do about these has been left out of this series, except to the extent that preceding commits changed "[<options>]" (or equivalent) to the list of options in cases where that list of options was tiny, or we clearly meant to exhaustively list the options in both *.txt and *.c. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 17:39:27 +02:00
s/{litdd}/--/g;
s/'\''\(git[ a-z-]*\)'\''/\1/g;
p;
}' \
<"$b2t" >"$out" &&
echo "$out"
}
check_dashed_labels () {
! grep -E "<[^>_-]+_" "$1"
}
HT=" "
tests: start asserting that *.txt SYNOPSIS matches -h output There's been a lot of incremental effort to make the SYNOPSIS output in our documentation consistent with the -h output, e.g. cbe485298bf (git reflog [expire|delete]: make -h output consistent with SYNOPSIS, 2022-03-17) is one recent example, but that effort has been an uphill battle due to the lack of regression testing. This adds such regression testing, we can parse out the SYNOPSIS output with "sed", and it turns out it's relatively easy to normalize it and the "-h" output to match on another. We now ensure that we won't have regressions when it comes to the list of commands in "expect_help_to_match_txt" below, and in subsequent commits we'll make more of them consistent. The naïve parser here gets quite a few things wrong, but it doesn't need to be perfect, just good enough that we can compare /some/ of this help output. There's no cases where the output would match except for the parser's stupidity, it's all cases of e.g. comparing the *.txt to non-parse_options() output. Since that output is wildly different than the *.txt anyway let's leave this for now, we can fix the parser some other time, or it won't become necessary as we'll e.g. convert more things to using parse_options(). Having a special-case for "merge-tree"'s 1f0c3a29da3 (merge-tree: implement real merges, 2022-06-18) is a bit ugly, but preferred to blessing that " (deprecated)" pattern for other commands. We'd probably want to add some other way of marking deprecated commands in the SYNOPSIS syntax. Syntactically 1f0c3a29da3's way of doing it is indistinguishable from the command taking an optional literal "deprecated" string as an argument. Some of the issues that are left: * "git show -h", "git whatchanged -h" and "git reflog --oneline -h" all showing "git log" and "git show" usage output. I.e. the "builtin_log_usage" in builtin/log.c doesn't take into account what command we're running. * Commands which implement subcommands such as like "multi-pack-index", "notes", "remote" etc. having their subcommands in a very different order in the *.txt and *.c. Fixing it would require some verbose diffs, so it's been left alone for now. * Commands such as "format-patch" have a very long argument list in the *.txt, but just "[<options>]" in the *.c. What to do about these has been left out of this series, except to the extent that preceding commits changed "[<options>]" (or equivalent) to the list of options in cases where that list of options was tiny, or we clearly meant to exhaustively list the options in both *.txt and *.c. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 17:39:27 +02:00
align_after_nl () {
builtin="$1" &&
len=$(printf "git %s " "$builtin" | wc -c) &&
pad=$(printf "%${len}s" "") &&
sed "s/^[ $HT][ $HT]*/$pad/"
}
test_debug '>failing'
while read builtin
do
# -h output assertions
test_expect_success "$builtin -h output has no \t" '
h2s="$(help_to_synopsis "$builtin")" &&
! grep "$HT" "$h2s"
'
test_expect_success "$builtin -h output has dashed labels" '
check_dashed_labels "$(help_to_synopsis "$builtin")"
'
tests: assert consistent whitespace in -h output Add a test for the *.txt and *.c output assertions which asserts that for "-h" lines that aren't the "usage: " or " or: " lines they start with the same amount of whitespace. This ensures that we won't have buggy output like: [...] or: git tag [-n[<num>]] [...] [--create-reflog] [...] Which should instead be like this, i.e. the options lines should be aligned: [...] or: git tag [-n[<num>]] [...] [--create-reflog] [...] It would be better to be able to use "test_cmp" here, i.e. to construct the output we expect, and compare it against the actual output. For most built-in commands this would be rather straightforward. In "t0450-txt-doc-vs-help.sh" we already compute the whitespace that a "git-$builtin" needs, and strip away "usage: " or " or: " from the start of lines. The problem is: * For commands that implement subcommands, such as "git bundle", we don't know whether e.g. "git bundle create" is the subcommand "create", or the argument "create" to "bundle" for the purposes of alignment. We *do* have that information from the *.txt version, since the part within the ''-quotes should be the command & subcommand, but that isn't consistent (e.g. see "git bundle" and "git commit-graph", only the latter is correct), and parsing that out would be non-trivial. * If we were to make this stricter we have various non-parse_options() users (e.g. "git diff-tree") that don't have the nicely aligned output which we've had since 4631cfc20bd (parse-options: properly align continued usage output, 2021-09-21). So rather than make perfect the enemy of the good let's assert that for those lines that are indented they should all use the same indentation. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 17:39:28 +02:00
test_expect_success "$builtin -h output has consistent spacing" '
h2s="$(help_to_synopsis "$builtin")" &&
sed -n \
-e "/^ / {
s/[^ ].*//;
p;
}" \
<"$h2s" >help &&
sort -u help >help.ws &&
if test -s help.ws
then
test_line_count = 1 help.ws
fi
'
txt="$(builtin_to_txt "$builtin")" &&
preq="$(echo BUILTIN_TXT_$builtin | tr '[:lower:]-' '[:upper:]_')" &&
if test -f "$txt"
then
test_set_prereq "$preq"
fi &&
# *.txt output assertions
test_expect_success "$preq" "$builtin *.txt SYNOPSIS has dashed labels" '
check_dashed_labels "$(txt_to_synopsis "$builtin")"
'
tests: start asserting that *.txt SYNOPSIS matches -h output There's been a lot of incremental effort to make the SYNOPSIS output in our documentation consistent with the -h output, e.g. cbe485298bf (git reflog [expire|delete]: make -h output consistent with SYNOPSIS, 2022-03-17) is one recent example, but that effort has been an uphill battle due to the lack of regression testing. This adds such regression testing, we can parse out the SYNOPSIS output with "sed", and it turns out it's relatively easy to normalize it and the "-h" output to match on another. We now ensure that we won't have regressions when it comes to the list of commands in "expect_help_to_match_txt" below, and in subsequent commits we'll make more of them consistent. The naïve parser here gets quite a few things wrong, but it doesn't need to be perfect, just good enough that we can compare /some/ of this help output. There's no cases where the output would match except for the parser's stupidity, it's all cases of e.g. comparing the *.txt to non-parse_options() output. Since that output is wildly different than the *.txt anyway let's leave this for now, we can fix the parser some other time, or it won't become necessary as we'll e.g. convert more things to using parse_options(). Having a special-case for "merge-tree"'s 1f0c3a29da3 (merge-tree: implement real merges, 2022-06-18) is a bit ugly, but preferred to blessing that " (deprecated)" pattern for other commands. We'd probably want to add some other way of marking deprecated commands in the SYNOPSIS syntax. Syntactically 1f0c3a29da3's way of doing it is indistinguishable from the command taking an optional literal "deprecated" string as an argument. Some of the issues that are left: * "git show -h", "git whatchanged -h" and "git reflog --oneline -h" all showing "git log" and "git show" usage output. I.e. the "builtin_log_usage" in builtin/log.c doesn't take into account what command we're running. * Commands which implement subcommands such as like "multi-pack-index", "notes", "remote" etc. having their subcommands in a very different order in the *.txt and *.c. Fixing it would require some verbose diffs, so it's been left alone for now. * Commands such as "format-patch" have a very long argument list in the *.txt, but just "[<options>]" in the *.c. What to do about these has been left out of this series, except to the extent that preceding commits changed "[<options>]" (or equivalent) to the list of options in cases where that list of options was tiny, or we clearly meant to exhaustively list the options in both *.txt and *.c. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 17:39:27 +02:00
# *.txt output consistency assertions
result=
if grep -q "^$builtin$" "$TEST_DIRECTORY"/t0450/txt-help-mismatches
then
result=failure
else
result=success
fi &&
test_expect_$result "$preq" "$builtin -h output and SYNOPSIS agree" '
t2s="$(txt_to_synopsis "$builtin")" &&
if test "$builtin" = "merge-tree"
then
test_when_finished "rm -f t2s.new" &&
sed -e '\''s/ (deprecated)$//g'\'' <"$t2s" >t2s.new
t2s=t2s.new
fi &&
h2s="$(help_to_synopsis "$builtin")" &&
# The *.txt and -h use different spacing for the
# alignment of continued usage output, normalize it.
align_after_nl "$builtin" <"$t2s" >txt &&
align_after_nl "$builtin" <"$h2s" >help &&
test_cmp txt help
'
if test_have_prereq "$preq" && test -e txt && test -e help
then
test_debug '
if test_cmp txt help >cmp 2>/dev/null
then
echo "=== DONE: $builtin ==="
else
echo "=== TODO: $builtin ===" &&
cat cmp
fi >>failing
'
# Not in test_expect_success in case --run is being
# used with --debug
rm -f txt help tmp 2>/dev/null
fi
done <builtins
tests: start asserting that *.txt SYNOPSIS matches -h output There's been a lot of incremental effort to make the SYNOPSIS output in our documentation consistent with the -h output, e.g. cbe485298bf (git reflog [expire|delete]: make -h output consistent with SYNOPSIS, 2022-03-17) is one recent example, but that effort has been an uphill battle due to the lack of regression testing. This adds such regression testing, we can parse out the SYNOPSIS output with "sed", and it turns out it's relatively easy to normalize it and the "-h" output to match on another. We now ensure that we won't have regressions when it comes to the list of commands in "expect_help_to_match_txt" below, and in subsequent commits we'll make more of them consistent. The naïve parser here gets quite a few things wrong, but it doesn't need to be perfect, just good enough that we can compare /some/ of this help output. There's no cases where the output would match except for the parser's stupidity, it's all cases of e.g. comparing the *.txt to non-parse_options() output. Since that output is wildly different than the *.txt anyway let's leave this for now, we can fix the parser some other time, or it won't become necessary as we'll e.g. convert more things to using parse_options(). Having a special-case for "merge-tree"'s 1f0c3a29da3 (merge-tree: implement real merges, 2022-06-18) is a bit ugly, but preferred to blessing that " (deprecated)" pattern for other commands. We'd probably want to add some other way of marking deprecated commands in the SYNOPSIS syntax. Syntactically 1f0c3a29da3's way of doing it is indistinguishable from the command taking an optional literal "deprecated" string as an argument. Some of the issues that are left: * "git show -h", "git whatchanged -h" and "git reflog --oneline -h" all showing "git log" and "git show" usage output. I.e. the "builtin_log_usage" in builtin/log.c doesn't take into account what command we're running. * Commands which implement subcommands such as like "multi-pack-index", "notes", "remote" etc. having their subcommands in a very different order in the *.txt and *.c. Fixing it would require some verbose diffs, so it's been left alone for now. * Commands such as "format-patch" have a very long argument list in the *.txt, but just "[<options>]" in the *.c. What to do about these has been left out of this series, except to the extent that preceding commits changed "[<options>]" (or equivalent) to the list of options in cases where that list of options was tiny, or we clearly meant to exhaustively list the options in both *.txt and *.c. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 17:39:27 +02:00
test_debug 'say "$(cat failing)"'
test_done