git-commit-vandalism/contrib/coccinelle
Ævar Arnfjörð Bjarmason 6fae3aaf22 spatchcache: add a ccache-alike for "spatch"
Add a rather trivial "spatchcache". With it, running e.g.:

	make cocciclean
	make contrib/coccinelle/free.cocci.patch \
		SPATCH=contrib/coccinelle/spatchcache \
		SPATCH_FLAGS=--very-quiet

is cut down from ~20s to ~5s on my system. Much of the remaining time
is either fixable shell overhead, or spent on the around 40 files we
"CANTCACHE" (see the implementation).

This uses "redis" as a cache by default, but it's configurable. See
the embedded documentation.

This is *not* like ccache in that we won't cache failed spatch
invocations, or those where spatch suggests changes for us. Those
cases are so rare that I didn't think it was worth the bother; by far
the most common case is that it has no suggested changes. We'll also
refuse to cache any "spatch" invocation that has output on stderr,
which means that "--very-quiet" must be added to "SPATCH_FLAGS".

Because we narrow the cache to that case, we don't need to save away stdout,
stderr & the exit code. We simply cache the cases where we had no
suggested changes.
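
The caching policy described above can be sketched as a small shell
predicate. This is only an illustration, not the code in "spatchcache":
the function name "should_cache" and its three arguments are
hypothetical, and it assumes that an empty stdout means "spatch" had no
suggested changes.

```shell
#!/bin/sh
# Hypothetical helper, NOT taken from the real "spatchcache" script.
#   $1: exit code of the "spatch" run
#   $2: file holding its stdout (assumed empty when nothing is suggested)
#   $3: file holding its stderr
should_cache () {
	test "$1" -eq 0 || return 1	# failed runs are never cached
	test ! -s "$2" || return 1	# suggested changes are never cached
	test ! -s "$3" || return 1	# any stderr output prevents caching
	return 0
}
```

Since only the "no suggested changes" outcome passes, a cache entry
only needs to record that a given input was seen; no output or exit
code has to be stored alongside it.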

Another benchmark is to compare this with the previous
SPATCH_BATCH_SIZE=N, as noted in [1]. Before this (on my 8 core system) running:

	make clean; time make contrib/coccinelle/array.cocci.patch SPATCH_BATCH_SIZE=0

Would take 33s, but with the preceding changes running without this
"spatchcache" is slightly slower, or around 35s:

	make clean; time make contrib/coccinelle/array.cocci.patch

Now doing the same with SPATCH=contrib/coccinelle/spatchcache will
take around 6s, but we'll need to compile the *.o files first to take
full advantage of it (which can be fast with "ccache"):

	make clean; make; time make contrib/coccinelle/array.cocci.patch SPATCH=contrib/coccinelle/spatchcache

1. https://lore.kernel.org/git/YwdRqP1CyUAzCEn2@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:22:16 -04:00
tests cocci: generalize "unused" rule to cover more than "strbuf" 2022-07-06 12:24:43 -07:00
.gitignore cocci: make "coccicheck" rule incremental 2022-11-02 21:22:16 -04:00
array.cocci cocci: avoid normalization rules for memcpy 2022-07-10 14:52:05 -07:00
commit.cocci commit: move members graph_pos, generation to a slab 2020-06-17 14:37:30 -07:00
equals-null.cocci contrib/coccinnelle: add equals-null.cocci 2022-05-02 09:47:55 -07:00
flex_alloc.cocci cocci: FLEX_ALLOC_MEM to FLEX_ALLOC_STR 2019-04-04 18:22:30 +09:00
free.cocci cocci: add and apply free_commit_list() rules 2022-04-13 23:56:08 -07:00
hashmap.cocci cocci rules: remove <id>'s from rules that don't need them 2022-11-02 21:22:16 -04:00
object_id.cocci cocci: retire is_null_sha1() rule 2022-06-07 15:53:24 -07:00
preincr.cocci cocci rules: remove <id>'s from rules that don't need them 2022-11-02 21:22:16 -04:00
qsort.cocci remove unnecessary check before QSORT 2016-09-29 15:42:18 -07:00
README spatchcache: add a ccache-alike for "spatch" 2022-11-02 21:22:16 -04:00
spatchcache spatchcache: add a ccache-alike for "spatch" 2022-11-02 21:22:16 -04:00
strbuf.cocci cocci rules: remove <id>'s from rules that don't need them 2022-11-02 21:22:16 -04:00
swap.cocci cocci rules: remove <id>'s from rules that don't need them 2022-11-02 21:22:16 -04:00
the_repository.pending.cocci cocci rules: remove unused "F" metavariable from pending rule 2022-11-02 21:22:15 -04:00
unused.cocci cocci: generalize "unused" rule to cover more than "strbuf" 2022-07-06 12:24:43 -07:00
xcalloc.cocci fix xcalloc() argument order 2021-03-08 09:45:04 -08:00
xopen.cocci index-pack: use xopen in init_thread 2021-09-10 14:22:50 -07:00
xstrdup_or_null.cocci cocci: drop bogus xstrdup_or_null() rule 2022-04-30 22:23:11 -07:00

This directory provides examples of Coccinelle (http://coccinelle.lip6.fr/)
semantic patches that might be useful to developers.

There are two types of semantic patches:

 * Using the semantic transformation to check for bad patterns in the code.
   The target 'make coccicheck' is designed to check for these patterns and
   it is expected that any resulting patch indicates a regression.
   The patches resulting from 'make coccicheck' are small and infrequent,
   so once they are found, they can be sent to the mailing list as per usual.

   Example for introducing new patterns:
   67947c34ae (convert "hashcmp() != 0" to "!hasheq()", 2018-08-28)
   b84c783882 (fsck: s/++i > 1/i++/, 2018-10-24)

   Example of fixes using this approach:
   248f66ed8e (run-command: use strbuf_addstr() for adding a string to
               a strbuf, 2018-03-25)
   f919ffebed (Use MOVE_ARRAY, 2018-01-22)

   These types of semantic patches are usually part of testing, cf.
   0860a7641b (travis-ci: fail if Coccinelle static analysis found something
               to transform, 2018-07-23)

 * Using semantic transformations in large scale refactorings throughout
   the code base.

   When such a semantic patch is applied to produce a real patch that is
   sent to the mailing list in the usual way, the patch would be expected
   to have a lot of textual and semantic conflicts, as such large scale
   refactorings change function signatures that are used widely in the
   code base.
   A textual conflict would arise if surrounding code near any call of such
   function changes. A semantic conflict arises when other patch series in
   flight introduce calls to such functions.

   So to aid these large scale refactorings, semantic patches can be used.
   However we do not want to store them in the same place as the checks for
   bad patterns, as then automated builds would fail.
   That is why semantic patches 'contrib/coccinelle/*.pending.cocci'
   are ignored for checks, and can be applied using 'make coccicheck-pending'.

   This allows exposing plans of pending large scale refactorings without
   impacting the bad pattern checks.
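
The split between the two kinds of semantic patches is driven by the
file name alone. As a rough sketch of that convention (the helper name
"cocci_kind" is made up for illustration and is not how the Makefile
implements it):

```shell
#!/bin/sh
# Hypothetical helper: classify a semantic patch by its file name.
# "check" files are covered by 'make coccicheck', "pending" files
# only by 'make coccicheck-pending'.
cocci_kind () {
	case "$1" in
	*.pending.cocci)
		echo pending
		;;
	*.cocci)
		echo check
		;;
	*)
		echo other
		;;
	esac
}
```

For the files in this directory, "free.cocci" would be classified as
"check" and "the_repository.pending.cocci" as "pending".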

Git-specific tips & things to know about how we run "spatch":

 * The "make coccicheck" target will piggy-back on
   "COMPUTE_HEADER_DEPENDENCIES". If you've built a given object file
   the "coccicheck" target will consider its dependencies to decide if
   it needs to re-run on the corresponding source file.

   This means that a "make coccicheck" will re-compile object files
   before running. This might be unexpected, but speeds up the run in
   the common case, as e.g. a change to "column.h" won't require all
   coccinelle rules to be re-run against "grep.c" (or another file
   that happens not to use "column.h").

   To disable this behavior use the "SPATCH_USE_O_DEPENDENCIES=NoThanks"
   flag.

 * To speed up our rules the "make coccicheck" target will by default
   concatenate all of the *.cocci files here into an "ALL.cocci", and
   apply it to each source file.

   This makes the run faster, as we don't need to run each rule
   against each source file. See the Makefile for further discussion;
   this behavior can be disabled with "SPATCH_CONCAT_COCCI=".

   But since they're concatenated any <id> in the <rulename> (e.g. "@
   my_name", vs. anonymous "@@") needs to be unique across all our
   *.cocci files. You should only need to name rules if other rules
   depend on them (currently only one rule is named).

 * To speed up incremental runs even more use the "spatchcache" tool
   in this directory as your "SPATCH". It aims to be a "ccache" for
   coccinelle, and piggy-backs on "COMPUTE_HEADER_DEPENDENCIES".

   It caches in Redis by default, see its source for a how-to.

   In one setup with a primed cache, "make coccicheck" followed by a
   "make clean && make" takes around 10s to run, compared to 2m30s
   without the cache (both with the default of "SPATCH_CONCAT_COCCI=Y").

   With "SPATCH_CONCAT_COCCI=" the total runtime is around 6m, sped
   up to ~1m with "spatchcache".

   Most of the 10s (or ~1m) is spent on re-running "spatch" on
   files we couldn't cache, as we didn't compile them (in contrib/*
   and compat/* mostly).

   The absolute times will differ for you, but the relative speedup
   from caching should be on that order.
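
Because the rules are concatenated into "ALL.cocci", a clashing rule
name is easy to introduce by accident. The following is a rough sketch
of a check for that; the function name "dup_rule_names" is hypothetical
and the pattern is a simplification that only recognizes plain
"@ name @" or "@ name depends on ... @" headers (it deliberately skips
"@@" and "@script:...@" lines):

```shell
#!/bin/sh
# Hypothetical helper: print rule names that are used more than once
# across the given *.cocci files. Simplified header matching, see above.
dup_rule_names () {
	sed -n 's/^@[[:space:]]*\([A-Za-z_][A-Za-z0-9_]*\)[[:space:]@].*/\1/p' "$@" |
	sort |
	uniq -d
}
```

Running it as "dup_rule_names contrib/coccinelle/*.cocci" should print
nothing when every named rule is unique.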