Most of our semantic patches in 'contrib/coccinelle/object_id.cocci'
turn calls of SHA1-specific functions into calls of their
corresponding object_id counterparts, e.g. sha1_to_hex() to
oid_to_hex(). These semantic patches look something like this:
@@
expression E1;
@@
- sha1_to_hex(E1.hash)
+ oid_to_hex(&E1)
and match the access to the 'hash' field in any data type, not only in
'struct object_id', and, consquently, can produce wrong
transformations.
Case in point is the recent hash function transition patch "rerere:
convert to use the_hash_algo" [1], which, among other things, renamed
'struct rerere_dir's 'sha1' field to 'hash', and then 'make
coccicheck' started to suggest the following wrong transformations for
'rerere.c' [2]:
- return sha1_to_hex(id->collection->hash);
+ return oid_to_hex(id->collection);
and
- DIR *dir = opendir(git_path("rr-cache/%s", sha1_to_hex(rr_dir->hash)));
+ DIR *dir = opendir(git_path("rr-cache/%s", oid_to_hex(rr_dir)));
Avoid such wrong transformations by tightening semantic patches in
'object_id.cocci' to match only type of or pointers to 'struct
object_id'.
[1] https://public-inbox.org/git/20181008215701.779099-15-sandals@crustytoothpaste.net/
[2] https://travis-ci.org/git/git/jobs/440463476#L580
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This rounds out the previous three patches, covering the
inequality logic for the "hash" variant of the functions.
As with the previous three, the accompanying code changes
are the mechanical result of applying the coccinelle patch;
see those patches for more discussion.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the flip side of the previous two patches: checking
for a non-zero oidcmp() can be more strictly expressed as
inequality. Like those patches, we write "!= 0" in the
coccinelle transformation, which covers by isomorphism the
more common:
if (oidcmp(E1, E2))
As with the previous two patches, this patch can be achieved
almost entirely by running "make coccicheck"; the only
differences are manual line-wrap fixes to match the original
code.
There is one thing to note for anybody replicating this,
though: coccinelle 1.0.4 seems to miss the case in
builtin/tag.c, even though it's basically the same as all
the others. Running with 1.0.7 does catch this, so
presumably it's just a coccinelle bug that was fixed in the
interim.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the partner patch to the previous one, but covering
the "hash" variants instead of "oid". Note that our
coccinelle rule is slightly more complex to avoid triggering
the call in hasheq().
I didn't bother to add a new rule to convert:
- hasheq(E1->hash, E2->hash)
+ oideq(E1, E2)
Since these are new functions, there won't be any such
existing callers. And since most of the code is already
using oideq, we're not likely to introduce new ones.
We might still see "!hashcmp(E1->hash, E2->hash)" from topics
in flight. But because our new rule comes after the existing
ones, that should first get converted to "!oidcmp(E1, E2)"
and then to "oideq(E1, E2)".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using the more restrictive oideq() should, in the long run,
give the compiler more opportunities to optimize these
callsites. For now, this conversion should be a complete
noop with respect to the generated code.
The result is also perhaps a little more readable, as it
avoids the "zero is equal" idiom. Since it's so prevalent in
C, I think seasoned programmers tend not to even notice it
anymore, but it can sometimes make for awkward double
negations (e.g., we can drop a few !!oidcmp() instances
here).
This patch was generated almost entirely by the included
coccinelle patch. This mechanical conversion should be
completely safe, because we check explicitly for cases where
oidcmp() is compared to 0, which is what oideq() is doing
under the hood. Note that we don't have to catch "!oidcmp()"
separately; coccinelle's standard isomorphisms make sure the
two are treated equivalently.
I say "almost" because I did hand-edit the coccinelle output
to fix up a few style violations (it mostly keeps the
original formatting, but sometimes unwraps long lines).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes we want to suppress a coccinelle transformation
inside a particular function. For example, in finding
conversions of hashcmp() to oidcmp(), we should not convert
the call in oidcmp() itself, since that would cause infinite
recursion. We write that like this:
@@
identifier f != oidcmp;
expression E1, E2;
@@
f(...) {...
- hashcmp(E1->hash, E2->hash)
+ oidcmp(E1, E2)
...}
to match the interior of any function _except_ oidcmp().
Unfortunately, this doesn't catch all cases (e.g., the one
in sequencer.c that this patch fixes). The problem, as
explained by one of the Coccinelle developers in [1], is:
For transformation, A ... B requires that B occur on every
execution path starting with A, unless that execution path
ends up in error handling code. (eg, if (...) { ...
return; }). Here your A is the start of the function. So
you need a call to hashcmp on every path through the
function, which fails when you add ifs.
[...]
Another issue with A ... B is that by default A and B
should not appear in the matched region. So your original
rule matches only the case where every execution path
contains exactly one call to hashcmp, not more than one.
One way to solve this is to put the pattern inside an
angle-bracket pattern like "<... P ...>", which allows zero
or more matches of P. That works (and is what this patch
does), but it has one drawback: it matches more than we care
about, and Coccinelle uses extra CPU. Here are timings for
"make coccicheck" before and after this patch:
[before]
real 1m27.122s
user 7m34.451s
sys 0m37.330s
[after]
real 2m18.040s
user 10m58.310s
sys 0m41.549s
That's not ideal, but it's more important for this to be
correct than to be fast. And coccicheck is already fairly
slow (and people don't run it for every single patch). So
it's an acceptable tradeoff.
There _is_ a better way to do it, which is to record the
position at which we find hashcmp(), and then check it
against the forbidden function list. Like:
@@
position p : script:python() { p[0].current_element != "oidcmp" };
expression E1,E2;
@@
- hashcmp@p(E1->hash, E2->hash)
+ oidcmp(E1, E2)
This is only a little slower than the current code, and does
the right thing in all cases. Unfortunately, not all builds
of Coccinelle include python support (including the ones in
Debian). Requiring it may mean that fewer people can easily
run the tool, which is worse than it simply being a little
slower.
We may want to revisit this decision in the future if:
- builds with python become more common
- we find more uses for python support that tip the
cost-benefit analysis
But for now this patch sticks with the angle-bracket
solution, and converts all existing cocci patches. This
fixes only one missed case in the current code, though it
makes a much better difference for some new rules I'm adding
(converting "!hashcmp()" to "hasheq()" misses over half the
possible conversions using the old form).
[1] https://public-inbox.org/git/alpine.DEB.2.21.1808240652370.2344@hadrien/
Helped-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A recent patch series renamed the get_commit_tree_from_graph method but
forgot to update the coccinelle script that exempted it from rules
regarding accesses to 'maybe_tree'. This fixes that oversight to bring
the coccinelle scripts back to a good state.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The semantic patch 'contrib/coccinelle/commit.cocci' added in
2e27bd7731 (treewide: replace maybe_tree with accessor methods,
2018-04-06) is supposed to "ensure that all references to the
'maybe_tree' member of struct commit are either mutations or accesses
through get_commit_tree()". So get_commit_tree() clearly must be able
to directly access the 'maybe_tree' member, and 'commit.cocci' has a
bit of a roundabout workaround to ensure that get_commit_tree()'s
direct access in its return statement is not transformed: after all
references to 'maybe_tree' have been transformed to a call to
get_commit_tree(), including the reference in get_commit_tree()
itself, the last rule transforms back a 'return get_commit_tree()'
statement, back then found only in get_commit_tree() itself, to a
direct access.
Unfortunately, already the very next commit shows that this workaround
is insufficient: 7b8a21dba1 (commit-graph: lazy-load trees for
commits, 2018-04-06) extends get_commit_tree() with a condition
directly accessing the 'maybe_tree' member, and Coccinelle with
'commit.cocci' promptly detects it and suggests a transformation to
avoid it. This transformation is clearly wrong, because calling
get_commit_tree() to access 'maybe_tree' _in_ get_commit_tree() would
obviously lead to recursion. Furthermore, the same commit added
another, more specialized getter function get_commit_tree_in_graph(),
whose legitimate direct access to 'maybe_tree' triggers a similar
wrong transformation suggestion.
Exclude both of these getter functions from the general rule in
'commit.cocci' that matches their direct accesses to 'maybe_tree'.
Also exclude load_tree_for_commit(), which, as static helper funcion
of get_commit_tree_in_graph(), has legitimate direct access to
'maybe_tree' as well.
The last rule transforming back 'return get_commit_tree()' statements
to direct accesses thus became unnecessary, remove it.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In anticipation of making trees load lazily, create a Coccinelle
script (contrib/coccinelle/commit.cocci) to ensure that all
references to the 'maybe_tree' member of struct commit are either
mutations or accesses through get_commit_tree() or
get_commit_tree_oid().
Apply the Coccinelle script to create the rest of the patch.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
353d84c537 (coccicheck: make transformation for strbuf_addf(sb, "...")
more precise) added a check to avoid transforming calls with format
strings which contain percent signs, as that would change the result.
It uses embedded Python code for that. Simplify this rule by using the
regular expression matching operator instead.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's a rule in strbuf.cocci for converting trivial uses of
strbuf_addf() to strbuf_addstr() in order to simplify the code and
improve performance a bit. Coccinelle 1.0.0~rc19.deb-3 on Travis CI
lets the "%s" in that rule match format strings like "%d" as well for
some reason, though, leading to invalid proposed patches.
Use the "format" keyword to let Coccinelle parse the format string and
match the conversion specifier with a trivial regular expression
instead. This works fine with both Coccinelle 1.0.0~rc19.deb-3 and
1.0.4.deb-3+b3 (the current version on Debian testing).
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Tested-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Transformations that hide multiplications can end up with an pair of
parentheses that is no longer needed. E.g. with a rule like this:
@@
expression E;
@@
- E * 2
+ double(E)
... we might get a patch like this:
- x = (a + b) * 2;
+ x = double((a + b));
Add a pair of parentheses to the preimage side of such rules.
Coccinelle will generate patches that remove them if they are present,
and it will still match expressions that lack them.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to COPY_ARRAY (introduced in 60566cbb58), add a safe and
convenient helper for moving potentially overlapping ranges of array
entries. It infers the element size, multiplies automatically and
safely to get the size in bytes, does a basic type safety check by
comparing element sizes and unlike memmove(3) it supports NULL
pointers iff 0 elements are to be moved.
Also add a semantic patch to demonstrate the helper's intended usage.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two rules for using FREE_AND_NULL in free.cocci, one for
pointer types and one for expressions. Both cause coccinelle to remove
empty lines and even newline characters between replacements for some
reason; consecutive "free(x);/x=NULL;" sequences end up as multiple
FREE_AND_NULL calls on the same time.
Remove the type rule, as the expression rule already covers it, and
rearrange the lines of the latter to place the addition of FREE_AND_NULL
between the removals, which causes coccinelle to leave surrounding
whitespace untouched.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A follow-up to the existing "type" rule added in an earlier
change. This catches some occurrences that are missed by the previous
rule.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An helper function to make it easier to append the result from
real_path() to a strbuf has been added.
* rs/strbuf-add-real-path:
strbuf: add strbuf_add_real_path()
cocci: use ALLOC_ARRAY
Add a function for appending the canonized absolute pathname of a given
path to a strbuf. It keeps the existing contents intact, as expected of
a function of the strbuf_add() family, while avoiding copying the result
if the given strbuf is empty. It's more consistent with the rest of the
strbuf API than strbuf_realpath(), which it's wrapping.
Also add a semantic patch demonstrating its intended usage and apply it
to the current tree. Using strbuf_add_real_path() instead of calling
strbuf_addstr() and real_path() avoids an extra copy to a static buffer.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a semantic patch for using ALLOC_ARRAY to allocate arrays and apply
the transformation on the current source tree. The macro checks for
multiplication overflow and infers the element size automatically; the
result is shorter and safer code.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new coccinelle rule that catches a check of !pointer before the
pointer is free(3)d, which most likely is a bug.
* rs/cocci-check-free-only-null:
cocci: detect useless free(3) calls
Add a semantic patch for removing checks that cause free(3) to only be
called with a NULL pointer, as that must be a programming mistake.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a macro for exchanging the values of variables. It allows users
to avoid repetition and takes care of the temporary variable for them.
It also makes sure that the storage sizes of its two parameters are the
same. Its memcpy(1) calls are optimized away by current compilers.
Also add a conservative semantic patch for transforming only swaps of
variables of the same type.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a function that returns a buffer containing the absolute path of its
argument and a semantic patch for its intended use. It avoids an extra
string copy to a static buffer.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve the rule to convert "unsigned char [20]" into "struct
object_id *" in contrib/coccinelle/
* rs/cocci:
cocci: avoid self-references in object_id transformations
The object_id functions oid_to_hex, oid_to_hex_r, oidclr, oidcmp, and
oidcpy are defined as wrappers of their legacy counterparts sha1_to_hex,
sha1_to_hex_r, hashclr, hashcmp, and hashcpy, respectively. Make sure
that the Coccinelle transformations for converting legacy function calls
are not applied to these wrappers themselves, which would result in
tautological declarations.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code cleanup.
* rs/cocci:
use strbuf_add_unique_abbrev() for adding short hashes, part 3
remove unnecessary NULL check before free(3)
coccicheck: make transformation for strbuf_addf(sb, "...") more precise
use strbuf_add_unique_abbrev() for adding short hashes, part 2
use strbuf_addstr() instead of strbuf_addf() with "%s", part 2
gitignore: ignore output files of coccicheck make target
use strbuf_addstr() for adding constant strings to a strbuf, part 2
add coccicheck make target
contrib/coccinelle: fix semantic patch for oid_to_hex_r()
d64ea0f83b ("git-compat-util: add xstrdup_or_null helper",
2015-01-12) added a handy wrapper that allows us to get a duplicate
of a string or NULL if the original is NULL, but a handful of
codepath predate its introduction or just weren't aware of it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We call "qsort(array, nelem, sizeof(array[0]), fn)", and most of
the time third parameter is redundant. A new QSORT() macro lets us
omit it.
* rs/qsort:
show-branch: use QSORT
use QSORT, part 2
coccicheck: use --all-includes by default
remove unnecessary check before QSORT
use QSORT
add QSORT
free(3) handles NULL pointers just fine. Add a semantic patch for
removing unnecessary NULL checks before calling this function, and
apply it on the code base.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up with help from coccinelle tool continues.
* rs/cocci:
coccicheck: make transformation for strbuf_addf(sb, "...") more precise
use strbuf_add_unique_abbrev() for adding short hashes, part 2
use strbuf_addstr() instead of strbuf_addf() with "%s", part 2
gitignore: ignore output files of coccicheck make target
We can replace strbuf_addf() calls that just add a simple string with
calls to strbuf_addstr() to make the intent clearer. We need to be
careful if that string contains printf format specifications like %%,
though, as a simple replacement would change the output.
Add checks to the semantic patch to make sure we only perform the
transformation if the second argument is a string constant (possibly
translated) that doesn't contain any percent signs.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a semantic patch for removing checks similar to the one that QSORT
already does internally and apply it to the code base.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
size of the array elements and supports the convention of initializing
empty arrays with a NULL pointer, which we use in some places.
Calling qsort(3) directly with a NULL pointer is undefined -- even with
an element count of zero -- and allows the compiler to optimize away any
following NULL checks. Using the macro avoids such surprises.
Add a semantic patch as well to demonstrate the macro's usage and to
automate the transformation of trivial cases.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Call strbuf_add_unique_abbrev() to add abbreviated hashes to strbufs
instead of taking detours through find_unique_abbrev() and its static
buffer. This is shorter and a bit more efficient.
1eb47f167d already converted six cases,
this patch covers three more.
A semantic patch for Coccinelle is included for easier checking for
new cases that might be introduced in the future.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace uses of strbuf_addf() for adding strings with more lightweight
strbuf_addstr() calls. This is shorter and makes the intent clearer.
bc57b9c0cc already converted three cases,
this patch covers two more.
A semantic patch for Coccinelle is included for easier checking for
new cases that might be introduced in the future.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a semantic patch for converting certain calls of memcpy(3) to
COPY_ARRAY() and apply that transformation to the code base. The result
is
shorter and safer code. For now only consider calls where source and
destination have the same type, or in other words: easy cases.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace uses of strbuf_addf() for adding strings with more lightweight
strbuf_addstr() calls. This makes the intent clearer and avoids
potential issues with printf format specifiers.
02962d3684 already converted six cases,
this patch covers eleven more.
A semantic patch for Coccinelle is included for easier checking for
new cases that might be introduced in the future.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both sha1_to_hex_r() and oid_to_hex_r() take two parameters, so use two
expressions in the semantic patch for transforming calls of the former
to the latter one.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Coccinelle (http://coccinelle.lip6.fr/) is a program which performs
mechanical transformations on C programs using semantic patches. These
semantic patches can be used to implement automatic refactoring and
maintenance tasks.
Add a set of basic semantic patches to convert common patterns related
to the struct object_id transformation, as well as a README.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>