Commit Graph

52748 Commits

Author SHA1 Message Date
Eric Sunshine
878f988350 t/test-lib: teach --chain-lint to detect broken &&-chains in subshells
The --chain-lint option detects broken &&-chains by forcing the test to
exit early (as the very first step) with a sentinel value. If that
sentinel is the test's overall exit code, then the &&-chain is intact;
if not, then the chain is broken. Unfortunately, this detection does not
extend to &&-chains within subshells even when the subshell itself is
properly linked into the outer &&-chain.

Address this shortcoming by feeding the body of the test to a
lightweight "linter" which can peer inside subshells and identify broken
&&-chains by pure textual inspection. Although the linter does not
actually parse shell scripts, it has enough knowledge of shell syntax to
reliably deal with formatting style variations (as evolved over the
years) and to avoid being fooled by non-shell content (such as inside
here-docs and multi-line strings). It recognizes modern subshell
formatting:

    statement1 &&
    (
        statement2 &&
        statement3
    ) &&
    statement4

as well as old-style:

    statement1 &&
    (statement2 &&
     statement3) &&
    statement4

Heuristics are employed to properly identify the extent of a subshell
formatted in the old-style since a number of legitimate constructs may
superficially appear to close the subshell even though they don't. For
example, it understands that neither "x=$(command)" nor "case $x in *)"
end a subshell, despite the ")" at the end of line.

Due to limitations of the tool used ('sed') and its inherent
line-by-line processing, only subshells one level deep are handled, as
well as one-liner subshells one level below that. Subshells deeper than
that or multi-line subshells at level two are passed through as-is, thus
&&-chains in their bodies are not checked.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-17 09:15:14 -07:00
SZEDER Gábor
9500526284 t5608: fix broken &&-chain
This was missed by the previous clean-ups.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-17 09:12:59 -07:00
Jules Maselbas
636f3d7ac5 send-email: fix tls AUTH when sending batch
The variable smtp_encryption must keep it's value between two batches.
Otherwise the authentication is skipped after the first batch.

Signed-off-by: Jules Maselbas <jules.maselbas@grenoble-inp.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 15:02:36 -07:00
Eric Sunshine
a0a630192d t/check-non-portable-shell: detect "FOO=bar shell_func"
One-shot environment variable assignments, such as 'FOO' in
"FOO=bar cmd", exist only during the invocation of 'cmd'. However, if
'cmd' happens to be a shell function, then 'FOO' is assigned in the
executing shell itself, and that assignment remains until the process
exits (unless explicitly unset). Since this side-effect of
"FOO=bar shell_func" is unlikely to be intentional, detect and report
such usage.

To distinguish shell functions from other commands, perform a pre-scan
of shell scripts named as input, gleaning a list of function names by
recognizing lines of the form (loosely matching whitespace):

    shell_func () {

and later report suspect lines of the form (loosely matching quoted
values):

    FOO=bar [BAR=foo ...] shell_func

Also take care to stitch together incomplete lines (those ending with
"\") since suspect invocations may be split over multiple lines:

    FOO=bar BAR=foo \
    shell_func

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:55:01 -07:00
Eric Sunshine
c433600593 t/check-non-portable-shell: make error messages more compact
Error messages emitted by this linting script are long and noisy,
consisting of several sections:

    <test-script>:<line#>: error: <explanation>: <failed-shell-text>

The line of failed shell text, usually coming from within a test body,
is often indented by one or two TABs, with the result that the actual
(important) text is separated from <explanation> by a good deal of empty
space. This can make for a difficult read, especially on typical
80-column terminals.

Make the messages more compact and perhaps a bit easier to digest by
folding out the leading whitespace from <failed-shell-text>.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:55:01 -07:00
Eric Sunshine
ef2d2accef t/check-non-portable-shell: stop being so polite
Error messages emitted by this linting script are long and noisy,
consisting of several sections:

    <test-script>:<line#>: error: <explanation>: <failed-shell-text>

Many problem explanations ask the reader to "please" use a suggested
alternative, however, such politeness is unnecessary and just adds to
the noise and length of the line, so drop "please" to make the message a
bit more concise.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:55:01 -07:00
Eric Sunshine
079b087c8e t6046/t9833: fix use of "VAR=VAL cmd" with a shell function
Unlike "FOO=bar cmd" one-shot environment variable assignments
which exist only for the invocation of 'cmd', those assigned by
"FOO=bar shell_func" exist within the running shell and continue to
do so until the process exits (or are explicitly unset). It is
unlikely that this behavior was intended by the test author.

In these particular tests, the "FOO=bar shell_func" invocations are
already in subshells, so the assignments don't last too long, don't
appear to harm subsequent commands in the same subshells, and don't
affect other tests in the same scripts, however, the usage is
nevertheless misleading and poor practice, so fix the tests to assign
and export the environment variables in the usual fashion.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:55:01 -07:00
Junio C Hamano
f44a7442f6 Merge branch 'jc/t3404-one-shot-export-fix' into es/test-lint-one-shot-export
* jc/t3404-one-shot-export-fix:
  t3404: fix use of "VAR=VAL cmd" with a shell function
2018-07-16 14:54:55 -07:00
Jonathan Tan
42cc7485a2 negotiator/skipping: skip commits during fetch
Introduce a new negotiation algorithm used during fetch that skips
commits in an effort to find common ancestors faster. The skips grow
similarly to the Fibonacci sequence as the commit walk proceeds further
away from the tips. The skips may cause unnecessary commits to be
included in the packfile, but the negotiation step typically ends more
quickly.

Usage of this algorithm is guarded behind the configuration flag
fetch.negotiationAlgorithm.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:51:12 -07:00
Eric Sunshine
f9f7c116a3 t9119: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine
cff4243db9 t9000-t9999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine
e974e06de0 t7000-t7999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine
c8ce3763ff t6000-t6999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine
51b85471af t5000-t5999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine
f957f03b60 t4000-t4999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine
b6c32f63f3 t3030: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine
3ea6737993 t3000-t3999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine
2c2d0f9f47 t2000-t2999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine
f2deabfcb6 t1000-t1999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine
75651fd783 t0000-t0999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine
794165cb17 t9814: simplify convoluted check that command correctly errors out
This test uses a convoluted method to verify that "p4 help" errors
out when asked for help about an unknown command. In doing so, it
intentionally breaks the &&-chain. Simplify by employing the typical
"! command" idiom and a normal &&-chain instead.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine
be8c48d4c4 t9001: fix broken "invoke hook" test
This test has been dysfunctional since it was added by 6489660b4b
(send-email: support validate hook, 2017-05-12), however, the problem
went unnoticed due to a broken &&-chain late in the test.

The test wants to verify that a non-zero exit code from the
'sendemail-validate' hook causes git-send-email to abort with a
particular error message. A command which is expected to fail should be
run with 'test_must_fail', however, the test neglects to do so.

Fix this problem, as well as the broken &&-chain behind which the
problem hid.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine
d964def526 t7810: use test_expect_code() instead of hand-rolled comparison
This test manually checks the exit code of git-grep for a particular
value. In doing so, it intentionally breaks the &&-chain. Modernize the
test by taking advantage of test_expect_code() and a normal &&-chain.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Eric Sunshine
adc73318fe t7400: fix broken "submodule add/reconfigure --force" test
This test has been dysfunctional since it was added by 619acfc78c
(submodule add: extend force flag to add existing repos, 2016-10-06),
however, two problems early in the test went unnoticed due to a broken
&&-chain later in the test.

First, it tries configuring the submodule with repository "bogus-url",
however, "git submodule add" insists that the repository be either an
absolute URL or a relative pathname requiring prefix "./" or "../" (this
is true even with --force), but "bogus-url" does not meet those
criteria, thus the command fails.

Second, it then tries configuring a submodule with a path which is
.gitignore'd, which is disallowed. This restriction can be overridden
with --force, but the test neglects to use that option.

Fix both problems, as well as the broken &&-chain behind which they hid.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Johannes Schindelin
ede8d89bb1 vcbuild/README: update to accommodate for missing common-cmds.h
In 60f487ac0e (Remove common-cmds.h, 2018-05-10), we forgot to adjust
this README when removing the common-cmds.h file.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:30:34 -07:00
brian m. carlson
580f0980e1 pretty: switch hard-coded constants to the_hash_algo
Switch several hard-coded constants into expressions based either on
GIT_MAX_HEXSZ or the_hash_algo.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:40 -07:00
brian m. carlson
94b5e093f9 sha1-file: convert constants to uses of the_hash_algo
Convert one use of 20 and several uses of GIT_SHA1_HEXSZ into references
to the_hash_algo.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:40 -07:00
brian m. carlson
2ed1960a77 log-tree: switch GIT_SHA1_HEXSZ to the_hash_algo->hexsz
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson
02afca1ee4 diff: switch GIT_SHA1_HEXSZ to use the_hash_algo
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson
b7f20f7204 builtin/merge-recursive: make hash independent
Use GIT_MAX_HEXSZ instead of GIT_SHA1_HEXSZ for an allocation so that it
is sufficiently large.  Switch a comparison to use the_hash_algo to
determine the length of a hex object ID.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson
ab47df2d9a builtin/merge: switch to use the_hash_algo
Switch uses of GIT_SHA1_HEXSZ to use the_hash_algo instead.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson
5188eb5d8e builtin/fmt-merge-msg: make hash independent
Convert several uses of GIT_SHA1_HEXSZ into references to the_hash_algo.
Switch other uses into a use of parse_oid_hex and uses of its computed
pointer.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson
fe04ccf7ca builtin/update-index: simplify parsing of cacheinfo
Switch from using get_oid_hex to parse_oid_hex to simplify pointer
operations and avoid the need for a hash-related constant.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson
1928c9449e builtin/update-index: convert to using the_hash_algo
Switch from using GIT_SHA1_HEXSZ to the_hash_algo to make the parsing of
the index information hash independent.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson
2ae2e2a1ca refs/files-backend: use the_hash_algo for writing refs
In order to ensure we write the correct amount, use the_hash_algo to
find the correct number of bytes for the current hash.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson
7b38efad5e sha1-name: use the_hash_algo when parsing object names
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson
4b048c917f strbuf: allocate space with GIT_MAX_HEXSZ
In order to be sure we have enough space to use with any hash algorithm,
use GIT_MAX_HEXSZ to allocate space.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson
2770ccbdb2 commit: express tree entry constants in terms of the_hash_algo
Specify these constants in terms of the size of the hash algorithm
currently in use.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson
d9cd734990 hex: switch to using the_hash_algo
Instead of using the GIT_SHA1_* constants, switch to using the_hash_algo
to convert object IDs to and from hex format.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson
83e4b7571c tree-walk: replace hard-coded constants with the_hash_algo
Remove the hard-coded 20-based values and replace them with uses of
the_hash_algo.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:39 -07:00
brian m. carlson
509f6f62a4 cache: update object ID functions for the_hash_algo
Most of our code has been converted to use struct object_id for object
IDs.  However, there are some places that still have not, and there are
a variety of places that compare equivalently sized hashes that are not
object IDs.  All of these hashes are artifacts of the internal hash
algorithm in use, and when we switch to NewHash for object storage, all
of these uses will also switch.

Update the hashcpy, hashclr, and hashcmp functions to use the_hash_algo,
since they are used in a variety of places to copy and manipulate
buffers that need to move data into or out of struct object_id.  This
has the effect of making the corresponding oid* functions use
the_hash_algo as well.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:27:38 -07:00
Jeff King
022d2ac1f3 blame: prefer xsnprintf to strcpy for colors
Our color buffers are all COLOR_MAXLEN, which fits the
largest possible color. So we can never overflow the buffer
by copying an existing color. However, using strcpy() makes
it harder to audit the code-base for calls that _are_
problems. We should use something like xsnprintf(), which
shows the reader that we expect this never to fail (and
provides a run-time assertion if it does, just in case).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 13:59:39 -07:00
Jeff Hostetler
75459410ed json_writer: new routines to create JSON data
Add "struct json_writer" and a series of jw_ routines to compose JSON
data into a string buffer.  The resulting string may then be printed by
commands wanting to support a JSON-like output format.

The json_writer is limited to correctly formatting structured data for
output.  It does not attempt to build an object model of the JSON data.

We say "JSON-like" because we do not enforce the Unicode (usually UTF-8)
requirement on string fields.  Internally, Git does not necessarily have
Unicode/UTF-8 data for most fields, so it is currently unclear the best
way to enforce that requirement.  For example, on Linux pathnames can
contain arbitrary 8-bit character data, so a command like "status" would
not know how to encode the reported pathnames.  We may want to revisit
this (or double encode such strings) in the future.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: René Scharfe <l.s.r@web.de>
Helped-by: Wink Saville <wink@saville.com>
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 13:55:39 -07:00
Jonathan Tan
8c4cc32689 tag: don't warn if target is missing but promised
deref_tag() prints a warning if the object that a tag refers to does not
exist. However, when a partial clone has an annotated tag from its
promisor remote, but not the object that it refers to, printing a
warning on such a tag is incorrect.

This occurs, for example, when the checkout that happens after a partial
clone causes some objects to be fetched - and as part of the fetch, all
local refs are read. The test included in this patch demonstrates this
situation.

Therefore, do not print a warning in this case.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 12:56:14 -07:00
Jonathan Tan
dc0a13f681 revision: tolerate promised targets of tags
In handle_commit(), it is fatal for an annotated tag to point to a
non-existent object. --exclude-promisor-objects should relax this rule
and allow non-existent objects that are promisor objects, but this is
not the case. Update handle_commit() to tolerate this situation.

This was observed when cloning from a repository with an annotated tag
pointing to a blob. The test included in this patch demonstrates this
case.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 12:56:14 -07:00
brian m. carlson
ab5e67d751 sequencer: pass absolute GIT_WORK_TREE to exec commands
The sequencer currently passes GIT_DIR, but not GIT_WORK_TREE, to exec
commands.  In that configuration, we assume that whatever directory
we're in is the top level of the work tree, and git rev-parse
--show-toplevel responds accordingly.  However, when we're in a
subdirectory, that isn't correct: we respond with the subdirectory as
the top level, resulting in unexpected behavior.

Ensure that we pass GIT_WORK_TREE as well as GIT_DIR so that git
operations within subdirectories work correctly.

Note that we are guaranteed to have a work tree in this case: the
relevant sequencer functions are called only from revert, cherry-pick,
and rebase--helper; all of these commands require a working tree.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 11:16:45 -07:00
Aaron Schrab
02cfd14920 sequencer: use configured comment character
Use the configured comment character when generating comments about
branches in a todo list.  Failure to honor this configuration causes a
failure to parse the resulting todo list.

Setting core.commentChar to "auto" will not be honored here, and the
previously configured or default value will be used instead. But, since
the todo list will consist of only generated content, there should not
be any non-comment lines beginning with that character.

Signed-off-by: Aaron Schrab <aaron@schrab.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 11:04:56 -07:00
Jeff King
64eb14d310 fsck: downgrade gitmodulesParse default to "info"
We added an fsck check in ed8b10f631 (fsck: check
.gitmodules content, 2018-05-02) as a defense against the
vulnerability from 0383bbb901 (submodule-config: verify
submodule names as paths, 2018-04-30). With the idea that
up-to-date hosting sites could protect downstream unpatched
clients that fetch from them.

As part of that defense, we reject any ".gitmodules" entry
that is not syntactically valid. The theory is that if we
cannot even parse the file, we cannot accurately check it
for vulnerabilities. And anybody with a broken .gitmodules
file would eventually want to know anyway.

But there are a few reasons this is a bad tradeoff in
practice:

 - for this particular vulnerability, the client has to be
   able to parse the file. So you cannot sneak an attack
   through using a broken file, assuming the config parsers
   for the process running fsck and the eventual victim are
   functionally equivalent.

 - a broken .gitmodules file is not necessarily a problem.
   Our fsck check detects .gitmodules in _any_ tree, not
   just at the root. And the presence of a .gitmodules file
   does not necessarily mean it will be used; you'd have to
   also have gitlinks in the tree. The cgit repository, for
   example, has a file named .gitmodules from a
   pre-submodule attempt at sharing code, but does not
   actually have any gitlinks.

 - when the fsck check is used to reject a push, it's often
   hard to work around. The pusher may not have full control
   over the destination repository (e.g., if it's on a
   hosting server, they may need to contact the hosting
   site's support). And the broken .gitmodules may be too
   far back in history for rewriting to be feasible (again,
   this is an issue for cgit).

So we're being unnecessarily restrictive without actually
improving the security in a meaningful way. It would be more
convenient to downgrade this check to "info", which means
we'd still comment on it, but not reject a push. Site admins
can already do this via config, but we should ship sensible
defaults.

There are a few counterpoints to consider in favor of
keeping the check as an error:

 - the first point above assumes that the config parsers for
   the victim and the fsck process are equivalent. This is
   pretty true now, but as time goes on will become less so.
   Hosting sites are likely to upgrade their version of Git,
   whereas vulnerable clients will be stagnant (if they did
   upgrade, they'd cease to be vulnerable!). So in theory we
   may see drift over time between what two config parsers
   will accept.

   In practice, this is probably OK. The config format is
   pretty established at this point and shouldn't change a
   lot. And the farther we get from the announcement of the
   vulnerability, the less interesting this extra layer of
   protection becomes. I.e., it was _most_ valuable on day
   0, when everybody's client was still vulnerable and
   hosting sites could protect people. But as time goes on
   and people upgrade, the population of vulnerable clients
   becomes smaller and smaller.

 - In theory this could protect us from other
   vulnerabilities in the future. E.g., .gitmodules are the
   only way for a malicious repository to feed data to the
   config parser, so this check could similarly protect
   clients from a future (to-be-found) bug there.

   But that's trading a hypothetical case for real-world
   pain today. If we do find such a bug, the hosting site
   would need to be updated to fix it, too. At which point
   we could figure out whether it's possible to detect
   _just_ the malicious case without hurting existing
   broken-but-not-evil cases.

 - Until recently, we hadn't made any restrictions on
   .gitmodules content. So now in tightening that we're
   hitting cases where certain things used to work, but
   don't anymore. There's some moderate pain now. But as
   time goes on, we'll see more (and more varied) cases that
   will make tightening harder in the future. So there's
   some argument for putting rules in place _now_, before
   users grow more cases that violate them.

   Again, this is trading pain now for hypothetical benefit
   in the future. And if we try hard in the future to keep
   our tightening to a minimum (i.e., rejecting true
   maliciousness without hurting broken-but-not-evil repos),
   then that reduces even the hypothetical benefit.

Considering both sets of arguments, it makes sense to loosen
this check for now.

Note that we have to tweak the test in t7415 since fsck will
no longer consider this a fatal error. But we still check
that it reports the warning, and that we don't get the
spurious error from the config code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 10:57:23 -07:00
Jeff King
0d68764d94 fsck: split ".gitmodules too large" error from parse failure
Since ed8b10f631 (fsck: check .gitmodules content,
2018-05-02), we'll report a gitmodulesParse error for two
conditions:

  - a .gitmodules entry is not syntactically valid

  - a .gitmodules entry is larger than core.bigFileThreshold

with the intent that we can detect malicious files and
protect downstream clients. E.g., from the issue in
0383bbb901 (submodule-config: verify submodule names as
paths, 2018-04-30).

But these conditions are actually quite different with
respect to that bug:

 - a syntactically invalid file cannot trigger the problem,
   as the victim would barf before hitting the problematic
   code

 - a too-big .gitmodules _can_ trigger the problem. Even
   though it is obviously silly to have a 500MB .gitmodules
   file, the submodule code will happily parse it if you
   have enough memory.

So it may be reasonable to configure their severity
separately. Let's add a new class for the "too large" case
to allow that.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 10:57:22 -07:00
Derrick Stolee
b18ef13a3f coccinelle: update commit.cocci
A recent patch series renamed the get_commit_tree_from_graph method but
forgot to update the coccinelle script that exempted it from rules
regarding accesses to 'maybe_tree'. This fixes that oversight to bring
the coccinelle scripts back to a good state.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 10:54:31 -07:00