Partially revert the support for multiple hash functions to regain
hash comparison performance; we'd think of a way to do this better
in the next cycle.
* jk/hashcmp-optim-for-2.19:
hashcmp: assert constant hash size
Test fixes.
* sg/test-must-be-empty:
tests: use 'test_must_be_empty' instead of 'test_cmp <empty> <out>'
tests: use 'test_must_be_empty' instead of 'test_cmp /dev/null <out>'
tests: use 'test_must_be_empty' instead of 'test ! -s'
tests: use 'test_must_be_empty' instead of '! test -s'
"git cmd -h" updates.
* rs/opt-updates:
parseopt: group literal string alternatives in argument help
remote: improve argument help for add --mirror
checkout-index: improve argument help for --stage
"git help --config" (which is used in command line completion)
missed the configuration variables not described in the main
config.txt file but are described in another file that is included
by it, which has been corrected.
* nd/complete-config-vars:
generate-cmdlist.sh: collect config from all config.txt files
"git branch --list" learned to take the default sort order from the
'branch.sort' configuration variable, just like "git tag --list"
pays attention to 'tag.sort'.
* sm/branch-sort-config:
branch: support configuring --sort via .gitconfig
The meaning of the possible values the "core.checkStat"
configuration variable can take were not adequately documented,
which has been fixed.
* nd/config-core-checkstat-doc:
config.txt: clarify core.checkStat
The --file option to grep isn't in POSIX[1], but -f is[1]. Let's check
for that in the future, and fix the portability regression in
f237c8b6fe ("commit-graph: implement git-commit-graph write",
2018-04-02) that broke e.g. AIX.
1. http://pubs.opengroup.org/onlinepubs/009695399/utilities/grep.html
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test guarded by PERLJSON added in 75459410ed ("json_writer: new
routines to create JSON data", 2018-07-13) assumed that a JSON boolean
value like "true" or "false" would be represented as "1" or "0" in
Perl.
This behavior can't be relied upon, e.g. with JSON.pm 2.50 and
JSON::PP. A JSON::PP::Boolean object will be represented as "true"
or "false". To work around this let's check if we have any refs left
after we check for hashes and arrays, assume those are JSON objects,
and coerce them to a known boolean value.
The behavior of this test still looks odd to me. Why implement our own
ad-hoc encoder just for some one-off test, as opposed to say Perl's
own Data::Dumper with Sortkeys et al? But with this change it works,
so let's leave it be.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve the portability of chainlint by using shorter labels. On
AIX sed will complain about:
sed: 0602-417 The label :hereslurp is greater than eight
characters
This, in combination with the previous fix to this file makes
GIT_TEST_CHAIN_LINT=1 (which is the default) working again on AIX
without issues, and the "gmake check-chainlint" test also passes.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
275267937b (range-diff: make dual-color the default mode, 2018-08-13)
replaced --dual-color with --no-dual-color but left the option's
summary untouched. Rewrite the summary to describe --no-dual-color
rather than dual-color.
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix an incorrect comment in the new code added in b4da37380b
(unpack-trees: optimize walking same trees with cache-tree -
2018-08-18) and document about the new test variable that is enabled
by default in test-lib.sh in 4592e6080f (cache-tree: verify valid
cache-tree in the test suite - 2018-08-18)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change a comment in chainlint.sed to appease AIX sed, which would
previously print this error:
sed: # stash for later printing is not a recognized function
1. https://public-inbox.org/git/CAPig+cTTbU5HFMKgNyrxTp3+kcK46-Fn=4ZH6zDt1oQChAc3KA@mail.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The seq command is not in POSIX, and doesn't exist on
e.g. OpenBSD. We've had the test_seq wrapper since d17cf5f3a3 ("tests:
Introduce test_seq", 2012-08-04), but use of it keeps coming back,
e.g. in the recently added "fetch negotiator" tests being added here.
So let's also add a check to "make test-lint". The regex is aiming to
capture the likes of $(seq ..) and "seq" as a stand-alone command,
without capturing some existing cases where we e.g. have files called
"seq", as \bseq\b would do.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "head -c BYTES" option is non-portable (not in POSIX[1]). Change
such invocations to use the test_copy_bytes wrapper added in
48860819e8 ("t9300: factor out portable "head -c" replacement",
2016-06-30).
This fixes a test added in 9d2e330b17 ("ewah_read_mmap: bounds-check
mmap reads", 2018-06-14), which has been breaking
t5310-pack-bitmaps.sh on OpenBSD since 2.18.0. The OpenBSD ports
already have a similar workaround after their upgrade to 2.18.0[2].
I have not tested this on IRIX, but according to 4de0bbd898 ("t9300:
use perl "head -c" clone in place of "dd bs=1 count=16000" kluge",
2010-12-13) this invocation would have broken things there too.
Also, change a valgrind-specific codepath in test-lib.sh to use this
wrapper. Given where valgrind runs I don't think this would ever
become a portability issue in practice, but it's easier to just use
the wrapper than introduce some exception for the "make test-lint"
check being added here.
1. http://pubs.opengroup.org/onlinepubs/9699919799/utilities/head.html
2. 08d5d82eae (diff-f7d3c4fabeed1691620d608f1534f5e5)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix typos and convert a question which does not expect to be replied
to a simple advice.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The core.commitGraph config setting was accidentally removed from
the config documentation. In that same patch, the config setting
that writes a commit-graph during garbage collection was incorrectly
written to the doc as "gc.commitGraph" instead of "gc.writeCommitGraph".
Reported-by: Szeder Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The append_signoff() function takes an "int" to specify the
number of bytes to ignore. Most callers just pass 0, and the
remainder use ignore_non_trailer() to skip over cruft.
That function also returns an int, and uses them internally.
On systems where size_t is larger than an int (i.e., most
64-bit systems), dealing with a ridiculously large commit
message could end up overflowing an int, producing
surprising results (e.g., returning a negative offset, which
would cause us to look outside the original string).
Let's consistently use size_t for these offsets through this
whole stack. As a bonus, this makes the meaning of
"ignore_footer" as an offset (and not a boolean) more clear.
But while we're here, let's also document the interface.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the sequencer code appends a signoff or cherry-pick
origin, it uses the default trailer-parsing options, which
treat "---" as the end of the commit message. As a result,
it may be fooled by a commit message that contains that
string and fail to find the existing trailer block. Even
more confusing, the actual append code does not know about
"---", and always appends to the end of the string. This can
lead to bizarre results. E.g., appending a signoff to a
commit message like this:
subject
body
---
these dashes confuse the parser!
Signed-off-by: A
results in output with a final block like:
Signed-off-by: A
Signed-off-by: A
The parser thinks the final line of the message is "body",
and ignores everything else, claiming there are no trailers.
So we output an extra newline separator (wrong) and add a
duplicate signoff (also wrong).
Since we know we are feeding a pure commit message, we can
simply tell the parser to ignore the "---" divider.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In both of these cases we know that we are feeding the
trailer-parsing code a pure commit message. We should tell
it so, which avoids false positives for a commit message
that contains a "---" line.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even with the newly-tightened "---" parser, it's still
possible for a commit message to trigger a false positive if
it contains something like "--- foo". If the caller knows
that it has only a single commit message, it can now tell us
with the "--no-divider" option, eliminating any false
positives.
If we were designing this from scratch, I'd probably make
this the default. But we've advertised the "---" behavior in
the documentation since interpret-trailers has existed.
Since it's meant to be scripted, breaking that would be a
bad idea.
Note that the logic is in the underlying trailer.c code,
which is used elsewhere. The default there will keep the
current behavior, but many callers will benefit from setting
this new option. That's left for future patches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The interpret-trailers command accepts not only raw commit
messages, but it also can manipulate trailers in
format-patch output. That means it must find the "---"
boundary separating the commit message from the patch.
However, it does so by looking for any line starting with
"---", regardless of whether there is further content.
This is overly lax compared to the parsing done in
mailinfo.c's patchbreak(), and may cause false positives
(e.g., t/perf output tables uses dashes; if you cut and
paste them into your commit message, it fools the parser).
We could try to reuse patchbreak() here, but it actually has
several heuristics that are not of interest to us (e.g.,
matching "diff -" without a three-dash separator or even a
CVS "Index:" line). We're not interested in taking in
whatever random cruft people may send, but rather handling
git-formatted patches.
Note that the existing documentation was written in a loose
way, so technically we are changing the behavior from what
it said. But this should implement the original intent in a
more accurate way.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the trailer code has an "opts" struct which is
filled in by the caller. We don't pass it down to
trailer_info_get(), which does the initial parsing, because
there hasn't yet been a need to do so.
Let's start passing it down in preparation for adding new
options. Note that there's a single caller which doesn't
otherwise have such an options struct. Since it's just one
caller (that we'd have to modify anyway), let's not bother
with any special treatment like accepting a NULL options
struct, and just have it allocate one with the defaults.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We store the length of the trailers list in a size_t. So on
a 64-bit system with a 32-bit int, in the unlikely case that
we manage to actually allocate a list with 2^31 entries,
we'd loop forever trying to iterate over it (our "int" would
wrap to negative before exceeding info->trailer_nr).
This probably doesn't matter in practice. Each entry is at
least a pointer plus a non-empty string, so even without
malloc overhead or the memory to hold the original string
we're parsing from, you'd need to allocate tens of
gigabytes. But it's easy enough to do it right.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many of the string-parsing functions inside trailer.c return
integer offsets into the string (e.g., to point to the end
of the trailer block). Several of these use an "int" to
return or store the offsets. On a system where "size_t" is
much larger than "int" (e.g., most 64-bit ones), it's easy
to feed a gigantic commit message that results in a negative
offset. This can result in us reading memory before the
string (if the int is used as an index) or far after (if
it's implicitly cast to a size_t by passing to a strbuf
function).
Let's fix this by using size_t for all string offsets. Note
that several of the functions need ssize_t, since they use
"-1" as a sentinel value. The interactions here can be
pretty subtle. E.g., end_of_title in find_trailer_start()
does not itself need to be signed, but it is compared to the
result of last_line(), which is. That promotes the latter to
unsigned, and the ">=" does not behave as you might expect.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The verbose output of the test 'reword without issues functions as
intended' in 't3423-rebase-reword.sh', added in a9279c6785 (sequencer:
do not squash 'reword' commits when we hit conflicts, 2018-06-19),
contains the following error output:
sed: -e expression #1, char 2: extra characters after command
This error comes from within the 'fake-editor.sh' script created by
'lib-rebase.sh's set_fake_editor() function, and the root cause is the
FAKE_LINES="pick 1 reword 2" variable in the test in question, in
particular the "pick" word. 'fake-editor.sh' assumes 'pick' to be the
default rebase command and doesn't support an explicit 'pick' command
in FAKE_LINES. As a result, 'pick' will be used instead of a line
number when assembling the following 'sed' script:
sed -n picks/^pick/pick/p
which triggers the aforementioned error.
Luckily, this didn't affect the test's correctness: the erroring 'sed'
command doesn't write anything to the todo script, and processing the
rest of FAKE_LINES generates the desired todo script, as if that
'pick' command were not there at all.
The minimal fix would be to remove the 'pick' word from FAKE_LINES,
but that would leave us susceptible to similar issues in the future.
Instead, teach the fake-editor script to recognize an explicit 'pick'
command, which is still a fairly trivial change.
In the future we might want to consider reinforcing this fake editor
script with an &&-chain and stricter parsing of the FAKE_LINES
variable (e.g. to error out when encountering unknown rebase commands
or commands and line numbers in the wrong order).
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prior to 509f6f62a4 (cache: update object ID functions for
the_hash_algo, 2018-07-16), hashcmp() called memcmp() with a
constant size of 20 bytes. Some compilers were able to turn
that into a few quad-word comparisons, which is faster than
actually calling memcmp().
In 509f6f62a4, we started using the_hash_algo->rawsz
instead. Even though this will always be 20, the compiler
doesn't know that while inlining hashcmp() and ends up just
generating a call to memcmp().
Eventually we'll have to deal with multiple hash sizes, but
for the upcoming v2.19, we can restore some of the original
performance by asserting on the size. That gives the
compiler enough information to know that the memcmp will
always be called with a length of 20, and it performs the
same optimization.
Here are numbers for p0001.2 run against linux.git on a few
versions. This is using -O2 with gcc 8.2.0.
Test v2.18.0 v2.19.0-rc0 HEAD
------------------------------------------------------------------------------
0001.2: 34.24(33.81+0.43) 34.83(34.42+0.40) +1.7% 33.90(33.47+0.42) -1.0%
You can see that v2.19 is a little slower than v2.18. This
commit ended up slightly faster than v2.18, but there's a
fair bit of run-to-run noise (the generated code in the two
cases is basically the same). This patch does seem to be
consistently 1-2% faster than v2.19.
I tried changing hashcpy(), which was also touched by
509f6f62a4, in the same way, but couldn't measure any
speedup. Which makes sense, at least for this workload. A
traversal of the whole commit graph requires looking up
every entry of every tree via lookup_object(). That's many
multiples of the numbers of objects in the repository (most
of the lookups just return "yes, we already saw that
object").
[jn: verified using "make object.s" that the memcmp call goes away.]
Reported-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Jeff King <peff@peff.net
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we originally did the series that contains 7ba826290a
(revision: add rev_input_given flag, 2017-08-02) the intent
was that "git rev-list --stdin </dev/null" would similarly
become a successful noop. However, an attempt at the time to
do that did not work[1]. The problem is that rev_input_given
serves two roles:
- it tells rev-list.c that it should not error out
- it tells revision.c that it should not have the "default"
ref kick (e.g., "HEAD" in "git log")
We want to trigger the former, but not the latter. This is
technically possible with a single flag, if we set the flag
only after revision.c's revs->def check. But this introduces
a rather subtle ordering dependency.
Instead, let's keep two flags: one to denote when we got
actual input (which triggers both roles) and one for when we
read stdin (which triggers only the first).
This does mean a caller interested in the first role has to
check both flags, but there's only one such caller. And any
future callers might want to make the distinction anyway
(e.g., if they care less about erroring out, and more about
whether revision.c soaked up our stdin).
In fact, we already keep such a flag internally in
revision.c for this purpose, so this is really just exposing
that to the caller (and the old function-local flag can go
away in favor of our new one).
[1] https://public-inbox.org/git/20170802223416.gwiezhbuxbdmbjzx@sigill.intra.peff.net/
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several tests in 't3420-rebase-autostash.sh' start various rebase
processes that are expected to fail because of merge conflicts. These
tests then run '! grep' to ensure that the autostash feature did its
job, and the dirty contents of a file is gone. However, due to the
test repo's history and the choice of upstream branch that file
shouldn't exist in the conflicted state at all. Consequently, this
'grep' doesn't fail as expected, because it can't find the dirty
content, but it fails because it can't open the file.
Tighten this check by using 'test_path_is_missing' instead, thereby
avoiding unexpected errors from 'grep' as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test 'store updates stash ref and reflog' in 't3903-stash.sh'
creates a stash from a new file, runs 'git reset --hard' to throw away
any modifications to the work tree, and then runs '! grep' to ensure
that the staged contents are gone. Since the file didn't exist
before, it shouldn't exist after 'git reset' either. Consequently,
this 'grep' doesn't fail as expected, because it can't find the staged
content, but it fails because it can't open the file.
Tighten this check by using 'test_path_is_missing' instead, thereby
avoiding an unexpected error from 'grep' as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a recent update in 2.18 era, "git pack-objects" started
producing a larger than necessary packfiles by missing
opportunities to use large deltas.
* nd/pack-deltify-regression-fix:
pack-objects: fix performance issues on packing large deltas
Prior to d3c6751b18 (tests: make use of the test_must_be_empty
function, 2018-07-27), in the test 'rev-list should succeed with empty
output on empty stdin' in 't6018-rev-list-glob' the empty 'expect'
file served dual purpose: besides specifying the expected output, as
usual, it also served as empty input for 'git rev-list --stdin'.
Then d3c6751b18 came along, and, as part of the conversion to
'test_must_be_empty', removed this empty 'expect' file, not realizing
its secondary purpose. Redirecting stdin from the now non-existing
file failed the test, but since this test expects failure in the first
place, this issue went unnoticed.
Redirect 'git rev-list's stdin explicitly from /dev/null to provide
empty input. (Strictly speaking we don't need this redirection,
because the test script's stdin is already redirected from /dev/null
anyway, but I think it's better to be explicit about it.)
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test ' context does not include preceding empty lines' in the
block of tests 'change with long common tail and no context' in
't4051-diff-function-context.sh' tries to read the file
'long_common_tail.diff.diff', but that file doesn't exist as its name
contains one more '.diff' suffixes than necessary.
Despite this error the test still succeeded without checking what it's
supposed to, because this erroneous read is done on the line:
test "$(first_context_line <long_common_tail.diff.diff)" != " "
which means that:
- the command substitution hides the error, so it won't fail the
test, and
- the result of the command substitution is the empty string, which
is, of course, not equal to a single space character, so the
condition is fulfilled, and the test succeeds.
As a minimal fix, fix the name of the file to be read.
In the future we might want to reorganize this test script (1) to use
'test_cmp' instead of 'test's and command substitutions to catch
failing commands and to provide helpful error messages, and (2) to
specify what the expected result actually _is_ instead of what it
isn't.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the test 'checkout with autocrlf=input' in 't0020-crlf.sh', one of
the 'has_cr' checks looks at the non-existing file 'two' instead of
'dir/two'. The test still succeeds, without actually checking what it
was supposed to, because this check is expected to fail anyway.
As a minimal fix, fix the name of the file to be checked.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test '--dry-run with conflicts fixed from a merge' in
't7501-commit.sh', added in 8dc874b2ee (wt-status.c: set commitable
bit if there is a meaningful merge., 2016-02-15), runs the following
unnecessary and downright bogus command substitution:
! $(git merge --no-commit commit-1) &&
I.e. after 'git merge ...' is executed and expectedly fails, the test
attempts to execute its output:
Merging:
80f2ea2 commit 2
virtual commit-1
found 1 common ancestor:
e60d113 Initial commit
Auto-merging test-file
CONFLICT (content): Merge conflict in test-file
Automatic merge failed; fix conflicts and then commit the result.
as a command, which most likely fails, because there is no such
command as "Merging:". Then '!' negates the failed exit status, the
test continues, and eventually succeeds.
Remove this command substitution and use 'test_must_fail' to ensure
that 'git merge' fails.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The author_date_slab is used to store the author date of a commit
when walking with the --author-date flag in rev-list or log. This
was added as an 'unsigned long' in
81c6b38b "log: --author-date-order"
Since 'unsigned long' is ambiguous in its bit-ness across platforms
(64-bit in Linux, 32-bit in Windows, for example), most references
to the author dates in commit.c were converted to timestamp_t in
dddbad72 "timestamp_t: a new data type for timestamps"
However, the slab definition was missed, leading to a mismatch in
the data types in Windows. This would not reveal itself as a bug
unless someone authors a commit after February 2106, but commits
can store anything as their author date.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>