Commit Graph

54794 Commits

Author SHA1 Message Date
Johannes Schindelin
5f7864663b README: add a build badge (status of the Azure Pipelines build)
Just like so many other OSS projects, we now also have a build badge.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-29 09:26:47 -08:00
Johannes Schindelin
72d63b2f45 mingw: be more generous when wrapping up the setitimer() emulation
Every once in a while, the Azure Pipeline fails with some semi-random

	error: timer thread did not terminate timely

This error message means that the thread that is used to emulate the
setitimer() function did not terminate within 1,000 milliseconds.

The most likely explanation (and therefore the one we should assume to
be true, according to Occam's Razor) is that the timeout of one second
is simply not enough because we try to run so many tasks in parallel.

So let's give it ten seconds instead of only one. That should be enough.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-29 09:26:46 -08:00
Johannes Schindelin
6c1f4ae65a ci: use git-sdk-64-minimal build artifact
Instead of a shallow fetch followed by a sparse checkout, we are
better off by using a separate, dedicated Pipeline that bundles
the SDK as a build artifact, and then consuming that build artifact
here.

In fact, since this artifact will be used a lot, we spent substantial
time on figuring out a minimal subset of the Git for Windows SDK, just
enough to build and test Git. The result is a size reduction from around
1GB (compressed) to around 55MB (compressed). This also comes with the
change where we now call `usr\bin\bash.exe` directly, as `git-cmd.exe`
is not included in the minimal SDK.

That reduces the time to initialize Git for Windows' SDK from anywhere
between 2m30s-7m to a little over 1m.

Note: in theory, we could also use the DownloadBuildArtifacts@0 task
here. However, restricted permissions that are in effect when building
from forks would let this fail for PR builds, defeating the whole
purpose of the Azure Pipelines support for git.git.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-29 09:26:46 -08:00
Johannes Schindelin
2e90484eb4 ci: add a Windows job to the Azure Pipelines definition
Previously, we did not have robust support for Windows in our CI
definition, simply because Travis cannot accommodate our needs (even
after Travis added experimental Windows support very recently, it takes
longer than Travis' 50 minute timeout to build Git and run the test
suite on Windows). Instead, we used a hack that started a dedicated
Azure Pipeline from Travis and waited for the output, often timing out
(which is quite fragile, as we found out).

With this commit, we finally have first-class support for Windows in our
CI definition (in the Azure Pipelines one, that is).

Due to our reliance on Unix shell scripting in the test suite, combined
with the challenges on executing such scripts on Windows, the Windows
job currently takes a whopping ~1h20m to complete. Which is *far* longer
than the next-longest job takes (linux-gcc, ~35m).

Now, Azure Pipelines's free tier for open source projects (such as Git)
offers up to 10 concurrent jobs for free, meaning that the overall run
time will be dominated by the slowest job(s).

Therefore, it makes sense to start the Windows job first, to minimize
the time the entire build takes from start to end (which is now pretty
safely the run time of the Windows job).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-29 09:26:46 -08:00
Johannes Schindelin
27be78173d Add a build definition for Azure DevOps
This commit adds an azure-pipelines.yml file which is Azure DevOps'
equivalent to Travis CI's .travis.yml.

The main idea is to replicate the Travis configuration as faithfully as
possible, to make it easy to compare the Azure Pipeline builds to the
Travis ones (spoiler: some parts, especially the macOS jobs, are way
faster in Azure Pileines). Meaning: the number and the order of the jobs
added in this commit faithfully replicates what we have in .travis.yml.

Note: Our .travis.yml configuration has a Windows part that is *not*
replicated in the Azure Pipelines definition. The reason is easy to see:
As Travis cannot support our Windws needs (even with the preliminary
Windows support that was recently added to Travis after waiting for
*years* for that feature, our test suite would simply hit Travis'
timeout every single time).

To make things a bit easier to understand, we refrain from using the
`matrix` feature here because (while it is powerful) it can be a bit
confusing to users who are not familiar with CI setups. Therefore, we
use a separate phase even for similar configurations (such as GCC vs
Clang on Linux, GCC vs Clang on macOS).

Also, we make use of the shiny new feature we just introduced where the
test suite can output JUnit-style .xml files. This information is made
available in a nice UI that allows the viewer to filter by phase and/or
test number, and to see trends such as: number of (failing) tests, time
spent running the test suite, etc. (While this seemingly contradicts the
intention to replicate the Travis configuration as faithfully as
possible, it is just too nice to show off that capability here already.)

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-29 09:26:46 -08:00
Johannes Schindelin
6141a2edc9 ci/lib.sh: add support for Azure Pipelines
This patch introduces a conditional arm that defines some environment
variables and a function that displays the URL given the job id (to
identify previous runs for known-good trees).

Because Azure Pipeline's macOS agents already have git-lfs and gettext
installed, we can leave `BREW_INSTALL_PACKAGES` empty (unlike in
Travis' case).

Note: this patch does not introduce an Azure Pipelines definition yet;
That is left for the next patch.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-29 09:26:46 -08:00
Johannes Schindelin
2223190815 tests: optionally write results as JUnit-style .xml
This will come in handy when publishing the results of Git's test suite
during an automated Azure DevOps run.

Note: we need to make extra sure that invalid UTF-8 encoding is turned
into valid UTF-8 (using the Replacement Character, \uFFFD) because
t9902's trace contains such invalid byte sequences, and the task in the
Azure Pipeline that uploads the test results would refuse to do anything
if it was asked to parse an .xml file with invalid UTF-8 in it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-29 09:26:46 -08:00
Jeff King
c9446f0504 docs/config: clarify "text property" in core.eol
The word "property" is vague here. Let's spell out that we mean the path
must be marked with the text attribute.

While we're here, let's make the paragraph a little easier to read by
de-emphasizing the "when core.autocrlf is false" bit. Putting it in the
first sentence obscures the main content, and many readers won't care
about autocrlf (i.e., anyone who is just following the gitattributes(7)
advice, which mainly discusses "text" and "core.eol").

Helped-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-29 09:21:33 -08:00
Jeff King
2b6808531a doc/gitattributes: clarify "autocrlf overrides eol"
We only override core.eol with core.autocrlf when the latter is set to
something besides "false".  Let's make this more clear, and point the
reader to the git-config definitions, which discuss this in more detail.

Noticed-by: Sergey Lukashev <lukashev.s@ya.ru>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-29 09:21:33 -08:00
Torsten Bögershausen
e62e225ffb test-lint: only use only sed [-n] [-e command] [-f command_file]
From `man sed` (on a Mac OS X box):
The -E, -a and -i options are non-standard FreeBSD extensions and may not be available
on other operating systems.

From `man sed` on a Linux box:
REGULAR EXPRESSIONS
       POSIX.2 BREs should be supported, but they aren't completely because of
       performance problems.  The \n sequence in a regular expression matches the newline
       character,  and  similarly  for \a, \t, and other sequences.
       The -E option switches to using extended regular expressions instead; the -E option
       has been supported for years by GNU sed, and is now included in POSIX.

Well, there are still a lot of systems out there, which don't support it.
Beside that, IEEE Std 1003.1TM-2017, see
http://pubs.opengroup.org/onlinepubs/9699919799/
does not mention -E either.

To be on the safe side, don't allow -E (or -r, which is GNU).
Change check-non-portable-shell.pl to only accept the portable options:
sed [-n] [-e command] [-f command_file]

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-28 11:25:16 -08:00
Patrick Hogg
edb673cf10 pack-objects: merge read_lock and lock in packing_data struct
Rename the packing_data lock to obd_lock and upgrade it to a recursive
mutex to make it suitable for current read_lock usages. Additionally
remove the superfluous #ifndef NO_PTHREADS guard around mutex
initialization in prepare_packing_data as the mutex functions
themselves are already protected.

Signed-off-by: Patrick Hogg <phogg@novamoon.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-28 11:22:12 -08:00
Patrick Hogg
459307b139 pack-objects: move read mutex to packing_data struct
ac77d0c37 ("pack-objects: shrink size field in struct object_entry",
2018-04-14) added an extra usage of read_lock/read_unlock in the newly
introduced oe_get_size_slow for thread safety in parallel calls to
try_delta(). Unfortunately oe_get_size_slow is also used in serial
code, some of which is called before the first invocation of
ll_find_deltas. As such the read mutex is not guaranteed to be
initialized.

Resolve this by moving the read mutex to packing_data and initializing
it in prepare_packing_data which is initialized in cmd_pack_objects.

Signed-off-by: Patrick Hogg <phogg@novamoon.net>
Reviewed-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-28 11:22:06 -08:00
Arti Zirk
2eb14bb2d4 git-instaweb: add Python builtin http.server support
With this patch it is possible to launch git-instaweb by using
Python http.server CGI handler via `-d python` option.

git-instaweb generates a small wrapper around the http.server
(in GIT_DIR/gitweb/) that address a limitation of the CGI handler
where CGI scripts have to be in a cgi-bin subdirectory and
directory index can't be easily changed. To keep the implementation
small, gitweb is running on url `/cgi-bin/gitweb.cgi` and an automatic
redirection is done when opening `/`.

The generated wrapper is compatible with both Python 2 and 3.

Python is by default installed on most modern Linux distributions
which enables running `git instaweb -d python` without needing
anything else.

Signed-off-by: Arti Zirk <arti.zirk@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-28 10:57:44 -08:00
Johannes Schindelin
4419de9164 test-date: add a subcommand to measure times in shell scripts
In the next commit, we want to teach Git's test suite to optionally
output test results in JUnit-style .xml files. These files contain
information about the time spent. So we need a way to measure time.

While we could use `date +%s` for that, this will give us only seconds,
i.e. very coarse-grained timings.

GNU `date` supports `date +%s.%N` (i.e. nanosecond-precision output),
but there is no equivalent in BSD `date` (read: on macOS, we would not
be able to obtain precise timings).

So let's introduce `test-tool date getnanos`, with an optional start
time, that outputs preciser values. Note that this might not actually
give us nanosecond precision on some platforms, but it will give us as
precise information as possible, without the portability issues of shell
commands.

Granted, it is a bit pointless to try measuring times accurately in
shell scripts, certainly to nanosecond precision. But it is better than
second-granularity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-28 10:34:28 -08:00
Johannes Schindelin
4b060a4d97 ci: use a junction on Windows instead of a symlink
Symbolic links are still not quite as easy to use on Windows as on Linux
(for example, on versions older than Windows 10, only administrators can
create symlinks, and on Windows 10 you still need to be in developer
mode for regular users to have permission), but NTFS junctions can give
us a way out.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-28 10:34:28 -08:00
Johannes Schindelin
eaa62291ff ci: inherit --jobs via MAKEFLAGS in run-build-and-tests
Let's not decide in the generic ci/ part how many jobs to run in
parallel; different CI configurations would favor a different number of
parallel jobs, and it is easy enough to hand that information down via
the `MAKEFLAGS` variable.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-28 10:34:28 -08:00
Johannes Schindelin
b011fabd6e ci/lib.sh: encapsulate Travis-specific things
The upcoming patches will allow building git.git via Azure Pipelines
(i.e. Azure DevOps' Continuous Integration), where variable names and
URLs look a bit different than in Travis CI.

Also, the configurations of the available agents are different. For
example, Travis' and Azure Pipelines' macOS agents are set up
differently, so that on Travis, we have to install the git-lfs and
gettext Homebrew packages, and on Azure Pipelines we do not need to.
Likewise, Azure Pipelines' Ubuntu agents already have asciidoctor
installed.

Finally, on Azure Pipelines the natural way is not to base64-encode tar
files of the trash directories of failed tests, but to publish build
artifacts instead. Therefore, that code to log those base64-encoded tar
files is guarded to be Travis-specific.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-28 10:34:28 -08:00
Johannes Schindelin
c2160f2d19 ci: rename the library of common functions
The name is hard-coded to reflect that we use Travis CI for continuous
testing.

In the next commits, we will extend this to be able use Azure DevOps,
too.

So let's adjust the name to make it more generic.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-28 10:34:28 -08:00
Johannes Schindelin
4096a98d79 travis: fix skipping tagged releases
When building a PR, TRAVIS_BRANCH refers to the *target branch*.
Therefore, if a PR targets `master`, and `master` happened to be tagged,
we skipped the build by mistake.

Fix this by using TRAVIS_PULL_REQUEST_BRANCH (i.e. the *source branch*)
when available, falling back to TRAVIS_BRANCH (i.e. for CI builds, also
known as "push builds").

Let's give it a new variable name, too: CI_BRANCH (as it is different
from TRAVIS_BRANCH). This also prepares for the upcoming patches which
will make our ci/* code a bit more independent from Travis and open it
to other CI systems (in particular to Azure Pipelines).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-28 10:34:28 -08:00
Phillip Wood
891d4a0313 implicit interactive rebase: don't run sequence editor
If GIT_SEQUENCE_EDITOR is set then rebase runs it when executing
implicit interactive rebases which are supposed to appear
non-interactive to the user. Fix this by setting GIT_SEQUENCE_EDITOR=:
rather than GIT_EDITOR=:. A couple of tests relied on the old behavior
so they are updated to work with the new regime.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-28 10:25:39 -08:00
SZEDER Gábor
4468d4435c object_as_type: initialize commit-graph-related fields of 'struct commit'
When the commit graph and generation numbers were introduced in
commits 177722b344 (commit: integrate commit graph with commit
parsing, 2018-04-10) and 83073cc994 (commit: add generation number to
struct commit, 2018-04-25), they tried to make sure that the
corresponding 'graph_pos' and 'generation' fields of 'struct commit'
are initialized conservatively, as if the commit were not included in
the commit-graph file.

Alas, initializing those fields only in alloc_commit_node() missed the
case when an object that happens to be a commit is first looked up via
lookup_unknown_object(), and is then later converted to a 'struct
commit' via the object_as_type() helper function (either calling it
directly, or as part of a subsequent lookup_commit() call).
Consequently, both of those fields incorrectly remain set to zero,
which means e.g. that the commit is present in and is the first entry
of the commit-graph file.  This will result in wrong timestamp, parent
and root tree hashes, if such a 'struct commit' instance is later
filled from the commit-graph.

Extract the initialization of 'struct commit's fields from
alloc_commit_node() into a helper function, and call it from
object_as_type() as well, to make sure that it properly initializes
the two commit-graph-related fields, too.  With this helper function
it is hopefully less likely that any new fields added to 'struct
commit' in the future would remain uninitialized.

With this change alloc_commit_index() won't have any remaining callers
outside of 'alloc.c', so mark it as static.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-27 16:55:57 -08:00
SZEDER Gábor
28c23cd4c3 strbuf.cocci: suggest strbuf_addbuf() to add one strbuf to an other
The best way to add one strbuf to an other is via:

  strbuf_addbuf(&sb, &sb2);

This is a bit more idiomatic and efficient than:

  strbuf_addstr(&sb, sb2.buf);

because the size of the second strbuf is known and thus it can spare a
strlen() call, and much more so than:

  strbuf_addf(&sb, "%s", sb2.buf);

because it can spare the whole vsnprintf() formatting magic.

Add new semantic patches to 'contrib/coccinelle/strbuf.cocci' to catch
these undesired patterns and to suggest strbuf_addbuf() instead.

Luckily, our codebase is already clean from any such undesired
patterns (but one of the in-flight topics just tried to sneak in such
a strbuf_addf() call).

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-27 16:21:09 -08:00
Nguyễn Thái Ngọc Duy
dc40b24df4 fetch: prefer suffix substitution in compact fetch.output
I have a remote named "jch" and it has a branch with the same name. And
fetch.output is set to "compact". Fetching this remote looks like this

 From https://github.com/gitster/git
  + eb7fd39f6b...835363af2f jch                -> */jch  (forced update)
    6f11fd5edb..59b12ae96a  nd/config-move-to  -> jch/*
  * [new branch]            nd/diff-parseopt   -> jch/*
  * [new branch]            nd/the-index-final -> jch/*

Notice that the local side of branch jch starts with "*" instead of
ending with it like the rest. It's not exactly wrong. It just looks
weird.

This patch changes the find-and-replace code a bit to try finding prefix
first before falling back to strstr() which finds a substring from left
to right. Now we have something less OCD

 From https://github.com/gitster/git
  + eb7fd39f6b...835363af2f jch                -> jch/*  (forced update)
    6f11fd5edb..59b12ae96a  nd/config-move-to  -> jch/*
  * [new branch]            nd/diff-parseopt   -> jch/*
  * [new branch]            nd/the-index-final -> jch/*

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-27 16:20:34 -08:00
Jeff King
55ad152cfb convert: drop path parameter from actual conversion functions
The caller is responsible for looking up the attributes,
after which point we no longer care about the path at which
the content is found.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-24 12:35:45 -08:00
Jeff King
129beeee9a convert: drop len parameter from conversion checks
We've already extracted the useful information into our text_stat
struct, so the length is no longer needed.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-24 12:35:45 -08:00
Jeff King
a263ea84d1 config: drop unused parameter from maybe_remove_section()
We don't need the contents buffer to drop a section; the parse
information in the config_store_data parameter is enough for our logic.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-24 12:35:44 -08:00
Jeff King
3d42034a18 show_date_relative(): drop unused "tz" parameter
The timestamp we receive is in epoch time, so there's no need for a
timezone parameter to interpret it. The matching show_date() uses "tz"
to show dates in author local time, but relative dates show only the
absolute time difference. The author's location is irrelevant, barring
relativistic effects from using Git close to the speed of light.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-24 12:35:44 -08:00
Jeff King
acbf33f846 column: drop unused "opts" parameter in item_length()
There are no column options which impact the length computation. In
theory there might be, but this is a file-local function, so it will be
trivial to re-add the parameter should it ever be useful.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-24 12:35:44 -08:00
Jeff King
fcb133e978 create_bundle(): drop unused "header" parameter
There's no need to pass a header struct to create_bundle(); it writes
the header information directly to a descriptor (and does not report
back details to the caller).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-24 12:35:44 -08:00
Jeff King
74f8a9c954 apply: drop unused "def" parameter from find_name_gnu()
We use the "def" parameter in find_name_common() for some heuristics,
but they are not necessary with the less-ambiguous GNU format. Let's
drop this unused parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-24 12:35:44 -08:00
Jeff King
667f71c36c match-trees: drop unused path parameter from score functions
The scores do not take the particular path into account at
all. It's possible they could, but these are all static file-local
functions. It won't be a big deal to re-add the parameter if they ever
need it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-24 12:35:44 -08:00
Jeff King
dac03b5518 combine-diff: treat --dirstat like --stat
Currently "--cc --dirstat" will show nothing for a merge.  Like
--shortstat and --summary in the previous two patches, it probably makes
sense to treat it like we do --stat, and show a stat against the
first-parent.

This case is less obviously correct than for --shortstat and --summary,
as those are basically variants of --stat themselves. It's possible we
could develop a multi-parent combined dirstat format, in which case we
might regret defining this first-parent behavior. But the same could be
said for --stat, and in the 12+ years of it showing first-parent stats,
nobody has complained.

So showing the first-parent dirstat is at least _useful_, and if we
later develop a clever multi-parent stat format, we'd probably have to
deal with --stat anyway.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-24 12:18:53 -08:00
Jeff King
04b19fcafd combine-diff: treat --summary like --stat
Currently "--cc --summary" on a merge shows nothing. Since we show "--cc
--stat" as a stat against the first parent, and because --summary is
typically used in combination with --stat, it makes sense to treat them
both the same way.

Note that we have to tweak t4013's setup a bit to test this case, as the
existing merges do not have any --summary results against their first
parent. But since the merge at the tip of 'master' does add and remove
files with respect to the second parent, we can just make a reversed
doppelganger merge where the parents are swapped.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-24 12:18:53 -08:00
Jeff King
8290faa077 combine-diff: treat --shortstat like --stat
The --stat of a combined diff is defined as the first-parent stat,
going all the way back to 965f803c32 (combine-diff: show diffstat with
the first parent., 2006-04-17).

Naturally, we gave --numstat the same treatment in 74e2abe5b7 (diff
--numstat, 2006-10-12).

But --shortstat, which is really just the final line of --stat, does
nothing, which produces confusing results:

  $ git show --oneline --stat eab7584e37
  eab7584e37 Merge branch 'en/show-ref-doc-fix'

   Documentation/git-show-ref.txt | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

  $ git show --oneline --shortstat eab7584e37
  eab7584e37 Merge branch 'en/show-ref-doc-fix'

  [nothing! We'd expect to see the "1 file changed..." line]

This patch teaches combine-diff to treats the two formats identically.

Reported-by: David Turner <novalis@novalis.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-24 12:18:53 -08:00
Jeff King
8817f0cc67 combine-diff: factor out stat-format mask
There are several conditionals in the combine diff code that check if
we're doing --stat or --numstat output. Since these must all remain in
sync, let's pull them out into a separate bit-mask.

Arguably this could go into diff.h along with the other DIFF_FORMAT
macros, but it's not clear that the definition of "which formats are
stat" is universal (e.g., does --dirstat count? --summary?). So let's
keep this local to combine-diff.c, where the meaning is more clearly
"stat formats that combine-diff supports".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-24 12:18:52 -08:00
Jeff King
48edf3a02a diff: clear emitted_symbols flag after use
There's an odd bug when "log --color-moved" is used with the combination
of "--cc --stat -p": the stat for merge commits is erroneously shown
with the diff of the _next_ commit.

The included test demonstrates the issue. Our history looks something
like this:

  A-B-M--D
   \ /
    C

When we run "git log --cc --stat -p --color-moved" starting at D, we get
this sequence of events:

  1. The diff for D is using -p, so diff_flush() calls into
     diff_flush_patch_all_file_pairs(). There we see that o->color_moved
     is in effect, so we point o->emitted_symbols to a static local
     struct, causing diff_flush_patch() to queue the symbols instead of
     actually writing them out.

     We then do our move detection, emit the symbols, and clear the
     struct. But we leave o->emitted_symbols pointing to our struct.

  2. Next we compute the diff for M. This is a merge, so we use the
     combined diff code. In find_paths_generic(), we compute the
     pairwise diff between each commit and its parent. Normally this is
     done with DIFF_FORMAT_NO_OUTPUT, since we're just looking for
     intersecting paths. But since "--stat --cc" shows the first-parent
     stat, and since we're computing that diff anyway, we enable
     DIFF_FORMAT_DIFFSTAT for the first parent. This outputs the stat
     information immediately, saving us from running a separate
     first-parent diff later.

     But where does that output go? Normally it goes directly to stdout,
     but because o->emitted_symbols is set, we queue it. As a result, we
     don't actually print the diffstat for the merge commit (yet), which
     is wrong.

  3. Next we compute the diff for C. We're actually showing a patch
     again, so we end up in diff_flush_patch_all_file_pairs(), but this
     time we have the queued stat from step 2 waiting in our struct.

     We add new elements to it for C's diff, and then flush the whole
     thing. And we see the diffstat from M as part of C's diff, which is
     wrong.

So triggering the bug really does require the combination of all of
those options.

To fix it, we can simply restore o->emitted_symbols to NULL after
flushing it, so that it does not affect anything outside of
diff_flush_patch_all_file_pairs(). This intuitively makes sense, since
nobody outside of that function is going to bother flushing it, so we
would not want them to write to it either.

In fact, we could take this a step further and turn the local "esm"
struct into a non-static variable that goes away after the function
ends. However, since it contains a dynamically sized array, we benefit
from amortizing the cost of allocations over many calls. So we'll leave
it as static to retain that benefit.

But let's push the zero-ing of esm.nr into the conditional for "if
(o->emitted_symbols)" to make it clear that we do not expect esm to hold
any values if we did not just try to use it. With the code as it is
written now, if we did encounter such a case (which I think would be a
bug), we'd silently leak those values without even bothering to display
them. With this change, we'd at least eventually show them, and somebody
would notice.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-24 11:59:07 -08:00
Jeff King
426fd36f4a t4006: resurrect commented-out tests
This set of tests was added by 4434e6ba6c (tests: check --[short]stat
output after chmod, 2012-05-01), and is primarily about the handling of
binary versus text files.

Later, 74faaa16f0 (Fix "git diff --stat" for interesting - but empty -
file changes, 2012-10-17) changed the stat output so that the empty text
file is mentioned rather than omitted. That commit just comments out
these tests. There's no discussion in the commit message, but the
original email[1] says:

  NOTE! This does break two of our tests, so we clearly did this on
  purpose, or at least tested for it. I just uncommented the subtests
  that this makes irrelevant, and changed the output of another one.

I don't think they're irrelevant, though. We should be testing this
"mode change only" case and making sure that it has the post-74faaa16f0
behavior. So this commit brings back those tests, with the current
expected output.

[1] https://public-inbox.org/git/CA+55aFz88GPJcfMSqiyY+u0Cdm48bEyrsTGxHVJbGsYsDg=Q5w@mail.gmail.com/

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-24 11:58:38 -08:00
Nguyễn Thái Ngọc Duy
f8adbec9fe cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch
By default, index compat macros are off from now on, because they
could hide the_index dependency.

Only those in builtin can use it.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-24 11:55:06 -08:00
Ben Peart
8424bfd45b checkout: fix regression in checkout -b on intitial checkout
When doing a 'checkout -b' do a full checkout including updating the working
tree when doing the initial checkout. As the new test involves an filesystem
access, do it later in the sequence to give chance to other cheaper tests to
leave early. This fixes the regression in behavior caused by fa655d8411
(checkout: optimize "git checkout -b <new_branch>", 2018-08-16).

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-23 13:22:48 -08:00
Ben Peart
91e3d7ca9b checkout: add test demonstrating regression with checkout -b on initial commit
Commit fa655d8411 (checkout: optimize "git checkout -b <new_branch>",
2018-08-16) introduced an unintentional change in behavior for 'checkout -b'
after doing 'clone --no-checkout'.  Add a test to demonstrate the changed
behavior to be used in a later patch to verify the fix.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-23 13:22:47 -08:00
Ævar Arnfjörð Bjarmason
49bbc57a57 commit-graph write: emit a percentage for all progress
Follow-up 01ca387774 ("commit-graph: split up close_reachable()
progress output", 2018-11-19) by making the progress bars in
close_reachable() report a completion percentage. This fixes the last
occurrence where in the commit graph writing where we didn't report
that.

The change in 01ca387774 split up the 1x progress bar in
close_reachable() into 3x, but left them as dumb counters without a
percentage completion. Fixing that is easy, and the only reason it
wasn't done already is because that commit was rushed in during the
v2.20.0 RC period to fix the unrelated issue of over-reporting commit
numbers. See [1] and follow-ups for ML activity at the time and [2]
for an alternative approach where the progress bars weren't split up.

Now for e.g. linux.git we'll emit:

    $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
    Finding commits for commit graph among packed objects: 100% (6529159/6529159), done.
    Expanding reachable commits in commit graph: 100% (815990/815980), done.
    Computing commit graph generation numbers: 100% (815983/815983), done.
    Writing out commit graph in 4 passes: 100% (3263932/3263932), done.

1. https://public-inbox.org/git/20181119202300.18670-1-avarab@gmail.com/
2. https://public-inbox.org/git/20181122153922.16912-11-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-23 13:14:08 -08:00
Ævar Arnfjörð Bjarmason
890226ccb5 commit-graph write: add itermediate progress
Add progress output to sections of code between "Annotating[...]" and
"Computing[...]generation numbers". This can collectively take 5-10
seconds on a large enough repository.

On a test repository with I have with ~7 million commits and ~50
million objects we'll now emit:

    $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
    Finding commits for commit graph among packed objects: 100% (124763727/124763727), done.
    Loading known commits in commit graph: 100% (18989461/18989461), done.
    Expanding reachable commits in commit graph: 100% (18989507/18989461), done.
    Clearing commit marks in commit graph: 100% (18989507/18989507), done.
    Counting distinct commits in commit graph: 100% (18989507/18989507), done.
    Finding extra edges in commit graph: 100% (18989507/18989507), done.
    Computing commit graph generation numbers: 100% (7250302/7250302), done.
    Writing out commit graph in 4 passes: 100% (29001208/29001208), done.

Whereas on a medium-sized repository such as linux.git these new
progress bars won't have time to kick in and as before and we'll still
emit output like:

    $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
    Finding commits for commit graph among packed objects: 100% (6529159/6529159), done.
    Expanding reachable commits in commit graph: 815990, done.
    Computing commit graph generation numbers: 100% (815983/815983), done.
    Writing out commit graph in 4 passes: 100% (3263932/3263932), done.

The "Counting distinct commits in commit graph" phase will spend most
of its time paused at "0/*" as we QSORT(...) the list. That's not
optimal, but at least we don't seem to be stalling anymore most of the
time.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-23 13:14:08 -08:00
Ævar Arnfjörð Bjarmason
e59c615e3c commit-graph write: remove empty line for readability
Remove the empty line between a QSORT(...) and the subsequent oideq()
for-loop. This makes it clearer that the QSORT(...) is being done so
that we can run the oideq() loop on adjacent OIDs. Amends code added
in 08fd81c9b6 ("commit-graph: implement write_commit_graph()",
2018-04-02).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-23 13:14:08 -08:00
Ævar Arnfjörð Bjarmason
7c7b8a7fc7 commit-graph write: add more descriptive progress output
Make the progress output shown when we're searching for commits to
include in the graph more descriptive. This amends code I added in
7b0f229222 ("commit-graph write: add progress output", 2018-09-17).

Now, on linux.git, we'll emit this sort of output in the various modes
we support:

    $ git commit-graph write
    Finding commits for commit graph among packed objects: 100% (6529159/6529159), done.
    [...]

    # Actually we don't emit this since this takes almost no time at
    # all. But if we did (s/_delayed//) we'd show:
    $ git for-each-ref --format='%(objectname)' | git commit-graph write --stdin-commits
    Finding commits for commit graph from 630 refs: 100% (630/630), done.
    [...]

    $ (cd .git/objects/pack/ && ls *idx) | git commit-graph write --stdin-pack
    Finding commits for commit graph in 3 packs: 6529159, done.
    [...]

The middle on of those is going to be the output users might see in
practice, since it'll be emitted when they get the commit graph via
gc.writeCommitGraph=true. But as noted above you need a really large
number of refs for this message to show. It'll show up on a test
repository I have with ~165k refs:

    Finding commits for commit graph from 165203 refs: 100% (165203/165203), done.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-23 13:14:08 -08:00
Ævar Arnfjörð Bjarmason
d9b1b309cf commit-graph write: show progress for object search
Show the percentage progress for the "Finding commits for commit
graph" phase for the common case where we're operating on all packs in
the repository, as "commit-graph write" or "gc" will do.

Before we'd emit on e.g. linux.git with "commit-graph write":

    Finding commits for commit graph: 6529159, done.
    [...]

And now:

    Finding commits for commit graph: 100% (6529159/6529159), done.
    [...]

Since the commit graph only includes those commits that are packed
(via for_each_packed_object(...)) the approximate_object_count()
returns the actual number of objects we're going to process.

Still, it is possible due to a race with "gc" or another process
maintaining packs that the number of objects we're going to process is
lower than what approximate_object_count() reported. In that case we
don't want to stop the progress bar short of 100%. So let's make sure
it snaps to 100% at the end.

The inverse case is also possible and more likely. I.e. that a new
pack has been added between approximate_object_count() and
for_each_packed_object(). In that case the percentage will go beyond
100%, and we'll do nothing to snap it back to 100% at the end.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-23 13:14:08 -08:00
Ævar Arnfjörð Bjarmason
289447397c commit-graph write: more descriptive "writing out" output
Make the "Writing out" part of the progress output more
descriptive. Depending on the shape of the graph we either make 3 or 4
passes over it.

Let's present this information to the user in case they're wondering
what this number, which is much larger than their number of commits,
has to do with writing out the commit graph. Now e.g. on linux.git we
emit:

    $ ~/g/git/git --exec-path=$HOME/g/git -C ~/g/linux commit-graph write
    Finding commits for commit graph: 6529159, done.
    Expanding reachable commits in commit graph: 815990, done.
    Computing commit graph generation numbers: 100% (815983/815983), done.
    Writing out commit graph in 4 passes: 100% (3263932/3263932), done.

A note on i18n: Why are we using the Q_() function and passing a
number & English text for a singular which'll never be used? Because
the plural rules of translated languages may not match those of
English, and to use the plural function we need to use this format.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-23 13:14:08 -08:00
Ævar Arnfjörð Bjarmason
53035c4f0b commit-graph write: add "Writing out" progress output
Add progress output to be shown when we're writing out the
commit-graph, this adds to the output already added in 7b0f229222
("commit-graph write: add progress output", 2018-09-17).

As noted in that commit most of the progress output isn't displayed on
small repositories, but before this change we'd noticeably hang for
2-3 seconds at the end on medium sized repositories such as linux.git.

Now we'll instead show output like this, and reduce the
human-observable times at which we're not producing progress output:

    $ ~/g/git/git --exec-path=$HOME/g/git -C ~/g/2015-04-03-1M-git commit-graph write
    Finding commits for commit graph: 13064614, done.
    Expanding reachable commits in commit graph: 1000447, done.
    Computing commit graph generation numbers: 100% (1000447/1000447), done.
    Writing out commit graph: 100% (3001341/3001341), done.

This "Writing out" number is 3x or 4x the number of commits, depending
on the graph we're processing. A later change will make this explicit
to the user.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-23 13:14:08 -08:00
SZEDER Gábor
857ba928a4 commit-graph: don't call write_graph_chunk_extra_edges() unnecessarily
The optional 'Extra Edge List' chunk of the commit graph file stores
parent information for commits with more than two parents.  Since the
chunk is optional, write_commit_graph() looks through all commits to
find those with more than two parents, and then writes the commit
graph file header accordingly, i.e. if there are no such commits, then
there won't be a 'Extra Edge List' chunk written, only the three
mandatory chunks.

However, when it later comes to writing actual chunk data,
write_commit_graph() unconditionally invokes
write_graph_chunk_extra_edges(), even when it was decided earlier that
that chunk won't be written.  Strictly speaking there is no bug here,
because write_graph_chunk_extra_edges() won't write anything if it
doesn't find any commits with more than two parents, but then it
unnecessarily and in vain looks through all commits once again in
search for such commits.

Don't call write_graph_chunk_extra_edges() when that chunk won't be
written to spare an unnecessary iteration over all commits.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-23 13:12:56 -08:00
Jean-Noël Avila
ba170517be doc: tidy asciidoc style
This mainly refers to enforcing indentation on additional lines of
items of lists.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-23 11:37:29 -08:00
Stephen P. Smith
038a878810 Add 'human' date format documentation
Display date and time information in a format similar to how people
write dates in other contexts. If the year isn't specified then, the
reader infers the date is given is in the current year.

By not displaying the redundant information, the reader concentrates
on the information that is different. The patch reports relative dates
based on information inferred from the date on the machine running the
git command at the time the command is executed.

While the format is more useful to humans by dropping inferred
information, there is nothing that makes it actually human. If the
'relative' date format wasn't already implemented then using
'relative' would have been appropriate.

Signed-off-by: Stephen P. Smith <ischis2@cox.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-22 14:16:17 -08:00