Commit Graph

55449 Commits

Author SHA1 Message Date
SZEDER Gábor
545dc345eb progress: break too long progress bar lines
Some of the recently added progress indicators have quite long titles,
which might be even longer when translated to some languages, and when
they are shown while operating on bigger repositories, then the
progress bar grows longer than the default 80 column terminal width.

When the progress bar exceeds the width of the terminal it gets
line-wrapped, and after that the CR at the end doesn't return to the
beginning of the progress bar, but to the first column of its last
line.  Consequently, the first line of the previously shown progress
bar is not overwritten by the next, and we end up with a bunch of
truncated progress bar lines scrolling past:

  $ LANG=es_ES.UTF-8 git commit-graph write
  Encontrando commits para commit graph entre los objetos empaquetados:   2% (1599
  Encontrando commits para commit graph entre los objetos empaquetados:   3% (1975
  Encontrando commits para commit graph entre los objetos empaquetados:   4% (2633
  Encontrando commits para commit graph entre los objetos empaquetados:   5% (3292
  [...]

Prevent this by breaking progress bars after the title once they
exceed the width of the terminal, so the counter and optional
percentage and throughput, i.e. all changing parts, are on the last
line.  Subsequent updates will from then on only refresh the changing
parts, but not the title, and it will look like this:

  $ LANG=es_ES.UTF-8 ~/src/git/git commit-graph write
  Encontrando commits para commit graph entre los objetos empaquetados:
    100% (6584502/6584502), listo.
  Calculando números de generación de commit graph: 100% (824705/824705), listo.
  Escribiendo commit graph en 4 pasos: 100% (3298820/3298820), listo.

Note that the number of columns in the terminal is cached by
term_columns(), so this might not kick in when it should when a
terminal window is resized while the operation is running.
Furthermore, this change won't help if the terminal is so narrow that
the counters don't fit on one line, but I would put this in the "If it
hurts, don't do it" box.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-15 12:11:25 +09:00
SZEDER Gábor
9f1fd84e15 progress: clear previous progress update dynamically
When the progress bar includes throughput, its length can shorten as
the unit of display changes from KiB to MiB.  To cover up remnants of
the previous progress bar when such a change of units happens we
always print three spaces at the end of the progress bar.

Alas, covering only three characters is not quite enough: when both
the total and the throughput happen to change units from KiB to MiB in
the same update, then the progress bar's length is shortened by four
characters (or maybe even more!):

  Receiving objects:  25% (2901/11603), 772.01 KiB | 733.00 KiB/s
  Receiving objects:  27% (3133/11603), 1.43 MiB | 1.16 MiB/s   s

and a stray 's' is left behind.

So instead of hard-coding the three characters to cover, let's compare
the length of the current progress bar with the previous one, and
cover up as many characters as needed.

Sure, it would be much simpler to just print more spaces at the end of
the progress bar, but this approach is more future-proof, and it won't
print extra spaces when none are needed, notably when the progress bar
doesn't show throughput and thus never shrinks.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-15 12:11:25 +09:00
Johannes Schindelin
b02e7d5d70 tests: disallow the use of abbreviated options (by default)
Git's command-line parsers support uniquely abbreviated options, e.g.
`git init --ba` would automatically expand `--ba` to `--bare`.

This is a very convenient feature in every day life for Git users, in
particular when tab completion is not available.

However, it is not a good idea to rely on that in Git's test suite, as
something that is a unique abbreviation of a command line option today
might no longer be a unique abbreviation tomorrow.

For example, if a future contribution added a new mode
`git init --babyproofing` and a previously-introduced test case used the
fact that `git init --ba` expanded to `git init --bare`, that future
contribution would now have to touch seemingly unrelated tests just to
keep the test suite from failing.

So let's disallow abbreviated options in the test suite by default.

Note: for ease of implementation, this patch really only touches the
`parse-options` machinery: more and more hand-rolled option parsers are
converted to use that internal API, and more and more scripts are
converted to built-ins (naturally using the parse-options API, too), so
in practice this catches most issues, and is definitely the biggest bang
for the buck.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-15 11:54:04 +09:00
Jeff King
999b951b28 progress: use xmalloc/xcalloc
Since the early days of Git, the progress code allocates its struct with
a bare malloc(), not xmalloc(). If the allocation fails, we just avoid
showing progress at all.

While perhaps a noble goal not to fail the whole operation because of
optional progress, in practice:

  1. Any failure to allocate a few dozen bytes here means critical path
     allocations are likely to fail, too.

  2. These days we use a strbuf for throughput progress (and there's a
     patch under discussion to do the same for non-throughput cases,
     too). And that uses xmalloc() under the hood, which means we'd
     still die on some allocation failures.

Let's switch to xmalloc(). That makes us consistent with the rest of Git
and makes it easier to audit for other (less careful) bare mallocs.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-12 13:34:17 +09:00
Jeff King
36c8319724 xdiff: use xmalloc/xrealloc
Most of xdiff uses a bare malloc() to allocate memory, and returns an
error when we get NULL. However, there are a few spots which don't check
the return value and may segfault, including at least xdl_merge() and
xpatience.c's find_longest_common_sequence().

Let's use xmalloc() everywhere instead, so that we get a graceful die()
for these cases, without having to do further auditing. This does mean
the existing cases which check errors will now die() instead of
returning an error up the stack. But:

  - that's how the rest of Git behaves already for malloc errors

  - all of the callers of xdi_diff(), etc, die upon seeing an error

So while we might one day want to fully lib-ify the diff code and make
it possible to use as part of a long-running process, we're not close to
that now. And because we're just tweaking the xdl_malloc() macro here,
we're not really moving ourselves any further away from that. We
could, for example, simplify some of the functions which handle malloc()
errors which can no longer occur. But that would probably be taking us
in the wrong direction.

This also makes our malloc handling more consistent with the rest of
Git, including enforcing GIT_ALLOC_LIMIT and trying to reclaim pack
memory when needed.

Reported-by: 王健强 <jianqiang.wang@securitygossip.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-12 13:34:17 +09:00
Jeff King
b46054b374 xdiff: use git-compat-util
Since the xdiff library was not originally part of Git, it does its own
system includes. Let's instead use git-compat-util, which has two
benefits:

  1. It adjusts for any system-specific quirks in how or what we should
     include (though xdiff's needs are light enough that this hasn't
     been a problem in the past).

  2. It lets us use wrapper functions like xmalloc().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-12 13:34:17 +09:00
Jeff King
73a5faf017 test-prio-queue: use xmalloc
test-prio-queue.c doesn't check the return value of malloc, and could
segfault.

It's unlikely for this to matter in practice; it's a small allocation,
and this code isn't even installed alongside the rest of Git. But let's
use xmalloc(), which makes auditing for other accidental uses of bare
malloc() easier.

Reported-by: 王健强 <jianqiang.wang@securitygossip.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-12 13:34:17 +09:00
Kyle Meyer
d8083e4180 t3000 (ls-files -o): widen description to reflect current tests
Remove the mention of symlinks from the test description because
several tests that are not related to symlinks have been added since
this file was introduced long ago.

Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-12 10:50:04 +09:00
Johannes Schindelin
3a7b45a623 untracked cache: fix off-by-one
In f9e6c64958 (untracked cache: load from UNTR index extension,
2015-03-08), code was added to read back the untracked cache from an
index extension.

Probably in the endeavor to avoid the `calloc()` implied by
`FLEX_ALLOC_STR()` (it is hard to know why exactly, the commit message
of that commit is a bit parsimonious with information), it calls
`malloc()` manually and then `memcpy()`s the bits and pieces into place.

It allocates the size of `struct untracked_cache_dir` plus the string
length of the untracked file name, then copies the information in two
steps: first the fixed-size metadata, then the name. And here lies the
rub: it includes the trailing NUL byte in the name.

If `FLEX_ARRAY` is defined as 0, this results in a buffer overrun.

To fix this, let's just add 1, for the trailing NUL byte. Technically,
this overallocates on platforms where `FLEX_ARRAY` is 1, but it should
not matter much in reality, as `malloc()` usually overallocates anyway,
unless the size to allocate aligns exactly with some internal chunk size
(see below for more on that).

The real strange thing is that neither valgrind nor DrMemory catches
this bug. In this developer's tests, a `memcpy()` (but not a
`memset()`!) could write up to 4 bytes after the allocated memory range
before valgrind would start reporting an issue.

However, when running Git built with nedmalloc as allocator, under rare
conditions (and inconsistently at that), this bug triggered an `abort()`
because nedmalloc rounds up the size to be `malloc()`ed to a multiple of
a certain chunk size, then adds a few bytes to be used for storing some
internal state. If there is no rounding up to do (because the size is
already a multiple of that chunk size), and if the buffer is overrun as
in the code patched in this commit, the internal state is corrupted.

The scenario that triggered this here bug fix entailed a git.git
checkout with an extra copy of the source code in an untracked
subdirectory, meaning that there was an untracked subdirectory called
"thunderbird-patch-inline" whose name's length is exactly 24 bytes,
which, added to the size of above-mentioned `struct untracked_cache_dir`
that weighs in with 104 bytes on a 64-bit system, amounts to 128,
aligning perfectly with nedmalloc's chunk size.

As there is no obvious way to trigger this bug reliably, on all
platforms supported by Git, and as the bug is obvious enough, this patch
comes without a regression test.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-12 10:49:40 +09:00
Denton Liu
b57e8119e6 submodule: teach set-branch subcommand
This teaches git-submodule the set-branch subcommand which allows the
branch of a submodule to be set through a porcelain command without
having to manually manipulate the .gitmodules file.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-10 12:07:16 +09:00
Todd Zullinger
f64a21bd82 Documentation/git-show-branch: avoid literal {apostrophe}
The {apostrophe} was needed at the time of a521845800 ("Documentation:
remove stray backslash in show-branch discussion", 2010-08-20).  All
other uses of {apostrophe} were removed in 6cf378f0cb ("docs: stop using
asciidoc no-inline-literal", 2012-04-26).

Unfortunately, the {apostrophe} is rendered literally with Asciidoctor
(at least with 1.5.5-2.0.3).  Avoid this by using single-quotes.

Escaping the leading single-quote allows the content to render properly
in AsciiDoc and Asciidoctor.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-10 12:05:03 +09:00
Junio C Hamano
e35b8cb8e2 The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-10 02:19:09 +09:00
Junio C Hamano
32414ceb85 Merge branch 'jt/submodule-fetch-errmsg'
Error message update.

* jt/submodule-fetch-errmsg:
  submodule: explain first attempt failure clearly
2019-04-10 02:14:26 +09:00
Junio C Hamano
c063a537be Merge branch 'jk/sha1dc'
Build update for SHA-1 with collision detection.

* jk/sha1dc:
  Makefile: fix unaligned loads in sha1dc with UBSan
2019-04-10 02:14:26 +09:00
Junio C Hamano
2d3372883f Merge branch 'jk/promote-ggg'
Suggest GitGitGadget instead of submitGit as a way to submit
patches based on GitHub PR to us.

* jk/promote-ggg:
  point pull requesters to GitGitGadget
2019-04-10 02:14:25 +09:00
Junio C Hamano
7caf4cfb81 Merge branch 'ar/t4150-remove-cruft'
Test cleanup.

* ar/t4150-remove-cruft:
  t4150: remove unused variable
2019-04-10 02:14:25 +09:00
Junio C Hamano
fa1b86e457 Merge branch 'js/rebase-deprecate-preserve-merges'
"git rebase --rebase-merges" replaces its old "--preserve-merges"
option; the latter is now marked as deprecated.

* js/rebase-deprecate-preserve-merges:
  rebase: deprecate --preserve-merges
2019-04-10 02:14:24 +09:00
Junio C Hamano
20fe798b1b Merge branch 'ms/worktree-add-atomic-mkdir'
"git worktree add" used to do a "find an available name with stat
and then mkdir", which is race-prone.  This has been fixed by using
mkdir and reacting to EEXIST in a loop.

* ms/worktree-add-atomic-mkdir:
  worktree: fix worktree add race
2019-04-10 02:14:24 +09:00
Junio C Hamano
31df2c1019 Merge branch 'jk/line-log-with-patch'
"git log -L<from>,<to>:<path>" with "-s" did not suppress the patch
output as it should.  This has been corrected.

* jk/line-log-with-patch:
  line-log: detect unsupported formats
  line-log: suppress diff output with "-s"
2019-04-10 02:14:23 +09:00
Junio C Hamano
ac9e40e8ef Merge branch 'ra/t3600-test-path-funcs'
A GSoC micro.

* ra/t3600-test-path-funcs:
  t3600: use helpers to replace test -d/f/e/s <path>
  t3600: modernize style
  test functions: add function `test_file_not_empty`
2019-04-10 02:14:23 +09:00
Junio C Hamano
917f2cd1c2 Merge branch 'nd/rewritten-ref-is-per-worktree'
"git rebase" uses the refs/rewritten/ hierarchy to store its
intermediate states, which inherently makes the hierarchy per
worktree, but it didn't quite work well.

* nd/rewritten-ref-is-per-worktree:
  Make sure refs/rewritten/ is per-worktree
  files-backend.c: reduce duplication in add_per_worktree_entries_to_dir()
  files-backend.c: factor out per-worktree code in loose_fill_ref_dir()
2019-04-10 02:14:23 +09:00
Junio C Hamano
1828e52efc Merge branch 'jh/resize-convert-scratch-buffer'
When the "clean" filter can reduce the size of a huge file in the
working tree down to a small "token" (a la Git LFS), there is no
point in allocating a huge scratch area upfront, but the buffer is
sized based on the original file size.  The convert mechanism now
allocates very minimum and reallocates as it receives the output
from the clean filter process.

* jh/resize-convert-scratch-buffer:
  convert: avoid malloc of original file size
2019-04-10 02:14:22 +09:00
Junio C Hamano
b582c1681e Merge branch 'dl/ignore-docs'
Doc update.

* dl/ignore-docs:
  docs: move core.excludesFile from git-add to gitignore
  git-clean.txt: clarify ignore pattern files
2019-04-10 02:14:22 +09:00
Junio C Hamano
081b08c45c Merge branch 'ja/dir-rename-doc-markup-fix'
Doc update.

* ja/dir-rename-doc-markup-fix:
  Doc: fix misleading asciidoc formating
2019-04-10 02:14:21 +09:00
Junio C Hamano
0b2094b06b Merge branch 'dl/reset-doc-no-wrt-abbrev'
Doc update.

* dl/reset-doc-no-wrt-abbrev:
  git-reset.txt: clarify documentation
2019-04-10 02:14:21 +09:00
Johannes Schindelin
dbe7b41019 t3301: fix false negative
In 6956f858f6 (notes: implement helpers needed for note copying during
rewrite, 2010-03-12), we introduced a test case that verifies that the
config setting `notes.rewriteRef` can be overridden via the environment
variable `GIT_NOTES_REWRITE_REF`.

Back when it was introduced, it relied on a side effect of an earlier
test case that configured `core.noteRef` to point to `refs/notes/other`.

In 908a320363 (t3301: modernize style, 2014-11-12), this side effect was
removed.

The test case *still* passed, but for the wrong reason: we no longer
overrode the rewrite ref, but there simply was nothing to rewrite
anymore, as the overridden notes ref was "modernized" away.

Let's let that test case pass for the correct reason again.

To make sure of that, let's change the idea of the original test case:
it configured `notes.rewriteRef` to point to the actual notes ref,
forced that to be ignored and then verified that the notes were *not*
rewritten.

By turning that idea upside down (configure the `notes.rewriteRef` to
another notes ref, override it via the environment variable to force the
notes to be copied, and then verify that the notes *were* rewritten), we
make it much harder for that test case to pass for the wrong reason.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-09 20:10:35 +09:00
Philip Oakley
ffea0248bf describe doc: remove '7-char' abbreviation reference
While the minimum is 7-char, the unambiguous length can be longer.

Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 17:24:51 +09:00
Philip Oakley
fe61ccbc35 rerere doc: quote rerere.enabled
Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 17:23:25 +09:00
SZEDER Gábor
a544fb08f8 blame: default to HEAD in a bare repo when no start commit is given
When 'git blame' is invoked without specifying the commit to start
blaming from, it starts from the given file's state in the work tree.
However, when invoked in a bare repository without a start commit,
then there is no work tree state to start from, and it dies with the
following error message:

  $ git rev-parse --is-bare-repository
  true
  $ git blame file.c
  fatal: this operation must be run in a work tree

This is misleading, because it implies that 'git blame' doesn't work
in bare repositories at all, but it does, in fact, work just fine when
it is given a commit to start from.

We could improve the error message, of course, but let's just default
to HEAD in a bare repository instead, as most likely that is what the
user wanted anyway (if they wanted to start from an other commit, then
they would have specified that in the first place).

'git annotate' is just a thin wrapper around 'git blame', so in the
same situation it printed the same misleading error message, and this
patch fixes it, too.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 17:02:26 +09:00
Thomas Gummerer
7cb7283adb ls-files: use correct format string
struct stat_data and struct cache_time both use unsigned ints for all
their members.  However the format string for 'git ls-files --debug'
currently uses %d for formatting these numbers.  This means that we
potentially print these values incorrectly if they are greater than
INT_MAX.

This has been the case since the --debug option was introduced in 'git
ls-files' in 8497421715 ("ls-files: learn a debugging dump format",
2010-07-31).

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 17:01:52 +09:00
Ævar Arnfjörð Bjarmason
0044f7700f gc docs: remove incorrect reference to gc.auto=0
The chance of a repository being corrupted due to a "gc" has nothing
to do with whether or not that "gc" was invoked via "gc --auto", but
whether there's other concurrent operations happening.

This is already noted earlier in the paragraph, so there's no reason
to suggest this here. The user can infer from the rest of the
documentation that "gc" will run automatically unless gc.auto=0 is
set, and we shouldn't confuse the issue by implying that "gc --auto"
is somehow more prone to produce corruption than a normal "gc".

Well, it is in the sense that a blocking "gc" would stop you from
doing anything else in *that* particular terminal window, but users
are likely to have another window, or to be worried about how
concurrent "gc" on a server might cause corruption.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 17:01:10 +09:00
Ævar Arnfjörð Bjarmason
daecbf2261 gc docs: clarify that "gc" doesn't throw away referenced objects
Amend the "NOTES" section to fix up wording that's been with us since
3ffb58be0a ("doc/git-gc: add a note about what is collected",
2008-04-23).

I can't remember when/where anymore (I think Freenode #Git), but at
some point I was having a conversation with someone who was convinced
that "gc" would prune things only referenced by e.g. refs/pull/*, and
pointed to this section as proof.

It turned out that they'd read the "branches and tags" wording here
and thought just refs/{heads,tags}/* and refs/remotes/* etc. would be
kept, which is what we enumerate explicitly.

So let's say "other refs", even though just above we say "objects that
are referenced anywhere in your repository".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 17:01:09 +09:00
Ævar Arnfjörð Bjarmason
7384504881 gc docs: note "gc --aggressive" in "fast-import"
Amend the "PACKFILE OPTIMIZATION" section in "fast-import" to explain
that simply running "git gc --aggressive" after a "fast-import" should
properly optimize the repository. This is simpler and more effective
than the existing "repack" advice (which I'm keeping as it helps
explain things) because it e.g. also packs the newly imported refs.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 17:01:09 +09:00
Ævar Arnfjörð Bjarmason
22d4e3bd19 gc docs: downplay the usefulness of --aggressive
The existing "gc --aggressive" docs come just short of recommending to
users that they run it regularly. I've personally talked to many users
who've taken these docs as an advice to use this option, and have,
usually it's (mostly) a waste of time.

So let's clarify what it really does, and let the user draw their own
conclusions.

Let's also clarify the "The effects [...] are persistent" to
paraphrase a brief version of Jeff King's explanation at [1].

1. https://public-inbox.org/git/20190318235356.GK29661@sigill.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 17:01:09 +09:00
Ævar Arnfjörð Bjarmason
080a4480a0 gc docs: note how --aggressive impacts --window & --depth
Since 07e7dbf0db (gc: default aggressive depth to 50, 2016-08-11) we
somewhat confusingly use the same depth under --aggressive as we do by
default.

As noted in that commit that makes sense, it was wrong to make more
depth the default for "aggressive", and thus save disk space at the
expense of runtime performance, which is usually the opposite of
someone who'd like "aggressive gc" wants.

But that's left us with a mostly-redundant configuration variable, so
let's clearly note in its documentation that it doesn't change the
default.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 17:01:09 +09:00
Ævar Arnfjörð Bjarmason
54d56f52bd gc docs: fix formatting for "gc.writeCommitGraph"
Change the AsciiDoc formatting so that an example of "gc --auto" isn't
rendered as "git-gc(1) --auto", but as "git gc --auto". This is
consistent with the rest of the links and command examples in this
documentation.

The formatting I'm changing was initially introduced in
d5d5d7b641 ("gc: automatically write commit-graph files", 2018-06-27).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 17:01:09 +09:00
Ævar Arnfjörð Bjarmason
d257e0fb0a gc docs: re-flow the "gc.*" section in "config"
Re-flow the "gc.*" section in "config". A previous commit moved this
over from the "gc" docs, but tried to keep as many of the lines
identical to benefit from diff's move detection.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 17:01:09 +09:00
Ævar Arnfjörð Bjarmason
b6a8d09f6d gc docs: include the "gc.*" section from "config" in "gc"
Rather than duplicating the documentation for the various "gc" options
let's include the "gc" docs from git-config. They were mostly better
already, and now we don't have the same docs in two places with subtly
different wording.

In the cases where the git-gc(1) docs were saying something the "gc"
docs in git-config(1) didn't cover move the relevant section over to
the git-config(1) docs.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 17:01:09 +09:00
Jonathan Tan
7fbbcb21b1 diff: batch fetching of missing blobs
When running a command like "git show" or "git diff" in a partial clone,
batch all missing blobs to be fetched as one request.

This is similar to c0c578b33c ("unpack-trees: batch fetching of missing
blobs", 2017-12-08), but for another command.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 14:04:50 +09:00
SZEDER Gábor
d53ba841d4 progress: assemble percentage and counters in a strbuf before printing
The following patches in this series want to handle the progress bar's
title and changing parts (i.e. the counter and the optional percentage
and throughput combined) differently, and need to know the length
of the changing parts of the previously displayed progress bar.

To prepare for those changes assemble the changing parts in a separate
strbuf kept in 'struct progress' before printing.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-05 15:02:06 +09:00
SZEDER Gábor
9219d12777 progress: make display_progress() return void
Ever since the progress infrastructure was introduced in 96a02f8f6d
(common progress display support, 2007-04-18), display_progress() has
returned an int, telling callers whether it updated the progress bar
or not.  However, this is:

  - useless, because over the last dozen years there has never been a
    single caller that cared about that return value.

  - not quite true, because it doesn't print a progress bar when
    running in the background, yet it returns 1; see 85cb8906f0
    (progress: no progress in background, 2015-04-13).

The related display_throughput() function returned void already upon
its introduction in cf84d51c43 (add throughput to progress display,
2007-10-30).

Let's make display_progress() return void, too.  While doing so
several return statements in display() become unnecessary, remove
them.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-05 15:02:06 +09:00
SZEDER Gábor
37fc8cb15f ci: fix AsciiDoc/Asciidoctor stderr check in the documentation build job
In 'ci/test-documentation.sh' we save the standard error of 'make
doc', and, in an attempt to make sure that neither AsciiDoc nor
Asciidoctor printed any warnings, we check the emptiness of the
resulting file with '! test -s stderr.log'.  This check has never
actually worked, because in our 'ci/*' build scripts we rely on 'set
-e' aborting the build job when a command exits with error, and,
unfortunately, the combination of the two doesn't work as intended.
According to POSIX [1]:

  "The -e setting shall be ignored when executing [...] a pipeline
  beginning with the ! reserved word" [2]

Watch and learn:

  $ echo unexpected >file
  $ ( set -e; ! test -s file ; echo "should not reach this" ) ; echo $?
  should not reach this
  0

This is why we haven't noticed the warnings from Asciidoctor that were
fixed in the first patches of this patch series, though some of them
were already there in the build of v2.18.0-rc0 [3].

Check the emptiness of that file with 'test ! -s' instead, which works
properly with 'set -e':

  $ ( set -e; test ! -s file ; echo "should not reach this" ) ; echo $?
  1

Furthermore, dump the contents of that file to the log for our
convenience, so if it were to unexpectedly end up being non-empty,
then we wouldn't have to scroll through all that long build log
looking for warnings, but could see them right away near the end of
the log.

Note that we are only really interested in the standard error of
AsciiDoc and Asciidoctor, but by saving the stderr of 'make doc' we
also save any error output from the make rules.  Currently there is
only one such line: we build the docs with Asciidoctor right after a
'make clean', meaning that 'make USE_ASCIIDOCTOR=1 doc' always starts
with running 'GIT-VERSION-GEN', which in turn prints the version to
stderr.  A 'sed' command was supposed to remove this version line to
prevent it from triggering that (previously defunct) emptiness check,
but, unfortunately, this command doesn't work as intended, either,
because it leaves the file to be checked intact, but that defunct
emptiness check hid this issue, too...  Furthermore, in the near
future there will be an other line on stderr, because commit
9a71722b4d (Doc: auto-detect changed build flags, 2019-03-17) in the
currently cooking branch 'ma/doc-diff-doc-vs-doctor-comparison' will
print "* new asciidoc flags" at the beginning of both 'make doc'
invokations.

Extend that 'sed' command to remove this line, too, wrap it in a
helper function so the output of both 'make doc' is filtered the same
way, and change its invokation to actually write the logfile to be
checked.

[1] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#set

[2] POSIX doesn't discuss the meaning of '! cmd' in case of simple
    commands, but it defines that "A pipeline is a sequence of one or
    more commands separated by the control operator '|'", so
    apparently a simple command is considered as pipeline as well.

    http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_02

[3] https://travis-ci.org/git/git/jobs/385932007#L1463

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-05 14:41:16 +09:00
SZEDER Gábor
615a6c37e1 ci: stick with Asciidoctor v1.5.8 for now
The recent release of Asciidoctor v2.0.0 broke our documentation
build job on Travis CI, where we 'gem install asciidoctor', which
always brings us the latest and (supposedly) greatest.  Alas, we are
not ready for that just yet, because it removed support for DocBook
4.5, and we have been requiring that particular DocBook version to
build 'user-manual.xml' with Asciidoctor, resulting in:

  ASCIIDOC user-manual.xml
  asciidoctor: FAILED: missing converter for backend 'docbook45'. Processing aborted.
  Use --trace for backtrace
  make[1]: *** [user-manual.xml] Error 1

Unfortunately, we can't simply switch to DocBook 5 right away, as
doing so leads to validation errors from 'xmlto', and working around
those leads to yet another errors... [1]

So let's stick with Asciidoctor v1.5.8 (latest stable release before
v2.0.0) in our documentation build job on Travis CI for now, until we
figure out how to deal with the fallout from Asciidoctor v2.0.0.

[1] https://public-inbox.org/git/20190324162131.GL4047@pobox.com/

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-05 14:40:40 +09:00
Denton Liu
0cf2b0a04b cocci: FLEX_ALLOC_MEM to FLEX_ALLOC_STR
Ensure that a FLEX_MALLOC_MEM that uses 'strlen' for its 'len' uses
FLEX_ALLOC_STR instead, since these are equivalent forms.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-04 18:22:30 +09:00
Denton Liu
577314caae midx.c: convert FLEX_ALLOC_MEM to FLEX_ALLOC_STR
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-04 18:22:27 +09:00
Jeff King
8320b1dbe7 revision: use a prio_queue to hold rewritten parents
This patch fixes a quadratic list insertion in rewrite_one() when
pathspec limiting is combined with --parents. What happens is something
like this:

  1. We see that some commit X touches the path, so we try to rewrite
     its parents.

  2. rewrite_one() loops forever, rewriting parents, until it finds a
     relevant parent (or hits the root and decides there are none). The
     heavy lifting is done by process_parent(), which uses
     try_to_simplify_commit() to drop parents.

  3. process_parent() puts any intermediate parents into the
     &revs->commits list, inserting by commit date as usual.

So if commit X is recent, and then there's a large chunk of history that
doesn't touch the path, we may add a lot of commits to &revs->commits.
And insertion by commit date is O(n) in the worst case, making the whole
thing quadratic.

We tried to deal with this long ago in fce87ae538 (Fix quadratic
performance in rewrite_one., 2008-07-12). In that scheme, we cache the
oldest commit in the list; if the new commit to be added is older, we
can start our linear traversal there. This often works well in practice
because parents are older than their descendants, and thus we tend to
add older and older commits as we traverse.

But this isn't guaranteed, and in fact there's a simple case where it is
not: merges. Imagine we look at the first parent of a merge and see a
very old commit (let's say 3 years old). And on the second parent, as we
go back 3 years in history, we might have many commits. That one
first-parent commit has polluted our oldest-commit cache; it will remain
the oldest while we traverse a huge chunk of history, during which we
have to fall back to the slow, linear method of adding to the list.

Naively, one might imagine that instead of caching the oldest commit,
we'd start at the last-added one. But that just makes some cases faster
while making others slower (and indeed, while it made a real-world test
case much faster, it does quite poorly in the perf test include here).
Fundamentally, these are just heuristics; our worst case is still
quadratic, and some cases will approach that.

Instead, let's use a data structure with better worst-case performance.
Swapping out revs->commits for something else would have repercussions
all over the code base, but we can take advantage of one fact: for the
rewrite_one() case, nobody actually needs to see those commits in
revs->commits until we've finished generating the whole list.

That leaves us with two obvious options:

  1. We can generate the list _unordered_, which should be O(n), and
     then sort it afterwards, which would be O(n log n) total. This is
     "sort-after" below.

  2. We can insert the commits into a separate data structure, like a
     priority queue. This is "prio-queue" below.

I expected that sort-after would be the fastest (since it saves us the
extra step of copying the items into the linked list), but surprisingly
the prio-queue seems to be a bit faster.

Here are timings for the new p0001.6 for all three techniques across a
few repositories, as compared to master:

master              cache-last                sort-after              prio-queue
--------------------------------------------------------------------------------------------
GIT_PERF_REPO=git.git
0.52(0.50+0.02)      0.53(0.51+0.02)  +1.9%   0.37(0.33+0.03) -28.8%  0.37(0.32+0.04) -28.8%

GIT_PERF_REPO=linux.git
20.81(20.74+0.07)   20.31(20.24+0.07) -2.4%   0.94(0.86+0.07) -95.5%  0.91(0.82+0.09) -95.6%

GIT_PERF_REPO=llvm-project.git
83.67(83.57+0.09)    4.23(4.15+0.08) -94.9%   3.21(3.15+0.06) -96.2%  2.98(2.91+0.07) -96.4%

A few items to note:

  - the cache-list tweak does improve the bad case for llvm-project.git
    that started my digging into this problem. But it performs terribly
    on linux.git, barely helping at all.

  - the sort-after and prio-queue techniques work well. They approach
    the timing for running without --parents at all, which is what you'd
    expect (see below for more data).

  - prio-queue just barely outperforms sort-after. As I said, I'm not
    really sure why this is the case, but it is. You can see it even
    more prominently in this real-world case on llvm-project.git:

      git rev-list --parents 07ef786652e7 -- llvm/test/CodeGen/Generic/bswap.ll

    where prio-queue routinely outperforms sort-after by about 7%. One
    guess is that the prio-queue may just be more efficient because it
    uses a compact array.

There are three new perf tests:

  - "rev-list --parents" gives us a baseline for running with --parents.
    This isn't sped up meaningfully here, because the bad case is
    triggered only with simplification. But it's good to make sure we
    don't screw it up (now, or in the future).

  - "rev-list -- dummy" gives us a baseline for just traversing with
    pathspec limiting. This gives a lower bound for the next test (and
    it's also a good thing for us to be checking in general for
    regressions, since we don't seem to have any existing tests).

  - "rev-list --parents -- dummy" shows off the problem (and our fix)

Here are the timings for those three on llvm-project.git, before and
after the fix:

Test                                 master              prio-queue
------------------------------------------------------------------------------
0001.3: rev-list --parents           2.24(2.12+0.12)     2.22(2.11+0.11) -0.9%
0001.5: rev-list -- dummy            2.89(2.82+0.07)     2.92(2.89+0.03) +1.0%
0001.6: rev-list --parents -- dummy  83.67(83.57+0.09)   2.98(2.91+0.07) -96.4%

Changes in the first two are basically noise, and you can see we
approach our lower bound in the final one.

Note that we can't fully get rid of the list argument from
process_parents(). Other callers do have lists, and it would be hard to
convert them. They also don't seem to have this problem (probably
because they actually remove items from the list as they loop, meaning
it doesn't grow so large in the first place). So this basically just
drops the "cache_ptr" parameter (which was used only by the one caller
we're fixing here) and replaces it with a prio_queue. Callers are free
to use either data structure, depending on what they're prepared to
handle.

Reported-by: Björn Pettersson A <bjorn.a.pettersson@ericsson.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-04 18:21:54 +09:00
David Aguilar
f57b2ae348 contrib/completion: add smerge to the mergetool completion candidates
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-04 18:21:26 +09:00
David Aguilar
eb12adc74c mergetools: add support for smerge (Sublime Merge)
Teach difftool and mergetool about the Sublime Merge "smerge" command.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-04 18:21:25 +09:00
David Kastrup
f892014943 blame.c: don't drop origin blobs as eagerly
When a parent blob already has chunks queued up for blaming, dropping
the blob at the end of one blame step will cause it to get reloaded
right away, doubling the amount of I/O and unpacking when processing a
linear history.

Keeping such parent blobs in memory seems like a reasonable optimization
that should incur additional memory pressure mostly when processing the
merges from old branches.

Signed-off-by: David Kastrup <dak@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-03 16:45:26 +09:00
Nguyễn Thái Ngọc Duy
b5a0bd694c read-tree.txt: clarify --reset and worktree changes
The description of --reset stays true to the first implementation in
438195cced (git-read-tree: add "--reset" flag, 2005-06-09). That is,
--reset discards unmerged entries. Or at least true to the commit
message because I can't be sure about read-tree's behavior regarding
local changes.

But in fcc387db9b (read-tree -m -u: do not overwrite or remove untracked
working tree files., 2006-05-17), it is clear that "-m -u" tries to keep
local changes, while --reset is singled out and will keep overwriting
worktree files. It's not stated in the commit message, but it's obvious
from the patch.

I went this far back not because I had a lot of free time, but because I
did not trust my reading of unpack-trees.c code. So far I think the
related changes in history agree with my understanding of the current
code, that "--reset" loses local changes.

This behavior is not mentioned in git-read-tree.txt, even though
old-timers probably can just guess it based on the "reset" name. Update
git-read-tree.txt about this.

Side note. There's another change regarding --reset that is not
obviously about local changes, b018ff6085 (unpack-trees: fix "read-tree
-u --reset A B" with conflicted index, 2012-12-29). But I'm pretty sure
this is about the first function of --reset, to discard unmerged entries
correctly.

PS. The patch changes one more line than necessary because the first
line uses spaces instead of tab.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-02 10:56:02 +09:00