dumb-http walker has been updated to share more error recovery
strategy with the normal codepath.
* jk/http-walker-status-fix:
http: use normalize_curl_result() instead of manual conversion
http: normalize curl results for dumb loose and alternates fetches
http: factor out curl result code normalization
"git multi-pack-index verify" did not scale well with the number of
packfiles, which is being improved.
* jh/midx-verify-too-many-packs:
midx: during verify group objects by packfile to speed verification
midx: add progress indicators in multi-pack-index verify
trace2:data: add trace2 data to midx
progress: add sparse mode to force 100% complete message
A corner case bug in the refs API has been corrected.
* jk/refs-double-abort:
refs/files-backend: don't look at an aborted transaction
refs/files-backend: handle packed transaction prepare failure
The completion helper code now pays attention to repository-local
configuration (when available), which allows --list-cmds to honour
a repository specific setting of completion.commands, for example.
* tz/completion:
completion: use __git when calling --list-cmds
completion: fix multiple command removals
t9902: test multiple removals via completion.commands
git: read local config in --list-cmds
Dev support update to make it easier to compare two formatted
results from our documentation.
* ma/doc-diff-doc-vs-doctor-comparison:
doc-diff: add `--cut-header-footer`
doc-diff: support diffing from/to AsciiDoc(tor)
doc-diff: let `render_tree()` take an explicit directory name
Doc: auto-detect changed build flags
"git checkout -f <branch>" while the index has an unmerged path
incorrectly left some paths in an unmerged state, which has been
corrected.
* nd/checkout-f-while-conflicted-fix:
unpack-trees: fix oneway_merge accidentally carry over stage index
"git format-patch" used overwrite an existing patch/cover-letter
file. A new "--no-clobber" option stops it.
* jc/format-patch-error-check:
format-patch: notice failure to open cover letter for writing
builtin/log: downcase the beginning of error messages
A corner-case object name ambiguity while the sequencer machinery
is working (e.g. "rebase -i -x") has been (half) fixed.
* js/get-short-oid-drop-cache:
get_oid(): when an object was not found, try harder
sequencer: move stale comment into correct location
sequencer: improve error message when an OID could not be parsed
rebase -i: demonstrate obscure loose object cache bug
"git init" forgot to read platform-specific repository
configuration, which made Windows port to ignore settings of
core.hidedotfiles, for example.
* js/init-db-update-for-mingw:
mingw: respect core.hidedotfiles = false in git-init again
Error messages given from the http transport have been updated so
that they can be localized.
* js/remote-curl-i18n:
remote-curl: mark all error messages for translation
remote-http transport did not anonymize URLs reported in its error
messages at places.
* js/anonymize-remote-curl-diag:
curl: anonymize URLs in error messages and warnings
Documentation mark-up fixes.
* ma/asciidoctor-fixes-more:
Documentation: turn middle-of-line tabs into spaces
git-svn.txt: drop escaping '\' that ends up being rendered
git.txt: remove empty line before list continuation
config/fsck.txt: avoid starting line with dash
config/diff.txt: drop spurious backtick
Build fix around use of asciidoctor instead of asciidoc
* ma/asciidoctor-fixes:
asciidoctor-extensions: fix spurious space after linkgit
Documentation/Makefile: add missing dependency on asciidoctor-extensions
Documentation/Makefile: add missing xsl dependencies for manpages
Help developers by making it easier to run most of the tests under
different versions of over-the-wire protocols.
* jt/test-protocol-version:
t5552: compensate for v2 filtering ref adv.
tests: fix protocol version for overspecifications
t5700: only run with protocol version 1
t5512: compensate for v0 only sending HEAD symrefs
t5503: fix overspecification of trace expectation
tests: always test fetch of unreachable with v0
t5601: check ssh command only with protocol v0
tests: define GIT_TEST_PROTOCOL_VERSION
Once upon a time the force flag meant something when writing info/refs,
but it hasn't done anything since 60d0526aaa (Unoptimize info/refs
creation., 2005-09-14).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing objects/info/packs, we use the basename of each pack
(i.e., just the "pack-1234abcd.pack" part). We compute that manually by
adding "objdirlen + 6" to the name.
This _should_ work consistently, as we do not include non-local packs,
meaning everything should be in $objdir/pack/. Before f13d7db4af
(server-info.c: use pack_local like everybody else., 2005-12-05), this
was definitely true, since we computed "local" based on comparing the
objdir string. Since then, we're relying on the code on packfile.c to
match our expectations of p->pack_name and p->local.
I think our expectations do still hold today, but we can be a bit more
defensive by just using pack_basename() to get the base. That
future-proofs us, and should hopefully be more obviously safe to
somebody reading the code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We keep an array of struct pointers, with each one representing a single
packfile. But for some reason there is a nr_alloc parameter inside each
struct, which has never been used.
This is probably cruft left over from development, where we might have
wanted a nr_alloc to dynamically grow the list. But as it turns out, we
do not dynamically grow the list at all, but rather count up the total
number of packs and use that as a maximum size. So while we're thinking
of this, let's add an assert() that documents (and checks!) that our
allocation and fill loops stay in sync.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This old code uses fgets with a fixed-size buffer. Let's use a strbuf
instead, so we don't have to wonder if "1000" is big enough, or what
happens if we see a long line.
This also lets us drop our custom code to trim the newline.
Probably nobody actually cares about the 1000-char limit (after all, the
lines generally only say "P pack-[0-9a-f]{40}.pack"), so this is mostly
just about cleanup/readability.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have two exits from the function: either we jump to the out_stale
label or not. But in both exits we repeat our cleanup, and the only
difference is our return value. Let's just use a variable for the return
value to avoid repeating ourselves.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're writing out a new objects/info/packs file, we read back the
old one to try to keep the ordering the same. When we see a line
starting with "P", we expect "P pack-1234..." and blindly jump to "line
+ 2" to parse the pack name. If we saw a line with _just_ "P" and
nothing else, we'd jump past the end of the buffer and start reading
arbitrary memory.
This shouldn't be a big attack vector, as the files are local to the
repository and written by us, but it's clearly worth fixing (we do read
remote copies of the file for dumb-http fetches, but using a totally
different parser!).
Let's instead use skip_prefix() here, which avoids pointer arithmetic
altogether. Note that this converts our switch statement to an if/else
chain, making it slightly more verbose. But it will also make it easier
to do a few follow-on cleanups.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can use skip_prefix() and parse_oid_hex() to continuously increment
our pointer, rather than dealing with magic numbers. This also fixes a
few small shortcomings:
- if we see a line with the right prefix, suffix, and length, i.e.
matching /P pack-.{40}.pack\n/, we'll interpret the middle part as
hex without checking if it could be parsed. This could lead to us
looking at uninitialized garbage in the hash array. In practice this
means we'll just make a garbage request to the server which will
fail, though it's interesting that a malicious server could convince
us to leak 40 bytes of uninitialized stack to them.
- the current code is picky about seeing a newline at the end of file,
but we can easily be more liberal
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we have a multi-pack-index that covers many packfiles, we try to
avoid opening the .idx for those packfiles. To do that we feed the pack
name to midx_contains_pack(). But that function wants to see only the
basename, which we compute using strrchr() to find the final slash. But
that leaves an extra "/" at the start of our string.
We can fix this by incrementing the pointer. That also raises the
question of what to do when the name does not have a '/' at all. This
should generally not happen (we always find files in "pack/"), but it
doesn't hurt to be defensive here.
Let's wrap all of that up in a helper function and make it publicly
available, since a later patch will need to use it, too.
The tests don't notice because there's nothing about opening those .idx
files that would cause us to give incorrect output. It's just a little
slower. The new test checks this case by corrupting the covered .idx,
and then making sure we don't complain about it.
We also have to tweak t5570, which intentionally corrupts a .idx file
and expects us to notice it. When run with GIT_TEST_MULTI_PACK_INDEX,
this will fail since we now will (correctly) not bother opening the .idx
at all. We can fix that by unconditionally dropping any midx that's
there, which ensures we'll have to read the .idx.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A midx file (and the struct we parse from it) contains a list of all of
the covered packfiles, mentioned by their ".idx" names (e.g.,
"pack-1234.idx", etc). And thus calls to midx_contains_pack() expect
callers to provide the idx name.
This works for most of the calls, but the one in open_packed_git_1()
tries to feed a packed_git->pack_name, which is the ".pack" name,
meaning we'll never find a match (even if the pack is covered by the
midx).
We can fix this by converting the ".pack" to ".idx" in the caller.
However, that requires allocating a new string. Instead, let's make
midx_contains_pack() a bit friendlier, and allow it take _either_ the
.pack or .idx variant.
All cleverness in the matching code is credited to René. Bugs are mine.
There's no test here, because while this does fix _a_ bug, it's masked
by another bug in that same caller. That will be covered (with a test)
in the next patch.
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The cat-file --buffer option is the default already when using
--batch-all-objects. It doesn't hurt to specify it, but it's nice for
the test scripts to model good usage.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no such argument as "--unsorted"; it's spelled "--unordered".
But our test failed to notice that cat-file didn't run at all because:
1. It lost the exit code of git on the left-hand side of a pipe.
2. It was comparing two runs of the broken invocation with and without
a particular config variable (and indeed, both cases produced no
output!).
Let's fix the option, but also tweak the helper function to check the
exit code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can't create a pack revindex if we haven't actually looked at the
index. Normally we would never get as far as creating a revindex without
having already been looking in the pack, so this code never bothered to
double-check that pack->index_data had been loaded.
But with the new multi-pack-index feature, many code paths might not
load the individual pack .idx at all (they'd find objects via the midx
and then open the .pack, but not its index).
This can't yet be triggered in practice, because a bug in the midx code
means we accidentally open up the individual .idx files anyway. But in
preparation for fixing that, let's have the revindex code check that
everything it needs has been loaded.
In most cases this will just be a quick noop. But note that this does
introduce a possibility of error (if we have to open the index and it's
corrupt), so load_pack_revindex() now returns a result code, and callers
need to handle the error.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As CodingGuidelines recommends, we do not need an "extern" when
declaring a public function. Let's drop these. Note that we leave the
extern on report_garbage(), as that is actually a function pointer, not
a function itself.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some of the recently added progress indicators have quite long titles,
which might be even longer when translated to some languages, and when
they are shown while operating on bigger repositories, then the
progress bar grows longer than the default 80 column terminal width.
When the progress bar exceeds the width of the terminal it gets
line-wrapped, and after that the CR at the end doesn't return to the
beginning of the progress bar, but to the first column of its last
line. Consequently, the first line of the previously shown progress
bar is not overwritten by the next, and we end up with a bunch of
truncated progress bar lines scrolling past:
$ LANG=es_ES.UTF-8 git commit-graph write
Encontrando commits para commit graph entre los objetos empaquetados: 2% (1599
Encontrando commits para commit graph entre los objetos empaquetados: 3% (1975
Encontrando commits para commit graph entre los objetos empaquetados: 4% (2633
Encontrando commits para commit graph entre los objetos empaquetados: 5% (3292
[...]
Prevent this by breaking progress bars after the title once they
exceed the width of the terminal, so the counter and optional
percentage and throughput, i.e. all changing parts, are on the last
line. Subsequent updates will from then on only refresh the changing
parts, but not the title, and it will look like this:
$ LANG=es_ES.UTF-8 ~/src/git/git commit-graph write
Encontrando commits para commit graph entre los objetos empaquetados:
100% (6584502/6584502), listo.
Calculando números de generación de commit graph: 100% (824705/824705), listo.
Escribiendo commit graph en 4 pasos: 100% (3298820/3298820), listo.
Note that the number of columns in the terminal is cached by
term_columns(), so this might not kick in when it should when a
terminal window is resized while the operation is running.
Furthermore, this change won't help if the terminal is so narrow that
the counters don't fit on one line, but I would put this in the "If it
hurts, don't do it" box.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the progress bar includes throughput, its length can shorten as
the unit of display changes from KiB to MiB. To cover up remnants of
the previous progress bar when such a change of units happens we
always print three spaces at the end of the progress bar.
Alas, covering only three characters is not quite enough: when both
the total and the throughput happen to change units from KiB to MiB in
the same update, then the progress bar's length is shortened by four
characters (or maybe even more!):
Receiving objects: 25% (2901/11603), 772.01 KiB | 733.00 KiB/s
Receiving objects: 27% (3133/11603), 1.43 MiB | 1.16 MiB/s s
and a stray 's' is left behind.
So instead of hard-coding the three characters to cover, let's compare
the length of the current progress bar with the previous one, and
cover up as many characters as needed.
Sure, it would be much simpler to just print more spaces at the end of
the progress bar, but this approach is more future-proof, and it won't
print extra spaces when none are needed, notably when the progress bar
doesn't show throughput and thus never shrinks.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's command-line parsers support uniquely abbreviated options, e.g.
`git init --ba` would automatically expand `--ba` to `--bare`.
This is a very convenient feature in every day life for Git users, in
particular when tab completion is not available.
However, it is not a good idea to rely on that in Git's test suite, as
something that is a unique abbreviation of a command line option today
might no longer be a unique abbreviation tomorrow.
For example, if a future contribution added a new mode
`git init --babyproofing` and a previously-introduced test case used the
fact that `git init --ba` expanded to `git init --bare`, that future
contribution would now have to touch seemingly unrelated tests just to
keep the test suite from failing.
So let's disallow abbreviated options in the test suite by default.
Note: for ease of implementation, this patch really only touches the
`parse-options` machinery: more and more hand-rolled option parsers are
converted to use that internal API, and more and more scripts are
converted to built-ins (naturally using the parse-options API, too), so
in practice this catches most issues, and is definitely the biggest bang
for the buck.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the early days of Git, the progress code allocates its struct with
a bare malloc(), not xmalloc(). If the allocation fails, we just avoid
showing progress at all.
While perhaps a noble goal not to fail the whole operation because of
optional progress, in practice:
1. Any failure to allocate a few dozen bytes here means critical path
allocations are likely to fail, too.
2. These days we use a strbuf for throughput progress (and there's a
patch under discussion to do the same for non-throughput cases,
too). And that uses xmalloc() under the hood, which means we'd
still die on some allocation failures.
Let's switch to xmalloc(). That makes us consistent with the rest of Git
and makes it easier to audit for other (less careful) bare mallocs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of xdiff uses a bare malloc() to allocate memory, and returns an
error when we get NULL. However, there are a few spots which don't check
the return value and may segfault, including at least xdl_merge() and
xpatience.c's find_longest_common_sequence().
Let's use xmalloc() everywhere instead, so that we get a graceful die()
for these cases, without having to do further auditing. This does mean
the existing cases which check errors will now die() instead of
returning an error up the stack. But:
- that's how the rest of Git behaves already for malloc errors
- all of the callers of xdi_diff(), etc, die upon seeing an error
So while we might one day want to fully lib-ify the diff code and make
it possible to use as part of a long-running process, we're not close to
that now. And because we're just tweaking the xdl_malloc() macro here,
we're not really moving ourselves any further away from that. We
could, for example, simplify some of the functions which handle malloc()
errors which can no longer occur. But that would probably be taking us
in the wrong direction.
This also makes our malloc handling more consistent with the rest of
Git, including enforcing GIT_ALLOC_LIMIT and trying to reclaim pack
memory when needed.
Reported-by: 王健强 <jianqiang.wang@securitygossip.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the xdiff library was not originally part of Git, it does its own
system includes. Let's instead use git-compat-util, which has two
benefits:
1. It adjusts for any system-specific quirks in how or what we should
include (though xdiff's needs are light enough that this hasn't
been a problem in the past).
2. It lets us use wrapper functions like xmalloc().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
test-prio-queue.c doesn't check the return value of malloc, and could
segfault.
It's unlikely for this to matter in practice; it's a small allocation,
and this code isn't even installed alongside the rest of Git. But let's
use xmalloc(), which makes auditing for other accidental uses of bare
malloc() easier.
Reported-by: 王健强 <jianqiang.wang@securitygossip.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the mention of symlinks from the test description because
several tests that are not related to symlinks have been added since
this file was introduced long ago.
Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In f9e6c64958 (untracked cache: load from UNTR index extension,
2015-03-08), code was added to read back the untracked cache from an
index extension.
Probably in the endeavor to avoid the `calloc()` implied by
`FLEX_ALLOC_STR()` (it is hard to know why exactly, the commit message
of that commit is a bit parsimonious with information), it calls
`malloc()` manually and then `memcpy()`s the bits and pieces into place.
It allocates the size of `struct untracked_cache_dir` plus the string
length of the untracked file name, then copies the information in two
steps: first the fixed-size metadata, then the name. And here lies the
rub: it includes the trailing NUL byte in the name.
If `FLEX_ARRAY` is defined as 0, this results in a buffer overrun.
To fix this, let's just add 1, for the trailing NUL byte. Technically,
this overallocates on platforms where `FLEX_ARRAY` is 1, but it should
not matter much in reality, as `malloc()` usually overallocates anyway,
unless the size to allocate aligns exactly with some internal chunk size
(see below for more on that).
The real strange thing is that neither valgrind nor DrMemory catches
this bug. In this developer's tests, a `memcpy()` (but not a
`memset()`!) could write up to 4 bytes after the allocated memory range
before valgrind would start reporting an issue.
However, when running Git built with nedmalloc as allocator, under rare
conditions (and inconsistently at that), this bug triggered an `abort()`
because nedmalloc rounds up the size to be `malloc()`ed to a multiple of
a certain chunk size, then adds a few bytes to be used for storing some
internal state. If there is no rounding up to do (because the size is
already a multiple of that chunk size), and if the buffer is overrun as
in the code patched in this commit, the internal state is corrupted.
The scenario that triggered this here bug fix entailed a git.git
checkout with an extra copy of the source code in an untracked
subdirectory, meaning that there was an untracked subdirectory called
"thunderbird-patch-inline" whose name's length is exactly 24 bytes,
which, added to the size of above-mentioned `struct untracked_cache_dir`
that weighs in with 104 bytes on a 64-bit system, amounts to 128,
aligning perfectly with nedmalloc's chunk size.
As there is no obvious way to trigger this bug reliably, on all
platforms supported by Git, and as the bug is obvious enough, this patch
comes without a regression test.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This teaches git-submodule the set-branch subcommand which allows the
branch of a submodule to be set through a porcelain command without
having to manually manipulate the .gitmodules file.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>