63859 Commits

Author SHA1 Message Date
René Scharfe
fe7fe62d8d grep: report missing left operand of --and
Git grep allows combining two patterns with --and.  It checks and
reports if the second pattern is missing when compiling the expression.
A missing first pattern, however, is only reported later at match time.
Thus no error is returned if no matching is done, e.g. because no file
matches the also given pathspec.

When that happens we get an expression tree with a GREP_NODE_AND node
and a NULL pointer to the missing left child.  free_pattern_expr()
tries to dereference it during the cleanup at the end, which results
in a segmentation fault.

Fix this by verifying the presence of the left operand at expression
compilation time.
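
As a rough illustration only (this is not the grep.c code), the idea of
rejecting a missing operand while compiling, rather than at match or
cleanup time, can be sketched in Python. The Node class, compile_and()
and the error strings are made up for this example, and the toy only
understands patterns joined by --and:

```python
# Toy compiler for token lists like ["a", "--and", "b"].
# All names here are hypothetical; nothing is taken from grep.c.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    kind: str                       # "ATOM" or "AND"
    pattern: Optional[str] = None   # only set for ATOM nodes
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def compile_and(tokens: List[str]) -> Optional[Node]:
    """Build a left-associative AND tree, validating both operands."""
    node: Optional[Node] = None
    i = 0
    while i < len(tokens):
        if tokens[i] == "--and":
            # Check the left operand here, so that a half-built AND
            # node can never reach matching or cleanup.
            if node is None:
                raise ValueError("--and not preceded by pattern expression")
            if i + 1 >= len(tokens) or tokens[i + 1] == "--and":
                raise ValueError("--and not followed by pattern expression")
            node = Node("AND", left=node, right=Node("ATOM", tokens[i + 1]))
            i += 2
        elif node is None:
            node = Node("ATOM", tokens[i])
            i += 1
        else:
            raise ValueError("this toy only supports patterns joined by --and")
    return node
```

With this shape, compile_and(["--and", "b"]) fails right away instead
of handing a tree with an empty left child to later stages.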

Reported-by: Matthew Hughes <matthewhughes934@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-30 14:19:03 -07:00
Eric Wong
dc05929411 xmmap: inform Linux users of tuning knobs on ENOMEM
Linux users may benefit from additional information on how to
avoid ENOMEM from mmap despite the system having enough RAM to
accommodate them.  We can't reliably unmap pack windows to work
around the issue since malloc and other library routines may
mmap without our knowledge.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-29 23:14:25 -07:00
Ævar Arnfjörð Bjarmason
c49a177bec test-lib.sh: set COLUMNS=80 for --verbose repeatability
Some tests will fail under --verbose because while we've unset COLUMNS
since b1d645b58a (tests: unset COLUMNS inherited from environment,
2012-03-27), we also look for the columns with an ioctl(..,
TIOCGWINSZ, ...) on some platforms. By setting COLUMNS again we
preempt the TIOCGWINSZ lookup in pager.c's term_columns(), which takes
COLUMNS over TIOCGWINSZ.
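
The same "environment first, ioctl second" precedence happens to exist
in Python's standard library, which gives a compact way to see the
effect. This is only an analogy and not pager.c's term_columns(); the
helper name width_with_columns() is made up for the example:

```python
import os
import shutil


def width_with_columns(value: int) -> int:
    """Report the terminal width seen while COLUMNS is set to `value`."""
    old = os.environ.get("COLUMNS")
    os.environ["COLUMNS"] = str(value)
    try:
        # shutil.get_terminal_size() checks COLUMNS before it queries
        # the terminal, so the real window size is never consulted for
        # the width here.
        return shutil.get_terminal_size().columns
    finally:
        # Restore the caller's environment.
        if old is None:
            del os.environ["COLUMNS"]
        else:
            os.environ["COLUMNS"] = old
```

Pinning the variable makes the result independent of whatever
terminal, if any, the test run is attached to.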

This fixes t0500-progress-display.sh, which broke because of a
combination of this issue and the progress output reacting to the
column width since 545dc345eb (progress: break too long progress bar
lines, 2019-04-12). The t5324-split-commit-graph.sh fails in a similar
manner due to progress output, see [1] for details.

The issue is not specific to progress.c; the diff code also checks
COLUMNS and some of its tests can be made to fail in a similar
manner[2]. Anything that invokes a pager is potentially affected.

See ea77e675e5 (Make "git help" react to window size correctly,
2005-12-18) and ad6c3739a3 (pager: find out the terminal width before
spawning the pager, 2012-02-12) for how the TIOCGWINSZ code ended up
in pager.c.

1. http://lore.kernel.org/git/20210624051253.GG6312@szeder.dev
2. https://lore.kernel.org/git/20210627074419.GH6312@szeder.dev/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-29 13:06:30 -07:00
Ævar Arnfjörð Bjarmason
7b76d6bf22 Makefile: add and use the ".DELETE_ON_ERROR" flag
Use the GNU make ".DELETE_ON_ERROR" flag in our main Makefile, as we
already do in the Documentation/Makefile since db10fc6c09 (doc:
simplify Makefile using .DELETE_ON_ERROR, 2021-05-21).

Now if a command to make X fails, X will be removed. The default
behavior of GNU make is to only do so if "make" itself is interrupted
with a signal.

E.g. if we now intentionally break one of the rules with:

    -       mv $@+ $@
    +       mv $@+ $@ && \
    +       false

We'll get output like:

    $ make git
        CC git.o
        LINK git
    make: *** [Makefile:2179: git] Error 1
    make: *** Deleting file 'git'
    $ file git
    git: cannot open `git' (No such file or directory)

Before this change we'd leave the file in place under this
scenario.

As in db10fc6c09 this allows us to remove patterns of removing
leftover $@ files at the start of rules, since previous failing runs
of the Makefile won't have left those littered around anymore.

I'm not as confident that we should be replacing the "mv $@+ $@"
pattern entirely, since that means that external programs or one of
our other Makefiles might race and get partial content.
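
To illustrate why the "mv $@+ $@" pattern guards against partial
content, here is a small Python sketch of the same write-then-rename
idea. It is not derived from the Makefile; build_target() and its
parameters are hypothetical:

```python
import os
from typing import Callable


def build_target(target: str, produce: Callable[[], bytes]) -> None:
    """Write to "<target>+" first, then rename it into place."""
    tmp = target + "+"          # same naming idea as "$@+"
    try:
        with open(tmp, "wb") as f:
            f.write(produce())
        # A rename within one filesystem is atomic on POSIX, so readers
        # see either the old target or the complete new one.
        os.replace(tmp, target)
    except BaseException:
        # Clean up the partial temporary file on failure, in the
        # spirit of .DELETE_ON_ERROR, then re-raise the error.
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
```

A failing produce() leaves any previously built target untouched and
no stray "+" file behind.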

I'm not changing $(REMOTE_CURL_ALIASES) since that uses a ln/ln -s/cp
dance, and would require the addition of "-f" flags if the "rm" at the
start was removed. I've also got plans to fix that ln/ln -s/cp pattern
in another series.

For $(LIB_FILE) and $(XDIFF_LIB) we can rely on the "c" (create) being
present in ARFLAGS.

I'm not changing "$(ETAGS_TARGET)", "tags" and "cscope" because
they've got a messy combination of removing "$@+" not "$@" at the
beginning, or "$@*". I'm also addressing those in another series.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-29 08:03:45 -07:00
Taylor Blau
f89ecf7988 midx: report checksum mismatches during 'verify'
'git multi-pack-index verify' inspects the data in an existing MIDX for
correctness by checking that the recorded object offsets are correct,
and so on.

But it does not check that the file's trailing checksum matches the data
that it records. So, if an on-disk corruption happened to occur in the
final few bytes (and all other data was recorded correctly), we would:

  - get a clean result from 'git multi-pack-index verify', but
  - be unable to reuse the existing MIDX when writing a new one (since
    we now check for checksum mismatches before reusing a MIDX)

Teach the 'verify' sub-command to recognize corruption in the checksum
by calling midx_checksum_valid().

Suggested-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:36:17 -07:00
Taylor Blau
ec1e28ef9c midx: don't reuse corrupt MIDXs when writing
When writing a new multi-pack index, Git tries to reuse as much of the
data from an existing MIDX as possible, like object offsets. This is
done to avoid re-opening a bunch of *.idx files unnecessarily, but can
lead to problems if the data we are reusing is corrupt.

That's because we'll blindly reuse data from an existing MIDX without
checking its trailing checksum for validity. So if there is memory
corruption while writing a MIDX, or disk corruption in the intervening
period between writing and reuse, we'll blindly propagate those bad
values forward.

Suppose we experience a memory corruption while writing a MIDX such that
we write an incorrect object offset (or alternatively, the disk corrupts
the data after being written, but before being reused). Then when we go
to write a new MIDX, we'll reuse the bad object offset without checking
its validity. This means that the MIDX we just wrote is broken, but its
trailing checksum is intact, since we never bothered to look at the
values before writing.

In the above, a "git multi-pack-index verify" would have caught the
problem before writing, but writing a new MIDX wouldn't have noticed
anything wrong, blindly carrying forward the corrupt offset.

Individual pack indexes check their validity by verifying the crc32
attached to each entry when carrying data forward during a repack.
We could solve this problem for MIDXs in the same way, but individual
crc32's don't make much sense, since their entries are so small.
Likewise, checking the whole file on every read may be prohibitively
expensive if a repository has a lot of objects, packs, or both.

But we can check the trailing checksum when reusing an existing MIDX
when writing a new one. And a corrupt MIDX need not stop us from writing
a new one, since we can just avoid reusing the existing one at all and
pretend as if we are writing a new MIDX from scratch.

Suggested-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:36:17 -07:00
Taylor Blau
15316a4732 commit-graph: rewrite to use checksum_valid()
Rewrite an existing caller in `git commit-graph verify` to take
advantage of checksum_valid().

Note that the replacement isn't a verbatim cut-and-paste, since the new
function avoids using hashfile at all and instead talks to the_hash_algo
directly, but it is functionally equivalent.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:36:17 -07:00
Taylor Blau
f9221e2cf5 csum-file: introduce checksum_valid()
Introduce a new function which checks the validity of a file's trailing
checksum. This is similar to hashfd_check(), but different since it is
intended to be used by callers who aren't writing the same data (like
`git index-pack --verify`), but who instead want to validate the
integrity of data that they are reading.
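
The general idea of a trailing checksum can be sketched as follows.
This is a simplified Python illustration over an in-memory buffer, not
the C function itself; the name trailing_checksum_valid() is made up,
and SHA-1 is simply assumed as the hash for the example:

```python
import hashlib

# Assumption for this sketch: the trailer is a raw SHA-1 digest.
HASH_LEN = hashlib.sha1().digest_size  # 20 bytes


def trailing_checksum_valid(data: bytes) -> bool:
    """True if the last HASH_LEN bytes hash everything before them."""
    if len(data) < HASH_LEN:
        return False
    body, trailer = data[:-HASH_LEN], data[-HASH_LEN:]
    return hashlib.sha1(body).digest() == trailer
```

A reader that only wants to validate existing data needs nothing more
than the bytes themselves, which is the difference from a check that
is tied to re-writing the same data.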

Rewrite the first of two callers which could benefit from this new
function in pack-check.c. Subsequent callers will be added in the
following patches.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:36:17 -07:00
Johannes Schindelin
e9f79acb28 ci: upgrade to using actions/{up,down}load-artifacts v2
The GitHub Actions to upload/download workflow artifacts saw a major
upgrade since Git's GitHub workflow was established. Let's use it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:35:40 -07:00
Johannes Schindelin
abb2b389f7 ci (vs-build): use cmd to copy the DLLs, not powershell
We use a `.bat` script to copy the DLLs in the `vs-build` job, and that
type of script is native to CMD, not to PowerShell.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:35:39 -07:00
Johannes Schindelin
0eb6c189a3 ci: use the new GitHub Action to download git-sdk-64-minimal
In our continuous builds, Windows is the odd cookie that requires a
complete development environment to be downloaded because there is no
suitable one installed by default on Windows.

Side note: technically, there _is_ a development environment present in
GitHub Actions' build agents: MSYS2. But it differs from Git for
Windows' SDK in subtle points, unfortunately enough so to prevent Git's
test suite from running without failures.

Traditionally, we support downloading this environment (which we
nicknamed `git-sdk-64-minimal`) via a PowerShell scriptlet that accesses
the build artifacts of a dedicated Azure Pipeline (which packages a tiny
subset of the full Git for Windows SDK, containing just enough to build
Git and run its test suite).

This PowerShell script is unfortunately not very robust and sometimes
fails due to network issues.

Of course, we could add code to detect that situation, wait a little,
try again, if it fails again wait a little longer, lather, rinse and
repeat.

Instead of doing all of this in Git's own `.github/workflows/`, though,
let's offload this logic to the new GitHub Action at
https://github.com/marketplace/actions/setup-git-for-windows-sdk

This Action not only downloads and extracts git-sdk-64-minimal _outside_
the worktree (making it no longer necessary to meddle with
`.gitignore` or `.git/info/exclude`), it also adds the `bash.exe` to the
`PATH` and sets the environment variable `MSYSTEM` (an implementation
detail that Git's workflow should never have needed to know about).

This allows us to convert all those funny PowerShell tasks that wanted
to call git-sdk-64-minimal's `bash.exe`: they all are now regular `bash`
scriptlets.

This finally lets us get rid of the funny quoting and escaping where we
had to pay attention not only to quote and escape the Bash scriptlets
properly, but also to add a second level of escaping (with backslashes
for double quotes and backticks for dollar signs) to stop PowerShell
from doing unintended things.

Further, this Action uses a fast caching strategy native to GitHub
Actions that should accelerate the download across CI runs:
git-sdk-64-minimal is usually updated once per 24h, and needs to be
cached only once within that period. Caching it (unfortunately only on
a per-branch basis) speeds up the download step, and makes it much more
robust at the same time by virtue of accessing a cache location that is
closer in the network topology.

With this we can drop the home-rolled caching where we try to accelerate
the test phase by uploading git-sdk-64-minimal as a workflow artifact
after using it to build Git, and then download it as workflow artifact
in the test phase.

Even better: the `vs-test` job no longer needs to depend on the
`windows-build` job. The only reason it depended on it was to ensure
that the `git-sdk-64-minimal` workflow artifact was available.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:35:39 -07:00
Jeff King
6afb265b96 add_ref_decoration(): rename s/type/deco_type/
Now that we have two types (a decoration type and an object type) in the
function, let's give them both unique names to avoid accidentally using
one instead of the other.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:32:32 -07:00
Jeff King
88473c8bae load_ref_decorations(): avoid parsing non-tag objects
When we load the ref decorations, we parse the object pointed to by each
ref in order to get a "struct object". This is unnecessarily expensive;
we really only need the object struct, and don't even look at the parsed
contents. The exception is tags, which we do need to peel.

We can improve this by looking up the object type first (which is much
cheaper), and skipping the parse entirely for non-tags. This increases
the work slightly for annotated tags (which now do a type lookup _and_ a
parse), but decreases it a lot for other types. On balance, this seems
to be a good tradeoff.
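
In pseudo-Python, the "check the type first, parse only tags" approach
looks roughly like this. The callables type_of and parse stand in for
the cheap type lookup and the expensive parse; they and the function
name are hypothetical, not git's API:

```python
from typing import Callable, Dict, List


def load_decorations(refs: Dict[str, str],
                     type_of: Callable[[str], str],
                     parse: Callable[[str], None]) -> List[str]:
    """Return the ref names whose objects had to be parsed."""
    parsed: List[str] = []
    for name, oid in refs.items():
        # The cheap lookup happens for every ref ...
        if type_of(oid) == "tag":
            # ... but only tags pay for a full parse, since they are
            # the only objects that need peeling.
            parse(oid)
            parsed.append(name)
    return parsed
```

Tags now cost a lookup plus a parse, while every other ref costs only
the lookup, which matches the tradeoff described above.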

In my git.git clone, with ~2k refs, most of which are branches, the time
to run "git log -1 --decorate" drops from 34ms to 11ms. Even on my
linux.git clone, which contains mostly tags and only a handful of
branches, the time drops from 30ms to 19ms. And on a more extreme
real-world case with ~220k refs, mostly non-tags, the time drops from
2.6s to 650ms.

That command is a lop-sided example, of course, because it does as
little non-loading work as possible. But it does show the absolute time
improvement. Even in something like a full "git log --decorate" on that
extreme repo, we'd still be saving 2s of CPU time.

Ideally we could push this even further, and avoid parsing even tags, by
relying on the packed-refs "peel" optimization (which we could do by
calling peel_iterated_oid() instead of peeling manually). But we can't
do that here. The packed-refs file only stores the bottom-layer of the
peel (so in a "tag->tag->commit" chain, it stores only the commit as the
peel result).  But the decoration code wants to peel the layers
individually, annotating the middle layers of the chain.

If the packed-refs file ever learns to store all of the peeled layers,
then we could switch to it. Or even if it stored a flag to indicate the
peel was not multi-layer (because most of them aren't), then we could
use it most of the time and fall back to a manual peel for the rare
cases.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:31:40 -07:00
Jeff King
7463064b28 object.h: add lookup_object_by_type() function
In some cases it's useful for efficiency reasons to get the type of an
object before deciding whether to parse it, but we still want an object
struct. E.g., in reachable.c, bitmaps give us the type, but we just want
to mark flags on each object. Likewise, we may loop over every object
and only parse tags in order to peel them; checking the type first lets
us avoid parsing the non-tags.

But our lookup_blob(), etc, functions make getting an object struct
annoying: we have to call the right function for every type. And we
cannot just use the generic lookup_object(), because it only returns an
already-seen object; it won't allocate a new object struct.

Let's provide a function that dispatches to the correct lookup_*
function based on a run-time type. In fact, reachable.c already has such
a helper, so we'll just make that public.

I did change the return type from "void *" to "struct object *". While
the former is a clever way to avoid casting inside the function, it's
less safe and less informative to people reading the function
declaration.

The next commit will add a new caller.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:30:18 -07:00
Jeff King
542d6abbb4 object.h: expand docstring for lookup_unknown_object()
The lookup_unknown_object() system is not often used and is somewhat
confusing. Let's try to explain it a bit more (which is especially
important as I'm adding a related but slightly different function in the
next commit).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:30:17 -07:00
Jeff King
b2086b5183 log: avoid loading decorations for userformats that don't need it
If no --decorate option is given, we default to auto-decoration. And
when that kicks in, cmd_log_init_finish() will unconditionally load the
decoration refs.

However, if we are using a user-format that does not include "%d" or
"%D", we won't show the decorations at all, so we don't need to load
them. We can detect this case and auto-disable them by adding a new
field to our userformat_want helper. We can do this even when the user
explicitly asked for --decorate, because it can't affect the output at
all.
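
A much simplified way to picture the detection is a scan of the format
string for the two placeholders. The sketch below is an assumption-laden
illustration (format_wants_decorate() is a made-up name, and it treats
every placeholder as "%" plus one character), not how the
userformat_want helper is implemented:

```python
def format_wants_decorate(fmt: str) -> bool:
    """True if the format contains a "%d" or "%D" placeholder."""
    i = 0
    while i < len(fmt):
        if fmt[i] == "%" and i + 1 < len(fmt):
            if fmt[i + 1] in "dD":
                return True
            # Skip the character after "%", so that "%%d" (a literal
            # percent sign followed by "d") and "%ad" do not count.
            i += 2
        else:
            i += 1
    return False
```

For "--format=%H" this returns False, which is the case where loading
the decoration refs can be skipped.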

This patch consistently reduces the time to run "git log -1 --format=%H"
on my git.git clone (with ~2k refs) from 34ms to 7ms. On a much more
extreme real-world repository (with ~220k refs), it goes from 2.5s to
4ms.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:30:17 -07:00
Jeff King
3c7e2e8f0a pretty.h: update and expand docstring for userformat_find_requirements()
The comment only mentions "notes", but there are more fields now (and
I'm about to add another). Let's make it more general, and stick the
struct next to the function to make the list of possibilities obvious.

While we're touching this comment, let's also mention the behavior of
NULL, which some callers rely on (though in the long run, this global is
pretty nasty and probably should get moved into rev_info).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:30:17 -07:00
Kaartic Sivaraam
1cf823d8f0 submodule: remove unnecessary prefix based option logic
Over time, as parts of submodule have been ported from shell to
builtin, many instances of the submodule helper have been added.
Along with them came some unnecessary option passing logic based
on the `prefix` shell variable, which never gets set in their
code flows.

On analysis, the only shell functions which have a valid usage
for the `prefix` shell variable are:

    - cmd_update: which is the only function which sets the variable
      and thus uses it properly

    - cmd_init: which uses the variable via a call from cmd_update

So, remove the unnecessary option parsing logic based on the `prefix`
shell variable.

Signed-off-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:28:19 -07:00
Ævar Arnfjörð Bjarmason
fb20d4b126 pack-objects tests: cover blindspots in stdin handling
Cover blindspots in the testing of stdin handling, including the
"!len" condition added in b5d97e6b0a (pack-objects: run rev-list
equivalent internally., 2006-09-04). The codepath taken with --revs
and read_object_list_from_stdin() acts differently in some of these
common cases, let's test for those.

The "--stdin --revs" test being added here stresses the combination of
--stdin-packs and the revision.c --stdin argument, some of this was
covered in a test added in 339bce27f4 (builtin/pack-objects.c: add
'--stdin-packs' option, 2021-02-22), but let's make sure that
GIT_TEST_DISALLOW_ABBREVIATED_OPTIONS=true keeps erroring out about
--stdin, and it isn't picked up by the revision.c API's handling of
that option.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:27:27 -07:00
Johannes Schindelin
e355307692 config: normalize the path of the system gitconfig
Git for Windows is compiled with a runtime prefix, and that runtime
prefix is typically `C:/Program Files/Git/mingw64`. As we want the
system gitconfig to live in the sibling directory `etc`, we define the
relative path as `../etc/gitconfig`.

However, as reported by Philip Oakley, the output of `git config
--show-origin --system -l` looks rather ugly, as it shows the path as
`file:C:/Program Files/Git/mingw64/../etc/gitconfig`, i.e. with the
`mingw64/../` part.

By normalizing the path, we get a prettier path.
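
The effect of a purely lexical normalization on the path quoted above
can be shown with Python's posixpath module. This only illustrates the
string transformation and is not the helper Git uses:

```python
import posixpath

# The path from the report, with the "mingw64/../" detour.
raw = "C:/Program Files/Git/mingw64/../etc/gitconfig"

# normpath() collapses "dir/.." pairs without touching the filesystem.
normalized = posixpath.normpath(raw)
print(normalized)  # C:/Program Files/Git/etc/gitconfig
```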

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:11:51 -07:00
Dennis Ameling
50101b93ca cmake(windows): set correct path to the system Git config
Currently, when Git for Windows is built with CMake, the system Git config is
expected in a different location than when building via `make`: the former
expects it to be in `<runtime-prefix>/mingw64/etc/gitconfig`, the latter in
`<runtime-prefix>/etc/gitconfig`.

Because of this, things like `git clone` do not work correctly (because cURL is
no longer able to find its certificate bundle that it needs to validate HTTPS
certificates). See the full bug report and discussion here:
https://github.com/git-for-windows/git/issues/3071#issuecomment-789261386.

This commit aligns the CMake-based build by mimicking what is already done in
`config.mak.uname`.

This closes https://github.com/git-for-windows/git/issues/3071.

Signed-off-by: Dennis Ameling <dennis@dennisameling.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:11:50 -07:00
Johannes Schindelin
fb5e3378f8 mingw: move Git for Windows' system config where users expect it
Git for Windows' prefix is `/mingw64/` (or `/mingw32/` for 32-bit
versions), therefore the system config is located at the clunky location
`C:\Program Files\Git\mingw64\etc\gitconfig`.

This moves the system config into a more logical location: the `mingw64`
part of `C:\Program Files\Git\mingw64\etc\gitconfig` never made sense,
as it is a mere implementation detail. Let's skip the `mingw64` part and
move this to `C:\Program Files\Git\etc\gitconfig`.

Side note: in the rare (and not recommended) case a user chooses to
install 32-bit Git for Windows on a 64-bit system, the path will of
course be `C:\Program Files (x86)\Git\etc\gitconfig`.

Background: During the Git for Windows v1.x days, the system config was
located at `C:\Program Files (x86)\Git\etc\gitconfig`. With Git for
Windows v2.x, it moved to `C:\Program Files\Git\mingw64\etc\gitconfig` (or
`C:\Program Files (x86)\Git\mingw32\etc\gitconfig`). Rather than fixing it
back then, we tried to introduce a "Windows-wide" config, but that never
caught on.

Likewise, we move the system `gitattributes` into the same directory.

Obviously, we are cautious to do this only for the known install
locations `/mingw64` and `/mingw32`; if anybody wants to override that
while building their version of Git (e.g. via `make prefix=$HOME`), we
leave the default location of the system config and gitattributes alone.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:11:48 -07:00
Carlo Marcelo Arenas Belón
ebbf5d2b70 config.mak.uname: PCRE1 cleanup
Style issue: a space was missing.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:10:47 -07:00
Beshr Kayali
9fffc38583 Documentation: fix typo in the --patch option of the commit command
Typofix (chose -> choose) in the documentation of the patch option
under the commit command.

Signed-off-by: Beshr Kayali <me@beshr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 13:08:25 -07:00
Johannes Schindelin
9b6e2c8b98 pager: avoid setting COLUMNS when we're guessing its value
We query `TIOCGWINSZ` in Git to determine the correct value for
`COLUMNS`, and then set that environment variable.

If `TIOCGWINSZ` is not available, we fall back to the hard-coded value
80 _and still_ set the environment variable.

On Windows this is a problem. The reason is that Git for
Windows uses a version of `less` that relies on the MSYS2 runtime to
interact with the pseudo terminal (typically inside a MinTTY window,
which is also aware of the MSYS2 runtime). Both MinTTY and `less.exe`
interact with that pseudo terminal via `ioctl()` calls (which the MSYS2
runtime emulates even if there is no such thing on Windows).
Since https://github.com/gwsw/less/commit/bb0ee4e76c2, `less` prefers
the `COLUMNS` variable over asking ncurses itself.

But `git.exe` itself is _not_ aware of the MSYS2 runtime, or for that
matter of that pseudo terminal, and has no way to call `ioctl()` or
`TIOCGWINSZ`.

Therefore, `git.exe` will fall back to hard-coding 80 columns, no matter
what the actual terminal size is.

But `less.exe` is totally able to interact with the MSYS2 runtime and
would not actually require Git's help (which actually makes things
worse here). So let's not override `COLUMNS` on Windows.

Let's just not set `COLUMNS` unless we managed to query the actual value
from the terminal.
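
The intended behavior, sketched in Python under stated assumptions:
query_tty_columns() and export_columns() are hypothetical names, and
the env mapping stands in for the process environment. Only a width
that was really obtained from the terminal gets exported:

```python
import os
from typing import Callable, MutableMapping, Optional


def query_tty_columns(fd: int = 1) -> Optional[int]:
    """Ask the terminal for its width; None if that is not possible."""
    try:
        cols = os.get_terminal_size(fd).columns
    except OSError:
        return None
    return cols if cols > 0 else None


def export_columns(env: MutableMapping[str, str],
                   query: Callable[[], Optional[int]] = query_tty_columns
                   ) -> int:
    """Return the width to use; set COLUMNS only for a queried value."""
    preset = env.get("COLUMNS", "")
    if preset.isdigit() and int(preset) > 0:
        return int(preset)      # an existing COLUMNS wins
    cols = query()
    if cols is None:
        return 80               # a guess for our own use, not exported
    env["COLUMNS"] = str(cols)  # a real measurement, safe to pass on
    return cols
```

When the query fails, a child such as the pager sees no COLUMNS at all
and is free to ask the terminal itself.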

This fixes https://github.com/git-for-windows/git/issues/3235

Co-authored-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 10:42:10 -07:00
Andrei Rybak
98c7656a18 git-worktree.txt: fix typo in example path
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 10:05:15 -07:00
Andrei Rybak
6fc5369263 t: fix typos in test messages
Both in t4258 and in t9001, the code of the tests following shows the
proper name for the configuration variables.  So use the correct names
in the test messages as well.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 10:05:14 -07:00
Andrei Rybak
3fca954172 blame: correct name of config option in docs
As can be seen in files "Documentation/blame-options.txt" and
"builtin/blame.c", the name of this configuration option is
"blame.markUnblamableLines".

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 10:05:13 -07:00
Jonathan Tan
ef830cc434 promisor-remote: teach lazy-fetch in any repo
This is one step towards supporting partial clone submodules.

Even after this patch, we will still lack partial clone submodules
support, primarily because a lot of Git code that accesses submodule
objects does so by adding their object stores as alternates, meaning
that any lazy fetches that would occur in the submodule would be done
based on the config of the superproject, not of the submodule. This also
prevents testing of the functionality in this patch by user-facing
commands. So for now, test this mechanism using a test helper.

Besides that, there is some code that uses the wrapper functions
like has_promisor_remote(). Those will need to be checked to see if they
could support the non-wrapper functions instead (and thus support any
repository, not just the_repository).

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:58:01 -07:00
Jonathan Tan
d1fa94356d run-command: refactor subprocess env preparation
submodule.c has functionality that prepares the environment for running
a subprocess in a new repo. The lazy-fetching code (used in partial
clones) will need this in a subsequent commit, so move it to a more
central location.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:58:01 -07:00
Jonathan Tan
69bb2e1804 submodule: refrain from filtering GIT_CONFIG_COUNT
14111fc492 ("git: submodule honor -c credential.* from command line",
2016-03-01) taught Git to pass through the GIT_CONFIG_PARAMETERS
environment variable when invoking a subprocess on behalf of a
submodule. But when d8d77153ea ("config: allow specifying config entries
via envvar pairs", 2021-01-15) introduced support for GIT_CONFIG_COUNT
(and its associated GIT_CONFIG_KEY_? and GIT_CONFIG_VALUE_?), the
subprocess mechanism wasn't updated to also pass through these
variables.

Since they are conceptually the same (d8d77153ea was written to address
a shortcoming of GIT_CONFIG_PARAMETERS), update the submodule subprocess
mechanism to also pass through GIT_CONFIG_COUNT.
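
Conceptually, the environment preparation amounts to dropping
repository-local variables while letting the config-carrying ones
through. The Python sketch below is illustrative only: the variable
list is an assumed subset chosen for the example, and submodule_env()
is a made-up name rather than the C function:

```python
from typing import Dict, Mapping

# Assumed, illustrative subset of repository-local variables.
LOCAL_REPO_VARS = [
    "GIT_DIR",
    "GIT_WORK_TREE",
    "GIT_INDEX_FILE",
    "GIT_CONFIG_PARAMETERS",
    "GIT_CONFIG_COUNT",
]

# Variables that carry command-line config and must survive.
PASS_THROUGH = {"GIT_CONFIG_PARAMETERS", "GIT_CONFIG_COUNT"}


def submodule_env(parent_env: Mapping[str, str]) -> Dict[str, str]:
    """Copy the environment, minus the filtered repo-local variables."""
    env = dict(parent_env)
    for var in LOCAL_REPO_VARS:
        if var not in PASS_THROUGH:
            env.pop(var, None)
    return env
```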

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:57:54 -07:00
Jonathan Tan
ef7dc2e9cc promisor-remote: support per-repository config
Instead of using global variables to store promisor remote information,
store this config in struct repository instead, and add
repository-agnostic non-static functions corresponding to the existing
non-static functions that only work on the_repository.

The actual lazy-fetching of missing objects currently does not work on
repositories other than the_repository, and will still not work after
this commit, so add a BUG message explaining this. A subsequent commit
will remove this limitation.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:57:42 -07:00
Jonathan Tan
ebaf3bcf1a repository: move global r_f_p_c to repo struct
Move repository_format_partial_clone, which is currently a global
variable, into struct repository. (Full support for per-repository
partial clone config will be done in a subsequent commit - this is split
into its own commit because of the extent of the changes needed.)

The new repo-specific variable cannot be set in
check_repository_format_gently() (as is currently), because that
function does not know which repo it is operating on (or even whether
the value is important); therefore this responsibility is delegated to
the outermost caller that knows. Of all the outermost callers that know
(found by looking at all functions that call clear_repository_format()),
I looked at those that either read from the main Git directory or write
into a struct repository. These callers have been modified accordingly
(write to the_repository in the former case and write to the given
struct repository in the latter case).

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:57:41 -07:00
dorgon.chang
54662d5958 git-p4: fix failed submit by skip non-text data files
If the submit contains binary files, git-p4 throws an exception and
stops the submit when it tries to append the diff line description.

This commit skips non-text data files when a UnicodeDecodeError is
thrown.

Skipping them does not affect the files actually submitted in the
resulting changelist; the diff line description only appears in the
submit template, so that you can review what changed before actually
submitting to p4.

I don't know whether adding a message here would be helpful for users,
so I chose to just skip the binary content, since the filename has
already been appended at that point.
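
The approach can be pictured with the following sketch. It is not the
git-p4 code; describe_new_file() and the exact header text are
assumptions made for illustration:

```python
def describe_new_file(name: str, raw: bytes) -> str:
    """Build a diff-style description for a newly added file."""
    # The filename header is emitted first, for text and binary alike.
    out = "==== new file ====\n--- /dev/null\n+++ %s\n" % name
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Non-text data: keep the header, skip the content lines.
        return out
    for line in text.splitlines():
        out += "+" + line + "\n"
    return out
```

A binary file therefore still shows up by name in the template, just
without any "+" lines.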

Signed-off-by: dorgon.chang <dorgonman@hotmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:49:30 -07:00
Ævar Arnfjörð Bjarmason
d65aea37d9 show-branch tests: add missing tests
Add missing tests for --remotes, --list and --merge-base. These are
not exhaustive, but better than the nothing we have now.

This command, added in f76412ed6d ([PATCH] Add 'git show-branch'.,
2005-08-21), has never been properly tested. There were some tests for
it, namely for the --all option in
t6432-merge-recursive-space-options.sh, and some of --merge-base and
--independent in t6010-merge-base.sh.

This fixes a few more blind spots, but there's still a lot of behavior
that's not tested for.

These new tests show the odd (and possibly unintentional) behavior of
--merge-base with one argument, and how its output is the same as "git
merge-base" with N bases in this particular case. See the test added
in f621a8454d (git-merge-base/git-show-branch --merge-base:
Documentation and test, 2009-08-05) for a case where the two aren't
the same.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:33:06 -07:00
Ævar Arnfjörð Bjarmason
4465690cd8 show-branch: don't <COLOR></RESET> for space characters
Change the colored output introduced in ab07ba2a24 (show-branch: color
the commit status signs, 2009-04-22) to not color and reset each
individual space character we use for padding. The intent is to color
just the "!", "+" etc. characters.

This makes the output easier to test, so let's do that now. The test
would be much more verbose with a color/reset for each space
character. Since the coloring cycles through colors we previously had
a "rainbow of space characters".

In theory this breaks things for anyone who's relying on the exact
colored output of show-branch; in practice I'd think anyone parsing it
isn't actively turning on the colored output.
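
As a rough sketch of the intended output (the real change is in Git's C
code; the function name, the per-column color cycling and the escape
sequences below are assumptions made for illustration), only non-space
status signs get wrapped in a color and a reset:

```python
RESET = "\033[m"  # assumed reset sequence, for illustration only


def color_status(signs, colors):
    """Color each non-space status sign; leave padding spaces bare."""
    out = []
    for i, ch in enumerate(signs):
        if ch == " ":
            # Padding: emit the space as-is, with no color or reset.
            out.append(ch)
        else:
            # Cycle through the palette by column index.
            out.append(colors[i % len(colors)] + ch + RESET)
    return "".join(out)
```

For a row such as " + " this yields one colored "+" surrounded by two
plain spaces, rather than three color/reset pairs.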

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:33:06 -07:00
Ævar Arnfjörð Bjarmason
2f61b3eef3 mktag tests: test fast-export
Pass the bad tags we've created in the mktag tests through
fast-export. It will die on the bad object or ref; let's make sure
that happens.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:30:41 -07:00
Ævar Arnfjörð Bjarmason
b48015b340 mktag tests: test for-each-ref
Add a "for-each-ref" for all the mktag tests. This test would have
caught the segfault which was fixed in c685450880 (ref-filter: fix
NULL check for parse object failure, 2021-04-01). Let's make sure we
test that code more exhaustively.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:30:41 -07:00
Ævar Arnfjörð Bjarmason
eddc1f556c mktag tests: test update-ref and reachable fsck
Extend the mktag tests to pass the created bad tag through update-ref
and fsck.

The reason for passing it through update-ref is to guard against it
having a segfault as for-each-ref did before c685450880 (ref-filter:
fix NULL check for parse object failure, 2021-04-01).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:30:41 -07:00
Ævar Arnfjörð Bjarmason
47c0cb1a5d mktag tests: test hash-object --literally and unreachable fsck
Extend the mktag tests to pass the tag we've created through both
hash-object --literally and fsck.

This checks that fsck itself will not complain about certain invalid
content if a reachable tip isn't involved. Due to how fsck works and
walks the graph the failure will be different if the object is
reachable, so we might succeed before we've created the ref.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:30:41 -07:00
Elijah Newren
2bff554b23 merge-ort: add prefetching for content merges
Commit 7fbbcb21b1 ("diff: batch fetching of missing blobs", 2019-04-05)
introduced batching of fetching missing blobs, so that the diff
machinery would have one fetch subprocess grab N blobs instead of N
processes each grabbing 1.

However, the diff machinery is not the only thing in a merge that needs
to work on blobs.  The 3-way content merges need them as well.  Rather
than download all the blobs 1 at a time, prefetch all the blobs needed
for regular content merges.

This does not cover all possible paths in merge-ort that might need to
download blobs.  Others include:
  - The blob_unchanged() calls to avoid modify/delete conflicts (when
    blob renormalization results in an "unchanged" file)
  - Preliminary content merges needed for rename/add and
    rename/rename(2to1) style conflicts.  (Both of these types of
    conflicts can result in nested conflict markers from the need to do
    two levels of content merging; the first happens before our new
    prefetch_for_content_merges() function.)

The first of these wouldn't be an extreme amount of work to support, and
even the second could be theoretically supported in batching, but all of
these cases seem unusual to me, and this is a minor performance
optimization anyway; in the worst case we only get some of the fetches
batched and have a few additional one-off fetches.  So for now, just
handle the regular 3-way content merges in our prefetching.

For the testcase from the previous commit, the number of downloaded
objects remains at 63, but this drops the number of fetches needed from
32 down to 20, a sizeable reduction.
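
To illustrate why batching lowers the fetch count while the object
count stays the same, here is a toy model. It is not merge-ort code:
CountingRemote and both helper functions are hypothetical stand-ins for
a promisor remote and its callers.

```python
class CountingRemote:
    """Stand-in for a promisor remote that counts fetch requests."""

    def __init__(self):
        self.fetches = 0  # number of fetch requests issued
        self.objects = 0  # total number of objects requested

    def fetch(self, oids):
        oids = list(oids)
        if oids:
            self.fetches += 1
            self.objects += len(oids)


def fetch_one_at_a_time(remote, missing):
    # N requests, each grabbing a single blob.
    for oid in missing:
        remote.fetch([oid])


def prefetch_batched(remote, missing):
    # One request grabbing all N blobs.
    remote.fetch(missing)
```

Both strategies request the same objects; only the number of round
trips differs.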

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 07:58:25 -07:00
Elijah Newren
1aedd03afb diffcore-rename: use a different prefetch for basename comparisons
merge-ort was designed to minimize the amount of data needed and used,
and several changes were made to diffcore-rename to take advantage of
extra metadata to enable this data minimization (particularly the
relevant_sources variable for skipping "irrelevant" renames).  This
effort obviously succeeded in drastically reducing computation times,
but should also theoretically allow partial clones to download much less
information.  Previously, though, the "prefetch" command used in
diffcore-rename had never been modified and downloaded many blobs that
were unnecessary for merge-ort.  This commit corrects that.

When doing basename comparisons, we want to fetch only the objects that
will be used for basename comparisons.  If after basename fetching this
leaves us with no more relevant sources (or no more destinations), then
we won't need to do the full inexact rename detection and can skip
downloading additional source and destination files.  Even if we have to
do that later full inexact rename detection, irrelevant sources are
culled after basename matching and before the full inexact rename
detection, so we can still avoid downloading the blobs for irrelevant
sources.  Rename prefetch() to inexact_prefetch(), and introduce a
new basename_prefetch() to take advantage of this.
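
A simplified model of the two prefetch phases follows. It only mirrors
the idea described above; the real functions operate on diffcore-rename
data structures in C. The signatures, the pairing of files by unique
basename, and the "relevant" and "matched" sets are all assumptions
made for this sketch.

```python
import os
from collections import defaultdict


def unique_by_basename(paths):
    """Map basename -> path, keeping only basenames that occur once."""
    groups = defaultdict(list)
    for p in paths:
        groups[os.path.basename(p)].append(p)
    return {b: ps[0] for b, ps in groups.items() if len(ps) == 1}


def basename_prefetch(sources, dests, relevant):
    """Paths whose blobs the basename comparison step would need."""
    src = unique_by_basename(s for s in sources if s in relevant)
    dst = unique_by_basename(dests)
    want = []
    for base in sorted(src.keys() & dst.keys()):
        want += [src[base], dst[base]]
    return want


def inexact_prefetch(sources, dests, relevant, matched):
    """Paths still needed for full inexact rename detection."""
    # Irrelevant sources and already-matched paths are never fetched.
    left_src = [s for s in sources if s in relevant and s not in matched]
    left_dst = [d for d in dests if d not in matched]
    return left_src + left_dst
```

If basename matching leaves no relevant sources or no destinations, the
second phase returns nothing to fetch for that side.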

If we modify the testcase from commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2021-01-23)
to pass
    --sparse --filter=blob:none
to the clone command, and use the new trace2 "fetch_count" output from
a few commits ago to track both the number of fetch subcommands invoked
and the number of objects fetched across all those fetches, then for
the mega-renames testcase we observe the following:

BEFORE this commit, rebasing 35 patches:
    strategy     # of fetches    total # of objects fetched
    ---------    ------------    --------------------------
    recursive    62              11423
    ort          30              11391

AFTER this commit, rebasing the same 35 patches:
    ort          32                 63

This means that the new code needs to download fewer than 2 blobs
per patch being rebased.  That is especially interesting given that the
repository initially had only approximately half a dozen TOTAL blobs
downloaded (because the default sparse-checkout of just the toplevel
directory was in use).

So, for this particular Linux kernel testcase that involved ~26,000
renames on the upstream side (drivers/ -> pilots/) across which 35
patches were being rebased, this change reduces the number of blobs that
need to be downloaded by a factor of ~180.
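
The figures quoted above can be checked with a few lines of arithmetic;
the variable names are only for illustration, and the numbers are taken
from the table above.

```python
patches = 35
objects_before = 11391  # ort, before this commit
objects_after = 63      # ort, after this commit

# Blobs downloaded per rebased patch after the change.
per_patch = objects_after / patches

# Reduction factor in downloaded objects for the ort strategy.
reduction = objects_before / objects_after

print(f"{per_patch:.1f} blobs per patch, ~{reduction:.0f}x fewer objects")
```

This gives 1.8 blobs per patch and a reduction factor of roughly 181,
consistent with "fewer than 2" and "~180".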

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 07:58:25 -07:00
Elijah Newren
d331dd3b0c diffcore-rename: allow different missing_object_cb functions
estimate_similarity() was setting up a diff_populate_filespec_options
every time it was called, requiring the caller of estimate_similarity()
to pass in some data needed to set up this option.  Currently the needed
data consisted of a single variable (skip_unmodified), but we want to
also have the different estimate_similarity() callsites start using
different missing_object_cb functions as well.  Rather than also passing
that data in, just have the caller pass in the whole
diff_populate_filespec_options, and reduce the number of times we need to
set it up.

As a side note, this also drops the number of calls to
has_promisor_remote() dramatically.  If L is the number of basename
paths to compare, M is the number of inexact sources, and N is the
number of inexact destinations, then the number of calls to
has_promisor_remote() drops from L+M*N down to at most 2 -- one for each
of the sites that calls estimate_similarity().  has_promisor_remote() is
a very fast function so this almost certainly has no measurable
performance impact, but it seems cleaner to avoid calling that function
so many times.
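
The call-count claim can be illustrated with a toy counter. This is not
the diffcore-rename code: CallCounter and the two driver functions are
hypothetical, and the loops merely stand in for the
estimate_similarity() call sites.

```python
class CallCounter:
    """Counts how often the (simulated) promisor check is made."""

    def __init__(self):
        self.calls = 0

    def has_promisor_remote(self):
        self.calls += 1
        return True


def setup_per_call(counter, L, M, N):
    # Old shape: options are set up inside every similarity estimate.
    for _ in range(L):            # basename comparisons
        counter.has_promisor_remote()
    for _ in range(M):            # inexact sources
        for _ in range(N):        # inexact destinations
            counter.has_promisor_remote()
    return counter.calls


def setup_per_site(counter, L, M, N):
    # New shape: each call site sets up the options once and reuses them.
    counter.has_promisor_remote()  # basename call site
    for _ in range(L):
        pass                       # estimate with prepared options
    counter.has_promisor_remote()  # inexact call site
    for _ in range(M):
        for _ in range(N):
            pass                   # estimate with prepared options
    return counter.calls
```

For example, with L=3, M=4 and N=5 the old shape makes 3 + 4*5 = 23
calls while the new shape makes 2.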

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 07:58:25 -07:00
Elijah Newren
c75c423952 t6421: add tests checking for excessive object downloads during merge
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 07:58:25 -07:00
Jiang Xin
9146ef75dd l10n: fixed tripple-letter typos
Andrei found that the word "shallow" has an extra letter "l" in
"po/zh_CN.po". There are similar typos in other l10n files.

Reported-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2021-06-21 09:46:18 +08:00
Felipe Contreras
a7d18a1109 pull: trivial whitespace style fix
Two spaces unaligned to anything is not part of the coding-style. A
single tab is.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-19 16:36:17 +09:00
Felipe Contreras
a751e0296f pull: trivial cleanup
There's no need to store ran_ff. Now it's obvious from the conditionals.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-19 16:36:17 +09:00
Felipe Contreras
340062243a pull: cleanup autostash check
Currently "git pull --rebase" takes a shortcut in the case a
fast-forward merge is possible; run_merge() is called with --ff-only.

However, "git merge" didn't have an --autostash option, so, when "git
pull --rebase --autostash" was called *and* the fast-forward merge
shortcut was taken, then the pull failed.

This was fixed in commit f15e7cf5cc (pull: ff --rebase --autostash
works in dirty repo, 2017-06-01) by simply skipping the fast-forward
merge shortcut.

Later on "git merge" learned the --autostash option [a03b55530a
(merge: teach --autostash option, 2020-04-07)], and so did "git pull"
[d9f15d37f1 (pull: pass --autostash to merge, 2020-04-07)].

Therefore it's not necessary to skip the fast-forward merge shortcut
anymore when called with --rebase --autostash.

Let's always take the fast-forward merge shortcut by essentially
reverting f15e7cf5cc.
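
Expressed as a condition, the change looks roughly like this. The
function names are hypothetical and the real logic lives in Git's C
implementation of "git pull"; this only models the decision described
above.

```python
def use_ff_shortcut_old(can_ff, autostash):
    # After f15e7cf5cc: skip the shortcut whenever --autostash is given,
    # because "git merge" could not handle --autostash at the time.
    return can_ff and not autostash


def use_ff_shortcut_new(can_ff, autostash):
    # "git merge" now understands --autostash, so the shortcut is taken
    # whenever a fast-forward is possible.
    return can_ff
```

The two versions differ only for "git pull --rebase --autostash" when a
fast-forward is possible.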

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-19 16:36:16 +09:00
Fabian Wermelinger
cea232194d completion: bash: fix late declaration of __git_cmd_idx
A recent update to contrib/completion/git-completion.bash causes bash to fail
to auto-complete custom commands that are wrapped with __git_func_wrap. Declaring
__git_cmd_idx=0 inside __git_func_wrap resolves the issue.

Signed-off-by: Fabian Wermelinger <fabianw@mavt.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-19 15:30:35 +09:00
Jeff King
7c0afdf23c t: use portable wrapper for readlink(1)
Not all systems have a readlink program available for use by the shell.
This causes t3210 to fail on at least AIX. Let's provide a perl
one-liner to do the same thing, and use it there.

I also updated calls in t9802. Nobody reported failure there, but it's
the same issue. Presumably nobody actually tests with p4 on AIX in the
first place (if it is even available there).

I left the use of readlink in the "--valgrind" setup in test-lib.sh, as
valgrind isn't available on exotic platforms anyway (and I didn't want
to increase dependencies between test-lib.sh and test-lib-functions.sh).

There's one other curious case. Commit d2addc3b96 (t7800: readlink may
not be available, 2016-05-31) fixed a similar case. We can't use our
wrapper function there, though, as it's inside a sub-script triggered by
Git. It uses a slightly different technique ("ls" piped to "sed"). I
chose not to use that here as it gives confusing "ls -l" output if the
file is unexpectedly not a symlink (which is OK for its limited use, but
potentially confusing for general use within the test suite). The perl
version emits the empty string.
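
The behavior described for the wrapper (print the link target, or the
empty string when the path is not a symlink) can be mirrored in Python
as below. The actual wrapper is a perl one-liner in the test suite; the
function name read_link is hypothetical and this sketch assumes a POSIX
system.

```python
import os


def read_link(path):
    """Return the symlink target, or "" if path is not a symlink."""
    try:
        return os.readlink(path)
    except OSError:
        # Not a symlink (or missing): report nothing rather than
        # emitting confusing output.
        return ""
```

Returning an empty string keeps a failed lookup easy to spot in a test
comparison.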

Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-19 15:26:05 +09:00