* maint-2.30: (23 commits)
Git 2.30.9
gettext: avoid using gettext if the locale dir is not present
apply --reject: overwrite existing `.rej` symlink if it exists
http.c: clear the 'finished' member once we are done with it
clone.c: avoid "exceeds maximum object size" error with GCC v12.x
range-diff: use ssize_t for parsed "len" in read_patches()
range-diff: handle unterminated lines in read_patches()
range-diff: drop useless "offset" variable from read_patches()
t5604: GETTEXT_POISON fix, conclusion
t5604: GETTEXT_POISON fix, part 1
t5619: GETTEXT_POISON fix
t0003: GETTEXT_POISON fix, conclusion
t0003: GETTEXT_POISON fix, part 1
t0033: GETTEXT_POISON fix
http: support CURLOPT_PROTOCOLS_STR
http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
ci: install python on ubuntu
ci: use the same version of p4 on both Linux and macOS
ci: remove the pipe after "p4 -V" to catch errors
github-actions: run gcc-8 on ubuntu-20.04 image
...
Avoids issues with renaming or deleting sections with long lines, where
configuration values may be interpreted as sections, leading to
configuration injection. Addresses CVE-2023-29007.
* tb/config-copy-or-rename-in-file-injection:
config.c: disallow overly-long lines in `copy_or_rename_section_in_file()`
config.c: avoid integer truncation in `copy_or_rename_section_in_file()`
config: avoid fixed-sized buffer when renaming/deleting a section
t1300: demonstrate failure when renaming sections with long lines
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Avoids the overhead of calling `gettext` when initialization of the
translated messages was skipped. Addresses CVE-2023-25815.
* avoid-using-uninitialized-gettext: (1 commit)
gettext: avoid using gettext if the locale dir is not present
Address CVE-2023-25652 by deleting any existing `.rej` symbolic links
instead of following them.
* js/apply-overwrite-rej-symlink-if-exists:
apply --reject: overwrite existing `.rej` symlink if it exists
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
As a defense-in-depth measure to guard against any potentially-unknown
buffer overflows in `copy_or_rename_section_in_file()`, refuse to work
with overly-long lines in a gitconfig.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
There are a couple of spots within `copy_or_rename_section_in_file()`
that incorrectly use an `int` to track an offset within a string, which
may truncate or wrap around to a negative value.
Historically it was impossible to have a line longer than 1024 bytes
anyway, since we used fgets() with a fixed-size buffer of exactly that
length. But the recent change to use a strbuf permits us to read lines
of arbitrary length, so it's possible for a malicious input to cause us
to overflow past INT_MAX and do an out-of-bounds array read.
Practically speaking, however, this should never happen, since it
requires 2GB section names or values, which are unrealistic in
non-malicious circumstances.
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When renaming (or deleting) a section of configuration, Git uses the
function `git_config_copy_or_rename_section_in_file()` to rewrite the
configuration file after applying the rename or deletion to the given
section.
To do this, Git repeatedly calls `fgets()` to read the existing
configuration data into a fixed size buffer.
When the configuration value under `old_name` exceeds the size of the
buffer, we will call `fgets()` an additional time even if there is no
newline in the configuration file, since our read length is capped at
`sizeof(buf)`.
If the first character of the buffer (after zero or more characters
satisfying `isspace()`) is a '[', Git will incorrectly treat it as
beginning a new section when the original section is being removed. In
other words, a configuration value satisfying this criteria can
incorrectly be considered as a new secftion instead of a variable in the
original section.
Avoid this issue by using a variable-width buffer in the form of a
strbuf rather than a fixed-with region on the stack. A couple of small
points worth noting:
- Using a strbuf will cause us to allocate arbitrary sizes to match
the length of each line. In practice, we don't expect any
reasonable configuration files to have lines that long, and a
bandaid will be introduced in a later patch to ensure that this is
the case.
- We are using strbuf_getwholeline() here instead of strbuf_getline()
in order to match `fgets()`'s behavior of leaving the trailing LF
character on the buffer (as well as a trailing NUL).
This could be changed later, but using strbuf_getwholeline() changes
the least about this function's implementation, so it is picked as
the safest path.
- It is temping to want to replace the loop to skip over characters
matching isspace() at the beginning of the buffer with a convenience
function like `strbuf_ltrim()`. But this is the wrong approach for a
couple of reasons:
First, it involves a potentially large and expensive `memmove()`
which we would like to avoid. Second, and more importantly, we also
*do* want to preserve those spaces to avoid changing the output of
other sections.
In all, this patch is a minimal replacement of the fixed-width buffer in
`git_config_copy_or_rename_section_in_file()` to instead use a `struct
strbuf`.
Reported-by: André Baptista <andre@ethiack.com>
Reported-by: Vítor Pinho <vitor@ethiack.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Co-authored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In cc5e1bf992 (gettext: avoid initialization if the locale dir is not
present, 2018-04-21) Git was taught to avoid a costly gettext start-up
when there are not even any localized messages to work with.
But we still called `gettext()` and `ngettext()` functions.
Which caused a problem in Git for Windows when the libgettext that is
consumed from the MSYS2 project stopped using a runtime prefix in
https://github.com/msys2/MINGW-packages/pull/10461
Due to that change, we now use an unintialized gettext machinery that
might get auto-initialized _using an unintended locale directory_:
`C:\mingw64\share\locale`.
Let's record the fact when the gettext initialization was skipped, and
skip calling the gettext functions accordingly.
This addresses CVE-2023-25815.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When renaming a configuration section which has an entry whose length
exceeds the size of our buffer in config.c's implementation of
`git_config_copy_or_rename_section_in_file()`, Git will incorrectly
form a new configuration section with part of the data in the section
being removed.
In this instance, our first configuration file looks something like:
[b]
c = d <spaces> [a] e = f
[a]
g = h
Here, we have two configuration values, "b.c", and "a.g". The value "[a]
e = f" belongs to the configuration value "b.c", and does not form its
own section.
However, when renaming the section 'a' to 'xyz', Git will write back
"[xyz]\ne = f", but "[xyz]" is still attached to the value of "b.c",
which is why "e = f" on its own line becomes a new entry called "b.e".
A slightly different example embeds the section being renamed within
another section.
Demonstrate this failure in a test in t1300, which we will fix in the
following commit.
Co-authored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The `git apply --reject` is expected to write out `.rej` files in case
one or more hunks fail to apply cleanly. Historically, the command
overwrites any existing `.rej` files. The idea being that
apply/reject/edit cycles are relatively common, and the generated `.rej`
files are not considered precious.
But the command does not overwrite existing `.rej` symbolic links, and
instead follows them. This is unsafe because the same patch could
potentially create such a symbolic link and point at arbitrary paths
outside the current worktree, and `git apply` would write the contents
of the `.rej` file into that location.
Therefore, let's make sure that any existing `.rej` file or symbolic
link is removed before writing it.
Reported-by: RyotaK <ryotak.mail@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The `maint-2.30` branch accumulated quite a few fixes over the past two
years. Most of those fixes were originally based on newer versions, and
while the patches cherry-picked cleanly, we weren't diligent enough to
pay attention to the CI builds and the GETTEXT_POISON job regressed.
This topic branch fixes that.
* js/gettext-poison-fixes
t0033: GETTEXT_POISON fix
t0003: GETTEXT_POISON fix, part 1
t0003: GETTEXT_POISON fix, conclusion
t5619: GETTEXT_POISON fix
t5604: GETTEXT_POISON fix, part 1
t5604: GETTEXT_POISON fix, conclusion
Update the version of Ubuntu used for GitHub Actions CI from 18.04
to 22.04.
* ds/github-actions-use-newer-ubuntu:
ci: update 'static-analysis' to Ubuntu 22.04
GitHub Actions scheduled a brownout of Ubuntu 18.04, which canceled all
runs of the 'static-analysis' job in our CI runs. Update to 22.04 to
avoid this as the brownout later turns into a complete deprecation.
The use of 18.04 was set in d051ed77ee (.github/workflows/main.yml: run
static-analysis on bionic, 2021-02-08) due to the lack of Coccinelle
being available on 20.04 (which continues today).
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git range-diff" code clean-up. Needed to pacify modern GCC versions.
* jk/range-diff-fixes:
range-diff: use ssize_t for parsed "len" in read_patches()
range-diff: handle unterminated lines in read_patches()
range-diff: drop useless "offset" variable from read_patches()
Deal with a few deprecation warning from cURL library.
* jk/curl-avoid-deprecated-api:
http: support CURLOPT_PROTOCOLS_STR
http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
Adjust the GitHub CI to newer ubuntu release.
* jx/ci-ubuntu-fix:
github-actions: run gcc-8 on ubuntu-20.04 image
ci: install python on ubuntu
ci: use the same version of p4 on both Linux and macOS
ci: remove the pipe after "p4 -V" to catch errors
Meant to go with js/ci-gcc-12-fixes.
source: <xmqq7d68ytj8.fsf_-_@gitster.g>
* jc/http-clear-finished-pointer:
http.c: clear the 'finished' member once we are done with it
Fixes real problems noticed by gcc 12 and works around false
positives.
* js/ci-gcc-12-fixes:
nedmalloc: avoid new compile error
compat/win32/syslog: fix use-after-realloc
In http.c, the run_active_slot() function allows the given "slot" to
make progress by calling step_active_slots() in a loop repeatedly,
and the loop is not left until the request held in the slot
completes.
Ages ago, we used to use the slot->in_use member to get out of the
loop, which misbehaved when the request in "slot" completes (at
which time, the result of the request is copied away from the slot,
and the in_use member is cleared, making the slot ready to be
reused), and the "slot" gets reused to service a different request
(at which time, the "slot" becomes in_use again, even though it is
for a different request). The loop terminating condition mistakenly
thought that the original request has yet to be completed.
Today's code, after baa7b67d (HTTP slot reuse fixes, 2006-03-10)
fixed this issue, uses a separate "slot->finished" member that is
set in run_active_slot() to point to an on-stack variable, and the
code that completes the request in finish_active_slot() clears the
on-stack variable via the pointer to signal that the particular
request held by the slot has completed. It also clears the in_use
member (as before that fix), so that the slot itself can safely be
reused for an unrelated request.
One thing that is not quite clean in this arrangement is that,
unless the slot gets reused, at which point the finished member is
reset to NULL, the member keeps the value of &finished, which
becomes a dangling pointer into the stack when run_active_slot()
returns. Clear the finished member before the control leaves the
function, which has a side effect of unconfusing compilers like
recent GCC 12 that is over-eager to warn against such an assignment.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Technically, the pointer difference `end - start` _could_ be negative,
and when cast to an (unsigned) `size_t` that would cause problems. In
this instance, the symptom is:
dir.c: In function 'git_url_basename':
dir.c:3087:13: error: 'memchr' specified bound [9223372036854775808, 0]
exceeds maximum object size 9223372036854775807
[-Werror=stringop-overread]
CC ewah/bitmap.o
3087 | if (memchr(start, '/', end - start) == NULL
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
While it is a bit far-fetched to think that `end` (which is defined as
`repo + strlen(repo)`) and `start` (which starts at `repo` and never
steps beyond the NUL terminator) could result in such a negative
difference, GCC has no way of knowing that.
See also https://gcc.gnu.org/bugzilla//show_bug.cgi?id=85783.
Let's just add a safety check, primarily for GCC's benefit.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In fade728df1 (apply: fix writing behind newly created symbolic links,
2023-02-02), we backported a patch onto v2.30.* that was originally
based on a much newer version. The v2.30.* release train still has the
GETTEXT_POISON CI job, though, and hence needs `test_i18n*` in its
tests.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In bffc762f87 (dir-iterator: prevent top-level symlinks without
FOLLOW_SYMLINKS, 2023-01-24), we backported a patch onto v2.30.* that
was originally based on a much newer version. The v2.30.* release train
still has the GETTEXT_POISON CI job, though, and hence needs
`test_i18n*` in its tests.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In cf8f6ce02a (clone: delay picking a transport until after
get_repo_path(), 2023-01-24), we backported a patch onto v2.30.* that
was originally based on a much newer version. The v2.30.* release train
still has the GETTEXT_POISON CI job, though, and hence needs
`test_i18n*` in its tests.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
As we iterate through the buffer containing git-log output, parsing
lines, we use an "int" to store the size of an individual line. This
should be a size_t, as we have no guarantee that there is not a
malicious 2GB+ commit-message line in the output.
Overflowing this integer probably doesn't do anything _too_ terrible. We
are not using the value to size a buffer, so the worst case is probably
an out-of-bounds read from before the array. But it's easy enough to
fix.
Note that we have to use ssize_t here, since we also store the length
result from parse_git_diff_header(), which may return a negative value
for error. That function actually returns an int itself, which has a
similar overflow problem, but I'll leave that for another day. Much
of the apply.c code uses ints and should be converted as a whole; in the
meantime, a negative return from parse_git_diff_header() will be
interpreted as an error, and we'll bail (so we can't handle such a case,
but given that it's likely to be malicious anyway, the important thing
is we don't have any memory errors).
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 3c50032ff5 (attr: ignore overly large gitattributes files,
2022-12-01), we backported a patch onto v2.30.* that was originally
based on a much newer version. The v2.30.* release train still has the
GETTEXT_POISON CI job, though, and hence needs `test_i18n*` in its
tests.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When parsing our buffer of output from git-log, we have a
find_end_of_line() helper that finds the next newline, and gives us the
number of bytes to move past it, or the size of the whole remaining
buffer if there is no newline.
But trying to handle both those cases leads to some oddities:
- we try to overwrite the newline with NUL in the caller, by writing
over line[len-1]. This is at best redundant, since the helper will
already have done so if it saw a newline. But if it didn't see a
newline, it's actively wrong; we'll overwrite the byte at the end of
the (unterminated) line.
We could solve this just dropping the extra NUL assignment in the
caller and just letting the helper do the right thing. But...
- if we see a "diff --git" line, we'll restore the newline on top of
the NUL byte, so we can pass the string to parse_git_diff_header().
But if there was no newline in the first place, we can't do this.
There's no place to put it (the current code writes a newline
over whatever byte we obliterated earlier). The best we can do is
feed the complete remainder of the buffer to the function (which is,
in fact, a string, by virtue of being a strbuf).
To solve this, the caller needs to know whether we actually found a
newline or not. We could modify find_end_of_line() to return that
information, but we can further observe that it has only one caller.
So let's just inline it in that caller.
Nobody seems to have noticed this case, probably because git-log would
never produce input that doesn't end with a newline. Arguably we could
just return an error as soon as we see that the output does not end in a
newline. But the code to do so actually ends up _longer_, mostly because
of the cleanup we have to do in handling the error.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In dfa6b32b5e (attr: ignore attribute lines exceeding 2048 bytes,
2022-12-01), we backported a patch onto v2.30.* that was originally
based on a much newer version. The v2.30.* release train still has the
GETTEXT_POISON CI job, though, and hence needs `test_i18n*` in its
tests.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In e47363e5a8 (t0033: add tests for safe.directory, 2022-04-13), we
backported a patch onto v2.30.* that was originally based on a much
newer version. The v2.30.* release train still has the GETTEXT_POISON
CI job, though, and hence needs `test_i18n*` in its tests.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The CURLOPT_PROTOCOLS (and matching CURLOPT_REDIR_PROTOCOLS) flag was
deprecated in curl 7.85.0, and using it generate compiler warnings as of
curl 7.87.0. The path forward is to use CURLOPT_PROTOCOLS_STR, but we
can't just do so unilaterally, as it was only introduced less than a
year ago in 7.85.0.
Until that version becomes ubiquitous, we have to either disable the
deprecation warning or conditionally use the "STR" variant on newer
versions of libcurl. This patch switches to the new variant, which is
nice for two reasons:
- we don't have to worry that silencing curl's deprecation warnings
might cause us to miss other more useful ones
- we'd eventually want to move to the new variant anyway, so this gets
us set up (albeit with some extra ugly boilerplate for the
conditional)
There are a lot of ways to split up the two cases. One way would be to
abstract the storage type (strbuf versus a long), how to append
(strbuf_addstr vs bitwise OR), how to initialize, which CURLOPT to use,
and so on. But the resulting code looks pretty magical:
GIT_CURL_PROTOCOL_TYPE allowed = GIT_CURL_PROTOCOL_TYPE_INIT;
if (...http is allowed...)
GIT_CURL_PROTOCOL_APPEND(&allowed, "http", CURLOPT_HTTP);
and you end up with more "#define GIT_CURL_PROTOCOL_TYPE" macros than
actual code.
On the other end of the spectrum, we could just implement two separate
functions, one that handles a string list and one that handles bits. But
then we end up repeating our list of protocols (http, https, ftp, ftp).
This patch takes the middle ground. The run-time code is always there to
handle both types, and we just choose which one to feed to curl.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Python is missing from the default ubuntu-22.04 runner image, which
prevents git-p4 from working. To install python on ubuntu, we need
to provide the correct package names:
* On Ubuntu 18.04 (bionic), "/usr/bin/python2" is provided by the
"python" package, and "/usr/bin/python3" is provided by the "python3"
package.
* On Ubuntu 20.04 (focal) and above, "/usr/bin/python2" is provided by
the "python2" package which has a different name from bionic, and
"/usr/bin/python3" is provided by "python3".
Since the "ubuntu-latest" runner image has a higher version, its
safe to use "python2" or "python3" package name.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "offset" variable was was introduced in 44b67cb62b (range-diff:
split lines manually, 2019-07-11), but it has never done anything
useful. We use it to count up the number of bytes we've consumed, but we
never look at the result. It was probably copied accidentally from an
almost-identical loop in apply.c:find_header() (and the point of that
commit was to make use of the parse_git_diff_header() function which
underlies both).
Because the variable was set but not used, most compilers didn't seem to
notice, but the upcoming clang-14 does complain about it, via its
-Wunused-but-set-variable warning.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The IOCTLFUNCTION option has been deprecated, and generates a compiler
warning in recent versions of curl. We can switch to using SEEKFUNCTION
instead. It was added in 2008 via curl 7.18.0; our INSTALL file already
indicates we require at least curl 7.19.4.
But there's one catch: curl says we should use CURL_SEEKFUNC_{OK,FAIL},
and those didn't arrive until 7.19.5. One workaround would be to use a
bare 0/1 here (or define our own macros). But let's just bump the
minimum required version to 7.19.5. That version is only a minor version
bump from our existing requirement, and is only a 2 month time bump for
versions that are almost 13 years old. So it's not likely that anybody
cares about the distinction.
Switching means we have to rewrite the ioctl functions into seek
functions. In some ways they are simpler (seeking is the only
operation), but in some ways more complex (the ioctl allowed only a full
rewind, but now we can seek to arbitrary offsets).
Curl will only ever use SEEK_SET (per their documentation), so I didn't
bother implementing anything else, since it would naturally be
completely untested. This seems unlikely to change, but I added an
assertion just in case.
Likewise, I doubt curl will ever try to seek outside of the buffer sizes
we've told it, but I erred on the defensive side here, rather than do an
out-of-bounds read.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The two options do exactly the same thing, but the latter has been
deprecated and in recent versions of curl may produce a compiler
warning. Since the UPLOAD form is available everywhere (it was
introduced in the year 2000 by curl 7.1), we can just switch to it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
GCC v12.x complains thusly:
compat/nedmalloc/nedmalloc.c: In function 'DestroyCaches':
compat/nedmalloc/nedmalloc.c:326:12: error: the comparison will always
evaluate as 'true' for the address of 'caches'
will never be NULL [-Werror=address]
326 | if(p->caches)
| ^
compat/nedmalloc/nedmalloc.c:196:22: note: 'caches' declared here
196 | threadcache *caches[THREADCACHEMAXCACHES];
| ^~~~~~
... and it is correct, of course.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There would be a segmentation fault when running p4 v16.2 on ubuntu
22.04 which is the latest version of ubuntu runner image for github
actions.
By checking each version from [1], p4d version 21.1 and above can work
properly on ubuntu 22.04. But version 22.x will break some p4 test
cases. So p4 version 21.x is exactly the version we can use.
With this update, the versions of p4 for Linux and macOS happen to be
the same. So we can add the version number directly into the "P4WHENCE"
variable, and reuse it in p4 installation for macOS.
By removing the "LINUX_P4_VERSION" variable from "ci/lib.sh", the
comment left above has nothing to do with p4, but still applies to
git-lfs. Since we have a fixed version of git-lfs installed on Linux,
we may have a different version on macOS.
[1]: https://cdist2.perforce.com/perforce/
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When installing p4 as a dependency, we used to pipe output of "p4 -V"
and "p4d -V" to validate the installation and output a condensed version
information. But this would hide potential errors of p4 and would stop
with an empty output. E.g.: p4d version 16.2 running on ubuntu 22.04
causes sigfaults, even before it produces any output.
By removing the pipe after "p4 -V" and "p4d -V", we may get a
verbose output, and stop immediately on errors because we have "set
-e" in "ci/lib.sh". Since we won't look at these trace logs unless
something fails, just including the raw output seems most sensible.
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GitHub starts to upgrade its runner image "ubuntu-latest" from version
"ubuntu-20.04" to version "ubuntu-22.04". It will fail to find and
install "gcc-8" package on the new runner image.
Change the runner image of the `linux-gcc` job from "ubuntu-latest" to
"ubuntu-20.04" in order to install "gcc-8" as a dependency.
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git for Windows' SDK recently upgraded to GCC v12.x which points out
that the `pos` variable might be used even after the corresponding
memory was `realloc()`ed and therefore potentially no longer valid.
Since a subset of this SDK is used in Git's CI/PR builds, we need to fix
this to continue to be able to benefit from the CI/PR runs.
Note: This bug has been with us since 2a6b149c64 (mingw: avoid using
strbuf in syslog, 2011-10-06), and while it looks tempting to replace
the hand-rolled string manipulation with a `strbuf`-based one, that
commit's message explains why we cannot do that: The `syslog()` function
is called as part of the function in `daemon.c` which is set as the
`die()` routine, and since `strbuf_grow()` can call that function if it
runs out of memory, this would cause a nasty infinite loop that we do
not want to re-introduce.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-2.30:
Git 2.30.8
apply: fix writing behind newly created symbolic links
dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
clone: delay picking a transport until after get_repo_path()
t5619: demonstrate clone_local() with ambiguous transport
Fix a vulnerability (CVE-2023-23946) that allows crafted input to trick
`git apply` into writing files outside of the working tree.
* ps/apply-beyond-symlink:
dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Resolve a security vulnerability (CVE-2023-22490) where `clone_local()`
is used in conjunction with non-local transports, leading to arbitrary
path exfiltration.
* tb/clone-local-symlinks:
dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
clone: delay picking a transport until after get_repo_path()
t5619: demonstrate clone_local() with ambiguous transport
When writing files git-apply(1) initially makes sure that none of the
files it is about to create are behind a symlink:
```
$ git init repo
Initialized empty Git repository in /tmp/repo/.git/
$ cd repo/
$ ln -s dir symlink
$ git apply - <<EOF
diff --git a/symlink/file b/symlink/file
new file mode 100644
index 0000000..e69de29
EOF
error: affected file 'symlink/file' is beyond a symbolic link
```
This safety mechanism is crucial to ensure that we don't write outside
of the repository's working directory. It can be fooled though when the
patch that is being applied creates the symbolic link in the first
place, which can lead to writing files in arbitrary locations.
Fix this by checking whether the path we're about to create is
beyond a symlink or not. Tightening these checks like this should be
fine as we already have these precautions in Git as explained
above. Ideally, we should update the check we do up-front before
starting to reflect the computed changes to the working tree so that
we catch this case as well, but as part of embargoed security work,
adding an equivalent check just before we try to write out a file
should serve us well as a reasonable first step.
Digging back into history shows that this vulnerability has existed
since at least Git v2.9.0. As Git v2.8.0 and older don't build on my
system anymore I cannot tell whether older versions are affected, as
well.
Reported-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using the dir_iterator API, we first stat(2) the base path, and
then use that as a starting point to enumerate the directory's contents.
If the directory contains symbolic links, we will immediately die() upon
encountering them without the `FOLLOW_SYMLINKS` flag. The same is not
true when resolving the top-level directory, though.
As explained in a previous commit, this oversight in 6f054f9fb3
(builtin/clone.c: disallow `--local` clones with symlinks, 2022-07-28)
can be used as an attack vector to include arbitrary files on a victim's
filesystem from outside of the repository.
Prevent resolving top-level symlinks unless the FOLLOW_SYMLINKS flag is
given, which will cause clones of a repository with a symlink'd
"$GIT_DIR/objects" directory to fail.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, t5619 demonstrates an issue where two calls to
`get_repo_path()` could trick Git into using its local clone mechanism
in conjunction with a non-local transport.
That sequence is:
- the starting state is that the local path https:/example.com/foo is a
symlink that points to ../../../.git/modules/foo. So it's dangling.
- get_repo_path() sees that no such path exists (because it's
dangling), and thus we do not canonicalize it into an absolute path
- because we're using --separate-git-dir, we create .git/modules/foo.
Now our symlink is no longer dangling!
- we pass the url to transport_get(), which sees it as an https URL.
- we call get_repo_path() again, on the url. This second call was
introduced by f38aa83f9a (use local cloning if insteadOf makes a
local URL, 2014-07-17). The idea is that we want to pull the url
fresh from the remote.c API, because it will apply any aliases.
And of course now it sees that there is a local file, which is a
mismatch with the transport we already selected.
The issue in the above sequence is calling `transport_get()` before
deciding whether or not the repository is indeed local, and not passing
in an absolute path if it is local.
This is reminiscent of a similar bug report in [1], where it was
suggested to perform the `insteadOf` lookup earlier. Taking that
approach may not be as straightforward, since the intent is to store the
original URL in the config, but to actually fetch from the insteadOf
one, so conflating the two early on is a non-starter.
Note: we pass the path returned by `get_repo_path(remote->url[0])`,
which should be the same as `repo_name` (aside from any `insteadOf`
rewrites).
We *could* pass `absolute_pathdup()` of the same argument, which
86521acaca (Bring local clone's origin URL in line with that of a remote
clone, 2008-09-01) indicates may differ depending on the presence of
".git/" for a non-bare repo. That matters for forming relative submodule
paths, but doesn't matter for the second call, since we're just feeding
it to the transport code, which is fine either way.
[1]: https://lore.kernel.org/git/CAMoD=Bi41mB3QRn3JdZL-FGHs4w3C2jGpnJB-CqSndO7FMtfzA@mail.gmail.com/
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cloning a repository, Git must determine (a) what transport
mechanism to use, and (b) whether or not the clone is local.
Since f38aa83f9a (use local cloning if insteadOf makes a local URL,
2014-07-17), the latter check happens after the remote has been
initialized, and references the remote's URL instead of the local path.
This is done to make it possible for a `url.<base>.insteadOf` rule to
convert a remote URL into a local one, in which case the `clone_local()`
mechanism should be used.
However, with a specially crafted repository, Git can be tricked into
using a non-local transport while still setting `is_local` to "1" and
using the `clone_local()` optimization. The below test case
demonstrates such an instance, and shows that it can be used to include
arbitrary (known) paths in the working copy of a cloned repository on a
victim's machine[^1], even if local file clones are forbidden by
`protocol.file.allow`.
This happens in a few parts:
1. We first call `get_repo_path()` to see if the remote is a local
path. If it is, we replace the repo name with its absolute path.
2. We then call `transport_get()` on the repo name and decide how to
access it. If it was turned into an absolute path in the previous
step, then we should always treat it like a file.
3. We use `get_repo_path()` again, and set `is_local` as appropriate.
But it's already too late to rewrite the repo name as an absolute
path, since we've already fed it to the transport code.
The attack works by including a submodule whose URL corresponds to a
path on disk. In the below example, the repository "sub" is reachable
via the dumb HTTP protocol at (something like):
http://127.0.0.1:NNNN/dumb/sub.git
However, the path "http:/127.0.0.1:NNNN/dumb" (that is, a top-level
directory called "http:", then nested directories "127.0.0.1:NNNN", and
"dumb") exists within the repository, too.
To determine this, it first picks the appropriate transport, which is
dumb HTTP. It then uses the remote's URL in order to determine whether
the repository exists locally on disk. However, the malicious repository
also contains an embedded stub repository which is the target of a
symbolic link at the local path corresponding to the "sub" repository on
disk (i.e., there is a symbolic link at "http:/127.0.0.1/dumb/sub.git",
pointing to the stub repository via ".git/modules/sub/../../../repo").
This stub repository fools Git into thinking that a local repository
exists at that URL and thus can be cloned locally. The affected call is
in `get_repo_path()`, which in turn calls `get_repo_path_1()`, which
locates a valid repository at that target.
This then causes Git to set the `is_local` variable to "1", and in turn
instructs Git to clone the repository using its local clone optimization
via the `clone_local()` function.
The exploit comes into play because the stub repository's top-level
"$GIT_DIR/objects" directory is a symbolic link which can point to an
arbitrary path on the victim's machine. `clone_local()` resolves the
top-level "objects" directory through a `stat(2)` call, meaning that we
read through the symbolic link and copy or hardlink the directory
contents at the destination of the link.
In other words, we can get steps (1) and (3) to disagree by leveraging
the dangling symlink to pick a non-local transport in the first step,
and then set is_local to "1" in the third step when cloning with
`--separate-git-dir`, which makes the symlink non-dangling.
This can result in data-exfiltration on the victim's machine when
sensitive data is at a known path (e.g., "/home/$USER/.ssh").
The appropriate fix is two-fold:
- Resolve the transport later on (to avoid using the local
clone optimization with a non-local transport).
- Avoid reading through the top-level "objects" directory when
(correctly) using the clone_local() optimization.
This patch merely demonstrates the issue. The following two patches will
implement each part of the above fix, respectively.
[^1]: Provided that any target directory does not contain symbolic
links, in which case the changes from 6f054f9fb3 (builtin/clone.c:
disallow `--local` clones with symlinks, 2022-07-28) will abort the
clone.
Reported-by: yvvdwf <yvvdwf@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On platforms where `size_t` does not have the same width as `unsigned
long`, passing a pointer to the former when a pointer to the latter is
expected can lead to problems.
Windows and 32-bit Linux are among the affected platforms.
In this instance, we want to store the size of the blob that was read in
that variable. However, `read_blob_data_from_index()` passes that
pointer to `read_object_file()` which expects an `unsigned long *`.
Which means that on affected platforms, the variable is not fully
populated and part of its value is left uninitialized. (On Big-Endian
platforms, this problem would be even worse.)
The consequence is that depending on the uninitialized memory's
contents, we may erroneously reject perfectly fine attributes.
Let's address this by passing a pointer to a variable of the expected
data type.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>