The init code predates strbufs, and uses PATH_MAX-sized
buffers along with many manual checks on intermediate sizes
(some of which make magic assumptions, such as that init
will not create a path inside .git longer than 50
characters).
We can simplify this greatly by using strbufs, which drops
some hard-to-verify strcpy calls in favor of git_path_buf.
While we're in the area, let's also convert existing calls
to git_path to the safer git_path_buf (our existing calls
were passed to pretty tame functions, and so were not a
problem, but it's easy to be consistent and safe here).
Note that we had an explicit test that "git init" rejects
long template directories. This comes from 32d1776 (init: Do
not segfault on big GIT_TEMPLATE_DIR environment variable,
2009-04-18). We can drop the test_must_fail here, as we now
accept this and need only confirm that we don't segfault,
which was the original point of the test.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are initializing a .git directory, we may call
probe_utf8_pathname_composition to detect utf8 mangling. We
pass in a path buffer for it to use, and it blindly
strcpy()s into it, not knowing whether the buffer is large
enough to hold the result or not.
In practice this isn't a big deal, because the buffer we
pass in already contains "$GIT_DIR/config", and we append
only a few extra bytes to it. But we can easily do the right
thing just by calling git_path_buf ourselves. Technically
this results in a different pathname (before we appended our
utf8 characters to the "config" path, and now they get their
own files in $GIT_DIR), but that should not matter for our
purposes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The result of iconv is assigned to a variable, but we never
use it (instead, we check errno and whether the function
consumed all bytes). Let's drop the assignment, as it
triggers gcc's -Wunused-but-set-variable.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add example implementation including test cases for the large file
system using Git LFS.
Pushing files to the Git LFS server is not tested.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Perforce repositories can contain large (binary) files. Migrating these
repositories to Git generates very large local clones. External storage
systems such as Git LFS [1], Git Fat [2], Git Media [3], git-annex [4]
try to address this problem.
Add a generic mechanism to detect large files based on extension,
uncompressed size, and/or compressed size.
[1] https://git-lfs.github.com/
[2] https://github.com/jedbrown/git-fat
[3] https://github.com/alebedev/git-media
[4] https://git-annex.branchable.com/
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Conflicts:
Documentation/git-p4.txt
git-p4.py
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-p4 will just halt if there is not enough disk space while
streaming content from P4 to Git. Add a check to ensure at least
4 times (arbitrarily chosen) the size of a streamed file is available.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a file is streamed from P4 to Git then the verbose mode prints
continuously the progress as percentage like this:
//depot/file.bin 20% (10 MB)
Upon completion the progress is overwritten with depot source, local
file and size like this:
//depot/file.bin --> local/file.bin (10 MB)
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a git config reader for integer variables. Please note that the
git config implementation automatically supports k, m, and g suffixes.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The functions "gitConfig" and "gitConfigBool" are almost identical.
Make "gitConfig" more generic by adding an optional type specifier.
Use the type specifier "--bool" with "gitConfig" to implement
"gitConfigBool. This prepares the implementation of other type
specifiers such as "--int".
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Acked-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
worktree.c contains functions to work with and get information from
worktrees. This introduction moves functions related to worktrees
from branch.c into worktree.c
Signed-off-by: Michael Rappazzo <rappazzo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-rebase-todo is parsed several times with different parsers. In
principle, the user input is normalized by transform_todo_ids and
further parsing can be stricter.
In case the user wrote
pick deadbeef<TAB>commit message
the parser of transform_todo_ids was considering the sha1 to be
"deadbeef<TAB>commit", and was leaving the tab in the transformed sheet.
In practice, this went unnoticed since the actual command interpretation
was done later in do_next which did accept the tab as a separator.
Make it explicit in the code of transform_todo_ids that tabs are
accepted. This way, code that mimicks it will also accept tabs as
separator.
A similar construct appears in skip_unnecessary_picks, but this one
comes after transform_todo_ids, hence reads the normalized format, so it
needs not be changed.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After using "git checkout --detach", the reflog is left with an entry
like
checkout: moving from ... to HEAD
This message is parsed to generate the 'HEAD detached at' message in
'git branch' and 'git status', which leads to the not-so-useful message
'HEAD detached at HEAD'.
Instead, when parsing such reflog entry, resolve HEAD to the
corresponding commit in the reflog, so that the message becomes 'HEAD
detached at $sha1'.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This currently fails: the output is 'HEAD detached at HEAD'.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new builtin am ignores the user.signingkey variable: gpg is being
called with the committer details as the key ID, which may not be
correct. git_gpg_config is responsible for handling that variable and is
expected to be called on initialization by any modules that use gpg.
Signed-off-by: Renee Margaret McConahy <nepella@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes sending huge patches/commits fail with
[Net::SMTP::SSL] Connection closed at /usr/lib/git-core/git-send-email
line 1320.
Running the command with --smtp-debug=1 yields to
Net::SMTP::SSL: Net::Cmd::datasend(): unexpected EOF on command channel:
at /usr/lib/git-core/git-send-email line 1320.
[Net::SMTP::SSL] Connection closed at /usr/lib/git-core/git-send-email
line 1320.
Stefan described it in his mail like this:
It seems to me that there is a size limit, after cutting down the patch
to ~16K, sending started to work. I cut it twice, once by removing lines
from the head and once from the bottom, in both cases at the size of
around 16K I could send the patch.
See also original report:
http://permalink.gmane.org/gmane.comp.version-control.git/274569
Reported-by: Juston Li <juston.h.li@gmail.com>
Tested-by: Markos Chandras <hwoarang@gentoo.org>
Signed-off-by: Lars Wendler <polynomial-c@gentoo.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit enforces MAX_XDIFF_SIZE at the
interfaces to xdiff: xdi_diff (which calls xdl_diff) and
ll_xdl_merge (which calls xdl_merge).
But we have another direct call to xdl_merge in
merge-file.c. If it were written today, this probably would
just use the ll_merge machinery. But it predates that code,
and uses slightly different options to xdl_merge (e.g.,
ZEALOUS_ALNUM).
We could try to abstract out an xdi_merge to match the
existing xdi_diff, but even that is difficult. Rather than
simply report error, we try to treat large files as binary,
and that distinction would happen outside of xdi_merge.
The simplest fix is to just replicate the MAX_XDIFF_SIZE
check in merge-file.c.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The xdiff code is not prepared to handle extremely large
files. It uses "int" in many places, which can overflow if
we have a very large number of lines or even bytes in our
input files. This can cause us to produce incorrect diffs,
with no indication that the output is wrong. Or worse, we
may even underallocate a buffer whose size is the result of
an overflowing addition.
We're much better off to tell the user that we cannot diff
or merge such a large file. This patch covers both cases,
but in slightly different ways:
1. For merging, we notice the large file and cleanly fall
back to a binary merge (which is effectively "we cannot
merge this").
2. For diffing, we make the binary/text distinction much
earlier, and in many different places. For this case,
we'll use the xdi_diff as our choke point, and reject
any diff there before it hits the xdiff code.
This means in most cases we'll die() immediately after.
That's not ideal, but in practice we shouldn't
generally hit this code path unless the user is trying
to do something tricky. We already consider files
larger than core.bigfilethreshold to be binary, so this
code would only kick in when that is circumvented
(either by bumping that value, or by using a
.gitattribute to mark a file as diffable).
In other words, we can avoid being "nice" here, because
there is already nice code that tries to do the right
thing. We are adding the suspenders to the nice code's
belt, so notice when it has been worked around (both to
protect the user from malicious inputs, and because it
is better to die() than generate bogus output).
The maximum size was chosen after experimenting with feeding
large files to the xdiff code. It's just under a gigabyte,
which leaves room for two obvious cases:
- a diff3 merge conflict result on files of maximum size X
could be 3*X plus the size of the markers, which would
still be only about 3G, which fits in a 32-bit int.
- some of the diff code allocates arrays of one int per
record. Even if each file consists only of blank lines,
then a file smaller than 1G will have fewer than 1G
records, and therefore the int array will fit in 4G.
Since the limit is arbitrary anyway, I chose to go under a
gigabyte, to leave a safety margin (e.g., we would not want
to overflow by allocating "(records + 1) * sizeof(int)" or
similar.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we call into xdiff to perform a diff, we generally lose
the return code completely. Typically by ignoring the return
of our xdi_diff wrapper, but sometimes we even propagate
that return value up and then ignore it later. This can
lead to us silently producing incorrect diffs (e.g., "git
log" might produce no output at all, not even a diff header,
for a content-level diff).
In practice this does not happen very often, because the
typical reason for xdiff to report failure is that it
malloc() failed (it uses straight malloc, and not our
xmalloc wrapper). But it could also happen when xdiff
triggers one our callbacks, which returns an error (e.g.,
outf() in builtin/rerere.c tries to report a write failure
in this way). And the next patch also plans to add more
failure modes.
Let's notice an error return from xdiff and react
appropriately. In most of the diff.c code, we can simply
die(), which matches the surrounding code (e.g., that is
what we do if we fail to load a file for diffing in the
first place). This is not that elegant, but we are probably
better off dying to let the user know there was a problem,
rather than simply generating bogus output.
We could also just die() directly in xdi_diff, but the
callers typically have a bit more context, and can provide a
better message (and if we do later decide to pass errors up,
we're one step closer to doing so).
There is one interesting case, which is in diff_grep(). Here
if we cannot generate the diff, there is nothing to match,
and we silently return "no hits". This is actually what the
existing code does already, but we make it a little more
explicit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
-u <exec> has never been supported, but it was mentioned since
0a2bb55 (git ls-remote: make usage string match manpage -
2008-11-11). Nobody has complained about it for seven years, it's
probably safe to say nobody cares. So let's remove "-u" in documents
instead of adding code to support it.
While at there, fix --upload-pack syntax too.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git has a config variable to indicate that it is operating on a file
system that is case-insensitive: core.ignoreCase. But the
`dir_inside_of()` function did not respect that. As a result, if Git's
idea of the current working directory disagreed in its upper/lower case
with the `GIT_WORK_TREE` variable (e.g. `C:\test` vs `c:\test`) the
user would be greeted by the error message
fatal: git-am cannot be used without a working tree.
when trying to run a rebase.
This fixes https://github.com/git-for-windows/git/issues/402 (reported by
Daniel Harding).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Strict mode is about not guessing where .git is. If the user points to a
.git file, we know exactly where the target .git dir will be. This makes
it possible to serve .git files as repository on the server side.
This may be needed even in local clone case because transport.c code
uses upload-pack for fetching remote refs. But right now the
clone/transport code goes with non-strict.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It matters for linked checkouts where 'refs' directory won't be
available in $GIT_DIR. is_git_directory() knows about $GIT_COMMON_DIR
and can handle this case.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, libcurl will follow circular http redirects
forever. Let's put a cap on this so that somebody who can
trigger an automated fetch of an arbitrary repository (e.g.,
for CI) cannot convince git to loop infinitely.
The value chosen is 20, which is the same default that
Firefox uses.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, libcurl would follow redirection to any protocol
it was compiled for support with. This is desirable to allow
redirection from HTTP to HTTPS. However, it would even
successfully allow redirection from HTTP to SFTP, a protocol
that git does not otherwise support at all. Furthermore
git's new protocol-whitelisting could be bypassed by
following a redirect within the remote helper, as it was
only enforced at transport selection time.
This patch limits redirects within libcurl to HTTP, HTTPS,
FTP and FTPS. If there is a protocol-whitelist present, this
list is limited to those also allowed by the whitelist. As
redirection happens from within libcurl, it is impossible
for an HTTP redirect to a protocol implemented within
another remote helper.
When the curl version git was compiled with is too old to
support restrictions on protocol redirection, we warn the
user if GIT_ALLOW_PROTOCOL restrictions were requested. This
is a little inaccurate, as even without that variable in the
environment, we would still restrict SFTP, etc, and we do
not warn in that case. But anything else means we would
literally warn every time git accesses an http remote.
This commit includes a test, but it is not as robust as we
would hope. It redirects an http request to ftp, and checks
that curl complained about the protocol, which means that we
are relying on curl's specific error message to know what
happened. Ideally we would redirect to a working ftp server
and confirm that we can clone without protocol restrictions,
and not with them. But we do not have a portable way of
providing an ftp server, nor any other protocol that curl
supports (https is the closest, but we would have to deal
with certificates).
[jk: added test and version warning]
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current callers only want to die when their transport is
prohibited. But future callers want to query the mechanism
without dying.
Let's break out a few query functions, and also save the
results in a static list so we don't have to re-parse for
each query.
Based-on-a-patch-by: Blake Burkhart <bburky@bburky.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Asciidoctor is stricter than AsciiDoc when deciding if underlining
is a section title or the start of preformatted text. Make the
length of the underlining match the text to ensure that it renders
correctly in all implementations.
Signed-off-by: John Keeping <john@keeping.me.uk>
[jc: squashed in git-bisect one noticed by Michael J Gruber]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
aggregate.perl did not work when Git.pm is not installed to a directory
contained in the default Perl library path list or PERLLIB.
This commit prepends the Perl library path of the current Git source
tree to enable this.
Note that this commit adds a hard-coded relative path
use lib '../../perl/blib/lib';
instead of the flexible environment-based variant
use lib (split(/:/, $ENV{GITPERLLIB}));
which is used in tests written in Perl.
The hard-coded variant is used because the whole performance test
framework does it that way (and GITPERLLIB is not set there).
Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do some manual memory computation here, and there's no
check that our 60 is not overflowed by the raw sprintf (it
isn't, because the "which" parameter is never longer than
"pack"). We can simplify this greatly with a strbuf.
Technically the end result is not identical, as the original
took care not to rewrite the object directory on each call
for performance reasons. We could do that here, too (by
saving the baselen and resetting to it), but it's not worth
the complexity; this function is not called a lot (generally
once per packfile that we open).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do an unchecked sprintf directly into our url buffer.
This doesn't overflow because we know that it was sized for
"$base/objects/info/http-alternates", and we are writing
"$base/objects/info/alternates", which must be smaller. But
that is not immediately obvious to a reader who is looking
for buffer overflows. Let's switch to a strbuf, so that we
do not have to think about this issue at all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The http-push code defines an fwrite_buffer function for use
as a curl callback; it just writes to a strbuf. There's no
reason we need to use it ourselves, as we know we have a
strbuf. This lets us format directly into it, rather than
dealing with an extra temporary buffer (which required
manual length computation).
While we're here, let's also remove the literal tabs from
the source in favor of "\t", which is more visually obvious.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We format a pkt-line into a heap buffer, which requires
manual computation of the required size, and uses some bare
sprintf calls. We could use a strbuf instead, which would
take care of the computation for us. But it's even easier
still to use packet_write(). Besides handling the formatting
and writing for us, it fixes two things:
1. Our manual max-size check used 0xFFFF, while technically
LARGE_PACKET_MAX is slightly smaller than this.
2. Our packet will now be output as part of
GIT_TRACE_PACKET debugging.
Unfortunately packet_write() does not let us build up the
buffer progressively, so we do have to repeat ourselves a
little depending on the "vhost" setting, but the end result
is still far more readable than the original.
Since there were no tests covering this feature at all,
we'll add a few into t5802.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we report an error to the client, we format it into a
fixed-size buffer using vsprintf(). This can't actually
overflow in practice, since we only format a very tame
subset of strings (mostly strerror() output). However, it's
hard to tell immediately, so let's just use a strbuf so
readers do not have to wonder.
We do add an allocation here, but the performance is not
important; the next step is to call die() anyway.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
resolve_ref already uses a strbuf internally when generating
pathnames, but it uses fixed-size buffers for storing the
refname and symbolic refs. This means that you cannot
actually point HEAD to a ref that is larger than 256 bytes.
We can lift this limit by using strbufs here, too. Like
sb_path, we pass the the buffers into our helper function,
so that we can easily clean up all output paths. We can also
drop the "unsafe" name from our helper function, as it no
longer uses a single static buffer (but of course
resolve_ref_unsafe is still unsafe, because the static
buffers moved there).
As a bonus, we also get to drop some strcpy calls between
the two fixed buffers (that cannot currently overflow
because the two buffers are sized identically).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>