The current parsing scheme for upload-archive is to pack
arguments into a fixed-size buffer, separated by NULs, and
put a pointer to each argument in the buffer into a
fixed-size argv array.
This works fine, and the limits are high enough that nobody
reasonable is going to hit them, but it makes the code hard
to follow. Instead, let's just stuff the arguments into an
argv_array, which is much simpler. That lifts the "all
arguments must fit inside 4K together" limit.
We could also trivially lift the MAX_ARGS limitation (in
fact, we have to keep extra code to enforce it). But that
would mean a client could force us to allocate an arbitrary
amount of memory simply by sending us "argument" lines. By
limiting the MAX_ARGS, we limit an attacker to about 4
megabytes (64 times a maximum 64K packet buffer). That may
sound like a lot compared to the 4K limit, but it's not a
big deal compared to what git-archive will actually allocate
while working (e.g., to load blobs into memory). The
important thing is that it is bounded.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to the comment, enter_repo will modify its input.
However, this has not been the case since 1c64b48
(enter_repo: do not modify input, 2011-10-04). Drop the
now-useless copy.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This code predates prefixcmp, so it used memcmp along with
static sizes. Replacing these memcmps with prefixcmp makes
the code much more readable, and the lack of static sizes
will make refactoring it in future patches simpler.
Note that we used to be unnecessarily liberal in parsing the
"unpack" status line, and would accept "unpack ok\njunk". No
version of git has ever produced that, and it violates the
BNF in Documentation/technical/pack-protocol.txt. Let's take
this opportunity to tighten the check by converting the
prefix comparison into a strcmp.
While we're in the area, let's also fix a vague error
message that does not follow our usual conventions (it
writes directly to stderr and does not use the "error:"
prefix).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we read acks from the remote, we expect either:
ACK <sha1>
or
ACK <sha1> <multi-ack-flag>
We parse the "ACK <sha1>" bit from the line, and then start
looking for the flag strings at "line+45"; if we don't have
them, we assume it's of the first type. But if we do have
the first type, then line+45 is not necessarily inside our
string at all!
It turns out that this works most of the time due to the way
we parse the packets. They should come in with a newline,
and packet_read puts an extra NUL into the buffer, so we end
up with:
ACK <sha1>\n\0
with the newline at offset 44 and the NUL at offset 45. We
then strip the newline, putting a NUL at offset 44. So
when we look at "line+45", we are looking past the end of
our string; but it's OK, because we hit the terminator from
the original string.
This breaks down, however, if the other side does not
terminate their packets with a newline. In that case, our
packet is one character shorter, and we start looking
through uninitialized memory for the flag. No known
implementation sends such a packet, so it has never come up
in practice.
This patch tightens the check by looking for a short,
flagless ACK before trying to parse the flag.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you set the GIT_DEBUG_SEND_PACK environment variable,
upload-pack will dump lines it receives in the receive_needs
phase to a descriptor. This debugging harness is a strict
subset of what GIT_TRACE_PACKET can do. Let's just drop it
in favor of that.
A few tests used GIT_DEBUG_SEND_PACK to confirm which
objects get sent; we have to adapt them to the new output
format.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the client tells us it has a shallow object via
"shallow <sha1>", we make sure we have the object, mark it
with a flag, then add it to a dynamic array of shallow
objects. This means that a client can get us to allocate
arbitrary amounts of memory just by flooding us with shallow
lines (whether they have the objects or not). You can
demonstrate it easily with:
yes '0035shallow e83c5163316f89bfbde7d9ab23ca2e25604af290' |
git-upload-pack git.git
We already protect against duplicates in want lines by
checking if our flag is already set; let's do the same thing
here. Note that a client can still get us to allocate some
amount of memory by marking every object in the repo as
"shallow" (or "want"). But this at least bounds it with the
number of objects in the repository, which is not under the
control of an upload-pack client.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we receive a line like "shallow <sha1>" from the
client, we feed the <sha1> part to get_sha1. This is a
mistake, as the argument on a shallow line is defined by
Documentation/technical/pack-protocol.txt to contain an
"obj-id". This is never defined in the BNF, but it is clear
from the text and from the other uses that it is meant to be
a hex sha1, not an arbitrary identifier (and that is what
fetch-pack has always sent).
We should be using get_sha1_hex instead, which doesn't allow
the client to request arbitrary junk like "HEAD@{yesterday}".
Because this is just marking shallow objects, the client
couldn't actually do anything interesting (like fetching
objects from unreachable reflog entries), but we should keep
our parsing tight to be on the safe side.
Because get_sha1 is for the most part a superset of
get_sha1_hex, in theory the only behavior change should be
disallowing non-hex object references. However, there is
one interesting exception: get_sha1 will only parse
a 40-character hex sha1 if the string has exactly 40
characters, whereas get_sha1_hex will just eat the first 40
characters, leaving the rest. That means that current
versions of git-upload-pack will not accept a "shallow"
packet that has a trailing newline, even though the protocol
documentation is clear that newlines are allowed (even
encouraged) in non-binary parts of the protocol.
This never mattered in practice, though, because fetch-pack,
contrary to the protocol documentation, does not include a
newline in its shallow lines. JGit follows its lead (though
it correctly is strict on the parsing end about wanting a
hex object id).
We do not adjust fetch-pack to send newlines here, as it
would break communication with older versions of git (and
there is no actual benefit to doing so, except for
consistency with other parts of the protocol).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow the server side to redact the refs/ namespace it shows to the
client.
Will merge to 'master'.
* jc/hidden-refs:
upload/receive-pack: allow hiding ref hierarchies
upload-pack: simplify request validation
upload-pack: share more code
Add diff.algorithm configuration so that the user does not type
"diff --histogram".
* mp/diff-algo-config:
diff: Introduce --diff-algorithm command line option
config: Introduce diff.algorithm variable
git-completion.bash: Autocomplete --minimal and --histogram for git-diff
Allows skipping the untracked check GIT_PS1_SHOWUNTRACKEDFILES
asks for the git-prompt (in contrib/) per repository.
* mw/bash-prompt-show-untracked-config:
t9903: add extra tests for bash.showDirtyState
t9903: add tests for bash.showUntrackedFiles
shell prompt: add bash.showUntrackedFiles option
"git log --grep=<pattern>" used to look for the pattern in literal
bytes of the commit log message and ignored the log-output encoding.
* jk/read-commit-buffer-data-after-free:
log: re-encode commit messages before grepping
"make COMPUTE_HEADER_DEPENDENCIES=no clean" would try to run "rm
-rf $(dep_dirs)" with an empty dep_dir, but some implementations of
"rm -rf" barf on an empty argument list.
* mk/make-rm-depdirs-could-be-empty:
Makefile: don't run "rm" without any files
The merge at 5bf72ed2 missed another instance of <filepattern> that
we were converting to <pathspec>.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
acd2a45 (Refuse updating the current branch in a non-bare repository
via push, 2009-02-11) changed the default to refuse such a push, but
it forgot to update the docs.
7d182f5 (Documentation: receive.denyCurrentBranch defaults to
'refuse', 2010-03-17) updated Documentation/config.txt, but forgot to
update the user manual.
Signed-off-by: W. Trevor King <wking@tremily.us>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactors a lot of repetitive code sequence from the graph drawing
code and adds it to the combined diff output.
* jk/diff-graph-cleanup:
combine-diff.c: teach combined diffs about line prefix
diff.c: use diff_line_prefix() where applicable
diff: add diff_line_prefix function
diff.c: make constant string arguments const
diff: write prefix to the correct file
graph: output padding for merge subsequent parents
* bw/get-tz-offset-perl:
cvsimport: format commit timestamp ourselves without using strftime
perl/Git.pm: fix get_tz_offset to properly handle DST boundary cases
Move Git::SVN::get_tz to Git::get_tz_offset
Use a new helper that prints a message and counts its display width
to align the help messages parse-options produces.
* jx/utf8-printf-width:
Add utf8_fprintf helper that returns correct number of columns
Instead of requiring the full 40-hex object names on the index
line, we can read submodule commit object names from the textual
diff when synthesizing a fake ancestore tree for "git am -3".
* jc/extended-fake-ancestor-for-gitlink:
apply: verify submodule commit object name better
contrib/subtree updates, but here are only the ones that looked
ready. The remainder of the patches will have another day.
* dg/subtree-fixes:
contrib/subtree: make the manual directory if needed
contrib/subtree: honor DESTDIR
contrib/subtree: fix synopsis
contrib/subtree: better error handling for 'subtree add'
contrib/subtree: use %B for split subject/body
contrib/subtree: remove test number comments
Add 3 extra tests for the bash.showDirtyState config option; the
tests now cover all combinations of the shell var being set/unset
and the config option being missing/enabled/disabled, given a dirty
file.
Signed-off-by: Martin Erik Werner <martinerikwerner@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add 4 tests for the bash.showUntrackedFiles config option, covering
all combinations of the shell var being set/unset and the config
option being enabled/disabled (the other 2 cases, missing config
with and without shell variable, are already covered by existing
tests).
Signed-off-by: Martin Erik Werner <martinerikwerner@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When COMPUTE_HEADER_DEPENDENCIES is set to "auto" and the compiler
does not support it, $(dep_dirs) becomes empty. "make clean" runs
"rm -rf $(dep_dirs)", which can fail in such a case.
Signed-off-by: Matt Kraai <matt.kraai@amo.abbott.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a config option 'bash.showUntrackedFiles' which allows enabling
the prompt showing untracked files on a per-repository basis. This is
useful for some repositories where the 'git ls-files ...' command may
take a long time.
Signed-off-by: Martin Erik Werner <martinerikwerner@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit eff80a9 (Allow custom "comment char") introduced a custom comment
character for commit messages but did not teach git-rebase--interactive
to use it.
Change git-rebase--interactive to read core.commentchar and use its
value when generating commit messages and for the command list.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running "git log --graph --cc -p" the diff output for merges is not
indented by the graph structure, unlike the diffs of non-merge commits
(added in commit 7be5761 - diff.c: Output the text graph padding before
each diff line).
Fix this by teaching the combined diff code to output diff_line_prefix()
before each line.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a helper function to call the diff output_prefix function and
return its value as a C string, allowing us to greatly simplify
everywhere that needs to get the output prefix.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Write the prefix for an output line to the same file as the actual
content.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This needs to be done in two places: __git_config_get_set_variables to
allow clever completion of "git config --local --get foo<tab>", and
_git_config to allow "git config --loc<tab>" to complete to --local.
While we're there, change the order of options in the code to match
git-config.txt.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
pathspec is the most widely used term, and is the one defined in
gitglossary.txt. <filepattern> was used only in the synopsys for git-add
and git-commit, and in git-add.txt. Get rid of it.
This patch is obtained with by running:
perl -pi -e 's/filepattern/pathspec/' `git grep -l filepattern`
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because our command-line parser considers only one byte at the time
for short-options, we incorrectly report only the first byte when
multi-byte input was provided. This makes user-errors slightly
awkward to diagnose for instance under UTF-8 locale and non-English
keyboard layouts.
Report the whole argument-string when a non-ASCII short-option is
detected.
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Improved-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
expat 1.1 and 1.2 provide xmlparse.h instead of expat.h. Include the
former on systems that define the EXPAT_NEEDS_XMLPARSE_H variable and
define that variable on QNX systems, which ship with expat 1.1.
Signed-off-by: Matt Kraai <matt.kraai@amo.abbott.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you run "git log --grep=foo", we will run your regex on
the literal bytes of the commit message. This can provide
confusing results if the commit message is not in the same
encoding as your grep expression (or worse, you have commits
in multiple encodings, in which case your regex would need
to be written to match either encoding). On top of this, we
might also be grepping in the commit's notes, which are
already re-encoded, potentially leading to grepping in a
buffer with mixed encodings concatenated. This is insanity,
but most people never noticed, because their terminal and
their commit encodings all match.
Instead, let's massage the to-be-grepped commit into a
standardized encoding. There is not much point in adding a
flag for "this is the encoding I expect my grep pattern to
match"; the only sane choice is for it to use the log output
encoding. That is presumably what the user's terminal is
using, and it means that the patterns found by the grep will
match the output produced by git.
As a bonus, this fixes a potential segfault in commit_match
when commit->buffer is NULL, as we now build on logmsg_reencode,
which handles reading the commit buffer from disk if
necessary. The segfault can be triggered with:
git commit -m 'text1' --allow-empty
git commit -m 'text2' --allow-empty
git log --graph --no-walk --grep 'text2'
which arguably does not make any sense (--graph inherently
wants a connected history, and by --no-walk the command line
is telling us to show discrete points in history without
connectivity), and we probably should forbid the
combination, but that is a separate issue.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>