We'd prefer to avoid unchecked snprintf calls because
truncation can lead to unexpected results.
These are all cases where truncation shouldn't ever happen,
because the input to snprintf is fixed in size. That makes
them candidates for xsnprintf(), but it's simpler still to
just use the heap, and then nobody has to wonder if "100" is
big enough.
We'll use xstrfmt() where possible, and a strbuf when we need
the resulting size or to reuse the same buffer in a loop.
Signed-off-by: Jeff King <peff@peff.net>
After receive-pack reads the pack header from the client, it
feeds the already-read part to index-pack and unpack-objects
via their --pack-header command-line options. To do so, we
format it into a fixed buffer, then duplicate it into the
child's argv_array.
Our buffer is long enough to handle any possible input, so
this isn't wrong. But it's more complicated than it needs to
be; we can just argv_array_pushf() the final value and avoid
the intermediate copy. This drops the magic number and is
more efficient, too.
Note that we need to push to the argv_array in order, which
means we can't do the push until we are in the "unpack-objects
versus index-pack" conditional. Rather than duplicate the
slightly complicated format specifier, I pushed it into a
helper function.
Signed-off-by: Jeff King <peff@peff.net>
When name-rev needs to format an actual name, we do so into
a fixed-size buffer. That includes the actual ref tip, as
well as any traversal information. Since refs can exceed
1024 bytes, this means you can get a bogus result. E.g.,
doing:
git tag $(perl -e 'print join("/", 1..1024)')
git describe --contains HEAD^
results in ".../282/283", when it should be
".../1023/1024~1".
We can solve this by using a heap buffer. We'll use a
strbuf, which lets us write into the same buffer from our
loop without having to reallocate.
Signed-off-by: Jeff King <peff@peff.net>
Many functions which handle refs use a PATH_MAX-sized buffer
to do so. This is mostly reasonable as we have to write
loose refs into the filesystem, and at least on Linux the 4K
PATH_MAX is big enough that nobody would care. But:
1. The static PATH_MAX is not always the filesystem limit.
2. On other platforms, PATH_MAX may be much smaller.
3. As we move to alternate ref storage, we won't be bound
by filesystem limits.
Let's convert these to heap buffers so we don't have to
worry about truncation or size limits.
We may want to eventually constrain ref lengths for sanity
and to prevent malicious names, but we should do so
consistently across all platforms, and in a central place
(like the ref code).
Signed-off-by: Jeff King <peff@peff.net>
Part of the reflog content comes from the environment, which
can be much larger than our fixed buffer. Let's use a heap
buffer so we avoid truncating it.
Signed-off-by: Jeff King <peff@peff.net>
We format the tag header into a fixed 1024-byte buffer. But
since the tag-name and tagger ident can be arbitrarily
large, we may unceremoniously die with "tag header too big".
Let's just use a strbuf instead.
Note that it looks at first glance like we can just format
this directly into the "buf" strbuf where it will ultimately
go. But that buffer may already contain the tag message, and
we have no easy way to prepend formatted data to a strbuf
(we can only splice in an already-generated buffer). This
isn't a performance-critical path, so going through an extra
buffer isn't a big deal.
Signed-off-by: Jeff King <peff@peff.net>
The odb_mkstemp() function expects the caller to provide a
fixed buffer to write the resulting tempfile name into. But
it creates the template using snprintf without checking the
return value. This means we could silently truncate the
filename.
In practice, it's unlikely that the truncation would end in
the template-pattern that mkstemp needs to open the file. So
we'd probably end up failing either way, unless the path was
specially crafted.
The simplest fix would be to notice the truncation and die.
However, we can observe that most callers immediately
xstrdup() the result anyway. So instead, let's switch to
using a strbuf, which is easier for them (and isn't a big
deal for the other 2 callers, who can just strbuf_release
when they're done with it).
Note that many of the callers used static buffers, but this
was purely to avoid putting a large buffer on the stack. We
never passed the static buffers out of the function, so
there's no complicated memory handling we need to change.
Signed-off-by: Jeff King <peff@peff.net>
The odb_mkstemp function does not return an error; it dies
on failure instead. But many of its callers compare the
resulting descriptor against -1 and die themselves.
Mostly this is just pointless, but it does raise a question
when looking at the callers: if they show the results of the
"template" buffer after a failure, what's in it? The answer
is: it doesn't matter, because it cannot happen.
So let's make that clear by removing the bogus error checks.
In bitmap_writer_finish(), we can drop the error-handling
code entirely. In the other two cases, it's shared with the
open() in another code path; we can just move the
error-check next to that open() call.
And while we're at it, let's flesh out the function's
docstring a bit to make the error behavior clear.
Signed-off-by: Jeff King <peff@peff.net>
Code clean-up.
* jk/fast-import-cleanup:
pack.h: define largest possible encoded object size
encode_in_pack_object_header: respect output buffer length
fast-import: use xsnprintf for formatting headers
fast-import: use xsnprintf for writing sha1s
"git checkout" is taught the "--recurse-submodules" option.
* sb/checkout-recurse-submodules:
builtin/read-tree: add --recurse-submodules switch
builtin/checkout: add --recurse-submodules switch
entry.c: create submodules when interesting
unpack-trees: check if we can perform the operation for submodules
unpack-trees: pass old oid to verify_clean_submodule
update submodules: add submodule_move_head
submodule.c: get_super_prefix_or_empty
update submodules: move up prepare_submodule_repo_env
submodules: introduce check to see whether to touch a submodule
update submodules: add a config option to determine if submodules are updated
update submodules: add submodule config parsing
make is_submodule_populated gently
lib-submodule-update.sh: define tests for recursing into submodules
lib-submodule-update.sh: replace sha1 by hash
lib-submodule-update: teach test_submodule_content the -C <dir> flag
lib-submodule-update.sh: do not use ./. as submodule remote
lib-submodule-update.sh: reorder create_lib_submodule_repo
submodule--helper.c: remove duplicate code
connect_work_tree_and_git_dir: safely create leading directories
"git describe --dirty" dies when it cannot be determined if the
state in the working tree matches that of HEAD (e.g. broken
repository or broken submodule). The command learned a new option
"git describe --broken" to give "$name-broken" (where $name is the
description of HEAD) in such a case.
* sb/describe-broken:
builtin/describe: introduce --broken flag
Recently we started passing the "--push-options" through the
external remote helper interface; now the "smart HTTP" remote
helper understands what to do with the passed information.
* sb/push-options-via-transport:
remote-curl: allow push options
send-pack: send push options correctly in stateless-rpc case
Code clean-up with minor bugfixes.
* jk/prefix-filename:
bundle: use prefix_filename with bundle path
prefix_filename: simplify windows #ifdef
prefix_filename: return newly allocated string
prefix_filename: drop length parameter
prefix_filename: move docstring to header file
hash-object: fix buffer reuse with --path in a subdirectory
Several callers use fixed buffers for storing the pack
object header, and they've picked 10 as a magic number. This
is reasonable, since it handles objects up to 2^67. But
let's give them a constant so it's clear that the number
isn't pulled out of thin air.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The encode_in_pack_object_header() writes a variable-length
header to an output buffer, but it doesn't actually know
long the buffer is. At first glance, this looks like it
might be possible to overflow.
In practice, this is probably impossible. The smallest
buffer we use is 10 bytes, which would hold the header for
an object up to 2^67 bytes. Obviously we're not likely to
see such an object, but we might worry that an object could
lie about its size (causing us to overflow before we realize
it does not actually have that many bytes). But the argument
is passed as a uintmax_t. Even on systems that have __int128
available, uintmax_t is typically restricted to 64-bit by
the ABI.
So it's unlikely that a system exists where this could be
exploited. Still, it's easy enough to use a normal out/len
pair and make sure we don't write too far. That protects the
hypothetical 128-bit system, makes it harder for callers to
accidentally specify a too-small buffer, and makes the
resulting code easier to audit.
Note that the one caller in fast-import tried to catch such
a case, but did so _after_ the call (at which point we'd
have already overflowed!). This check can now go away.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach remote-curl to understand push options and to be able to convey
them across HTTP.
Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-describe tells you the version number you're at, or errors out, e.g.
when you run it outside of a repository, which may happen when downloading
a tar ball instead of using git to obtain the source code.
To keep this property of only erroring out, when not in a repository,
severe (submodule) errors must be downgraded to reporting them gently
instead of having git-describe error out completely.
To achieve that a flag '--broken' is introduced, which is in the same
vein as '--dirty' but uses an actual child process to check for dirtiness.
When that child dies unexpectedly, we'll append '-broken' instead of
'-dirty'.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* jk/pack-name-cleanups:
index-pack: make pointer-alias fallbacks safer
replace snprintf with odb_pack_name()
odb_pack_keep(): stop generating keepfile name
sha1_file.c: make pack-name helper globally accessible
move odb_* declarations out of git-compat-util.h
"git show-branch" expected there were only very short branch names
in the repository and used a fixed-length buffer to hold them
without checking for overflow.
* jk/show-branch-lift-name-len-limit:
show-branch: use skip_prefix to drop magic numbers
show-branch: store resolved head in heap buffer
show-branch: drop head_len variable
"git remote rm X", when a branch has remote X configured as the
value of its branch.*.remote, tried to remove branch.*.remote and
branch.*.merge and failed if either is unset.
* rl/remote-allow-missing-branch-name-merge:
remote: ignore failure to remove missing branch.<name>.merge
A "gc.log" file left by a backgrounded "gc --auto" disables further
automatic gc; it has been taught to run at least once a day (by
default) by ignoring a stale "gc.log" file that is too old.
* dt/gc-ignore-old-gc-logs:
gc: ignore old gc.log files
We may take the path to a bundle file as an argument, and
need to adjust the filename based on the prefix we
discovered while setting up the git directory. We do so
manually into a fixed-size buffer, but using
prefix_filename() is the normal way.
Besides being more concise, there are two subtle
improvements:
1. The original inserted a "/" between the two paths, even
though the "prefix" argument always has the "/"
appended. That means that:
cd subdir && git bundle verify ../foo.bundle
was looking at (and reporting) subdir//../foo.bundle.
Harmless, but ugly. Using prefix_filename() gets this
right.
2. The original checked for an absolute path by looking
for a leading '/'. It should have been using
is_absolute_path(), which also covers more cases on
Windows (backslashes and dos drive prefixes).
But it's easier still to just pass the name to
prefix_filename(), which handles this case
automatically.
Note that we'll just leak the resulting buffer in the name
of simplicity, since it needs to last through the duration
of the program anyway.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The prefix_filename() function returns a pointer to static
storage, which makes it easy to use dangerously. We already
fixed one buggy caller in hash-object recently, and the
calls in apply.c are suspicious (I didn't dig in enough to
confirm that there is a bug, but we call the function once
in apply_all_patches() and then again indirectly from
parse_chunk()).
Let's make it harder to get wrong by allocating the return
value. For simplicity, we'll do this even when the prefix is
empty (and we could just return the original file pointer).
That will cause us to allocate sometimes when we wouldn't
otherwise need to, but this function isn't called in
performance critical code-paths (and it already _might_
allocate on any given call, so a caller that cares about
performance is questionable anyway).
The downside is that the callers need to remember to free()
the result to avoid leaking. Most of them already used
xstrdup() on the result, so we know they are OK. The
remainder have been converted to use free() as appropriate.
I considered retaining a prefix_filename_unsafe() for cases
where we know the static lifetime is OK (and handling the
cleanup is awkward). This is only a handful of cases,
though, and it's not worth the mental energy in worrying
about whether the "unsafe" variant is OK to use in any
situation.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function takes the prefix as a ptr/len pair, but in
every caller the length is exactly strlen(ptr). Let's
simplify the interface and just take the string. This saves
callers specifying it (and in some cases handling a NULL
prefix).
In a handful of cases we had the length already without
calling strlen, so this is technically slower. But it's not
likely to matter (after all, if the prefix is non-empty
we'll allocate and copy it into a buffer anyway).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The hash-object command uses prefix_filename() without
duplicating its return value. Since that function returns a
static buffer, the value is overwritten by subsequent calls.
This can cause incorrect results when we use --path along
with hashing a file by its relative path, both of which need
to call prefix_filename(). We overwrite the filename
computed for --path, effectively ignoring it.
We can fix this by calling xstrdup on the return value. Note
that we don't bother freeing the "vpath" instance, as it
remains valid until the program exit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git difftool --dir-diff" used to die a controlled death giving a
"fatal" message when encountering a locally modified symbolic link,
but it started segfaulting since v2.12. This has been fixed.
* js/difftool-builtin:
difftool: handle modified symlinks in dir-diff mode
t7800: cleanup cruft left behind by tests
t7800: remove whitespace before redirect
The string after_subject is added to a strbuf by pp_title_line() if
it's not NULL. Adding an empty string has the same effect as not
adding anything, but the latter is easier, so don't bother changing
the context member from NULL to "".
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of counting the arguments to see if there are any and then
building the full command use a single loop and add the hook command
just before the first argument. This reduces duplication and overall
code size.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 0281e487fd ("grep: optionally recurse into submodules")
added functions grep_submodule() and grep_submodule_launch() which
use "struct work_item" which is defined only when thread support
is available.
The original implementation of grep_submodule() used the "struct
work_item" in order to gain access to a strbuf to store its output which
was to be printed at a later point in time. This differs from how both
grep_file() and grep_sha1() handle their output. This patch eliminates
the reliance on the "struct work_item" and instead opts to use the
output function stored in the output field of the "struct grep_opt"
object directly, making it behave similarly to both grep_file() and
grep_sha1().
Reported-by: Rahul Bedarkar <rahul.bedarkar@imgtec.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git revert -m 0 $merge_commit" complained that reverting a merge
needs to say relative to which parent the reversion needs to
happen, as if "-m 0" weren't given. The correct diagnosis is that
"-m 0" does not refer to the first parent ("-m 1" does). This has
been fixed.
* jk/cherry-pick-0-mainline:
cherry-pick: detect bogus arguments to --mainline
From a working tree of a repository, a new option of "rev-parse"
lets you ask if the repository is used as a submodule of another
project, and where the root level of the working tree of that
project (i.e. your superproject) is.
* sb/rev-parse-show-superproject-root:
rev-parse: add --show-superproject-working-tree
The experimental "split index" feature has gained a few
configuration variables to make it easier to use.
* cc/split-index-config: (22 commits)
Documentation/git-update-index: explain splitIndex.*
Documentation/config: add splitIndex.sharedIndexExpire
read-cache: use freshen_shared_index() in read_index_from()
read-cache: refactor read_index_from()
t1700: test shared index file expiration
read-cache: unlink old sharedindex files
config: add git_config_get_expiry() from gc.c
read-cache: touch shared index files when used
sha1_file: make check_and_freshen_file() non static
Documentation/config: add splitIndex.maxPercentChange
t1700: add tests for splitIndex.maxPercentChange
read-cache: regenerate shared index if necessary
config: add git_config_get_max_percent_split_change()
Documentation/git-update-index: talk about core.splitIndex config var
Documentation/config: add information for core.splitIndex
t1700: add tests for core.splitIndex
update-index: warn in case of split-index incoherency
read-cache: add and then use tweak_split_index()
split-index: add {add,remove}_split_index() functions
config: add git_config_get_split_index()
...
A new known failure mode is introduced[1], which is actually not
a failure but a feature in read-tree. Unlike checkout for which
the recursive submodule tests were originally written, read-tree does
warn about ignored untracked files that would be overwritten.
For the sake of keeping the test library for submodules generic, just
mark the test as a failure.
[1] KNOWN_FAILURE_SUBMODULE_OVERWRITE_IGNORED_UNTRACKED
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This exposes a flag to recurse into submodules
in builtin/checkout making use of the code implemented
in prior patches.
A new failure mode is introduced in the submodule
update library, as the directory/submodule conflict
is not solved in prior patches.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The final() function accepts a NULL value for certain
parameters, and falls back to writing into a reusable "name"
buffer, and then either:
1. For "keep_name", requiring all uses to do "keep_name ?
keep_name : name.buf". This is awkward, and it's easy
to accidentally look at the maybe-NULL keep_name.
2. For "final_index_name" and "final_pack_name", aliasing
those pointers to the "name" buffer. This is easier to
use, but the aliased pointers become invalid after the
buffer is reused (this isn't a bug now, but it's a
potential pitfall).
One way to make this safer would be to introduce an extra
pointer to do the aliasing, and have its lifetime match the
validity of the "name" buffer. But it's still easy to
accidentally use the wrong name (i.e., to use
"final_pack_name" instead of the aliased pointer).
Instead, let's use three separate buffers that will remain
valid through the function. That makes it safe to alias the
pointers and use them consistently. The extra allocations
shouldn't matter, as this function is not performance
sensitive.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In several places we write the name of the pack filename
into a fixed-size buffer using snprintf(), but do not check
the return value. As a result, a very long object directory
could cause us to quietly truncate the pack filename
(potentially leading to a corrupted repository, as a newly
written packfile could be missing its .pack extension).
We can use odb_pack_name() to do this with a strbuf (and
shorten the code, as well).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The odb_pack_keep() function generates the name of a .keep
file and opens it. This has two problems:
1. It requires a fixed-size buffer to create the filename
and doesn't notice when the result is truncated.
2. Of the two callers, one sometimes wants to open a
filename it already has, which makes things awkward (it
has to do so manually, and skips the leading-directory
creation).
Instead, let's have odb_pack_keep() just open the file.
Generating the name isn't hard, and a future patch will
switch callers over to odb_pack_name() anyway.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We need the gentle version in a later patch. As we have just one caller,
migrate the caller.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove code fragment from module_clone that duplicates functionality
of connect_work_tree_and_git_dir in dir.c
Signed-off-by: Valery Tolstov <me@vtolstov.org>
Reviewed-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of these options do the same thing "--foo" iterates over
the "foo" refs, and "--foo=<glob>" does the same with a
glob. We can factor this into its own function to avoid
repeating ourselves.
There are two subtleties to note:
- the original called for_each_branch_ref(), etc, in the
non-glob case. Now we will call for_each_ref_in("refs/heads/")
which is exactly what for_each_branch_ref() did under
the hood.
- for --glob, we'll call for_each_glob_ref_in() with a
NULL "prefix" argument. Which is exactly what
for_each_glob_ref() was doing already.
So both cases should behave identically, and it seems
reasonable to assume that this will remain the same. The
functions we are calling now are the more-generic ones, and
the ones we are dropping are just convenience wrappers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can't just use a bare skip_prefix() for these cases,
because we need to match both the "--foo" form and the
"--foo=<value>" form (and tell the difference between the
two in the caller).
We can wrap this in a simple helper which has two obvious
callsites, and will gain some more in the next patch.
Note that the error output for abbrev-ref changes slightly,
as we don't keep our original "arg" pointer. However, the
new output should hopefully be more clear:
[before]
fatal: unknown mode for --abbrev-ref=foo
[after]
fatal: unknown mode for --abbrev-ref: foo
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using skip_prefix lets us avoid manually-counted offsets
into the argument string. This patch converts the simple and
obvious cases.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The cherry-pick and revert commands use OPT_INTEGER() to
parse --mainline. The stock parser is smart enough to reject
non-numeric nonsense, but it doesn't know that parent
counting starts at 1.
Worse, the value "0" is indistinguishable from the unset
case, so a user who assumes the counting is 0-based will get
a confusing message:
$ git cherry-pick -m 0 $merge
error: commit ... is a merge but no -m option was given.
Let's use a custom callback that enforces our range.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Detect the null object ID for symlinks in dir-diff so that difftool can
detect when symlinks are modified in the worktree.
Previously, a null symlink object ID would crash difftool.
Handle null object IDs as unknown content that must be read from
the worktree.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>