Allow "git interpret-trailers" to run outside of a Git repository.
* jk/interpret-trailers-outside-a-repository:
interpret-trailers: allow running outside a repository
Update "git subtree" (in contrib/) so that it can take whitespaces
in the pathnames, not only in the in-tree pathname but the name of
the directory that the repository is in.
* as/subtree-with-spaces:
contrib/subtree: respect spaces in a repository path
t7900-subtree: test the "space in a subdirectory name" case
The ssh transport, just like any other transport over the network,
did not clear GIT_* environment variables, but it is possible to
use SendEnv and AcceptEnv to leak them to the remote invocation of
Git, which is not a good idea at all. Explicitly clear them just
like we do for the local transport.
* jk/connect-clear-env:
git_connect: clarify conn->use_shell flag
git_connect: clear GIT_* environment for ssh
"git log --date=local" used to only show the normal (default)
format in the local timezone. The command learned to take 'local'
as an instruction to use the local timezone with other formats,
e.g. "git show --date=rfc-local".
* jk/date-local:
t6300: add tests for "-local" date formats
t6300: make UTC and local dates different
date: make "local" orthogonal to date format
date: check for "local" before anything else
t6300: add test for "raw" date format
t6300: introduce test_date() helper
fast-import: switch crash-report date to iso8601
Documentation/rev-list: don't list date formats
Documentation/git-for-each-ref: don't list date formats
Documentation/config: don't list date formats
Documentation/blame-options: don't list date formats
Move the refs used during a "git bisect" session to per-worktree
hierarchy refs/worktree/* so that independent bisect sessions can
be done in different worktrees.
* dt/refs-bisection:
refs: make refs/bisect/* per-worktree
path: optimize common dir checking
refs: clean up common_list
Users who are too busy to type three extra keystrokes to ask for
"git stash show -p" can now set stash.showPatch configuration
varible to true to always see the actual patch, not just the list
of paths affected with feel for the extent of damage via diffstat.
* nk/stash-show-config:
stash: allow "stash show" diff output configurable
The debugging infrastructure for pkt-line based communication has
been improved to mark the side-band communication specifically.
* jk/async-pkt-line:
pkt-line: show packets in async processes as "sideband"
run-command: provide in_async query function
"quiltimport" allows to specify the series file by honoring the
$QUILT_SERIES environment and also --series command line option.
* jh/quiltimport-explicit-series-file:
git-quiltimport: add commandline option --series <file>
Correct "git p4 --detect-labels" so that it does not fail to create
a tag that points at a commit that is also being imported.
* ld/p4-import-labels:
git-p4: fix P4 label import for unprocessed commits
git-p4: do not terminate creating tag for unknown commit
git-p4: failing test for ignoring invalid p4 labels
The use of 'good/bad' in "git bisect" made it confusing to use when
hunting for a state change that is not a regression (e.g. bugfix).
The command learned 'old/new' and then allows the end user to
say e.g. "bisect start --term-old=fast --term=new=slow" to find a
performance regression.
Michael's idea to make 'good/bad' more intelligent does have
certain attractiveness ($gname/272867), and makes some of the work
on this topic a moot point.
* ad/bisect-terms:
bisect: allow setting any user-specified in 'git bisect start'
bisect: add 'git bisect terms' to view the current terms
bisect: add the terms old/new
bisect: sanity check on terms
Code clean-up and minor fixes.
* jc/rerere: (21 commits)
rerere: un-nest merge() further
rerere: use "struct rerere_id" instead of "char *" for conflict ID
rerere: call conflict-ids IDs
rerere: further clarify do_rerere_one_path()
rerere: further de-dent do_plain_rerere()
rerere: refactor "replay" part of do_plain_rerere()
rerere: explain the remainder
rerere: explain "rerere forget" codepath
rerere: explain the primary codepath
rerere: explain MERGE_RR management helpers
rerere: fix benign off-by-one non-bug and clarify code
rerere: explain the rerere I/O abstraction
rerere: do not leak mmfile[] for a path with multiple stage #1 entries
rerere: stop looping unnecessarily
rerere: drop want_sp parameter from is_cmarker()
rerere: report autoupdated paths only after actually updating them
rerere: write out each record of MERGE_RR in one go
rerere: lift PATH_MAX limitation
rerere: plug conflict ID leaks
rerere: handle conflicts with multiple stage #1 entries
...
Some features from "git tag -l" and "git branch -l" have been made
available to "git for-each-ref" so that eventually the unified
implementation can be shared across all three, in a follow-up
series or two.
* kn/for-each-tag-branch:
for-each-ref: add '--contains' option
ref-filter: implement '--contains' option
parse-options.h: add macros for '--contains' option
parse-option: rename parse_opt_with_commit()
for-each-ref: add '--merged' and '--no-merged' options
ref-filter: implement '--merged' and '--no-merged' options
ref-filter: add parse_opt_merge_filter()
for-each-ref: add '--points-at' option
ref-filter: implement '--points-at' option
tag: libify parse_opt_points_at()
t6302: for-each-ref tests for ref-filter APIs
The manual size computations here are correct, but using
strip_suffix makes that obvious, and hopefully communicates
the intent of the code more clearly.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When working with paths in strbufs, we frequently want to
ensure that a directory contains a trailing slash before
appending to it. We can shorten this code (and make the
intent more obvious) by calling strbuf_complete.
Most of these cases are trivially identical conversions, but
there are two things to note:
- in a few cases we did not check that the strbuf is
non-empty (which would lead to an out-of-bounds memory
access). These were generally not triggerable in
practice, either from earlier assertions, or typically
because we would have just fed the strbuf to opendir(),
which would choke on an empty path.
- in a few cases we indexed the buffer with "original_len"
or similar, rather than the current sb->len, and it is
not immediately obvious from the diff that they are the
same. In all of these cases, I manually verified that
the strbuf does not change between the assignment and
the strbuf_complete call.
This does not convert cases which look like:
if (sb->len && !is_dir_sep(sb->buf[sb->len - 1]))
strbuf_addch(sb, '/');
as those are obviously semantically different. Some of these
cases arguably should be doing that, but that is out of
scope for this change, which aims purely for cleanup with no
behavior change (and at least it will make such sites easier
to find and examine in the future, as we can grep for
strbuf_complete).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 27e1e22 (prune: factor out loose-object directory
traversal, 2014-10-15), we now have a generic callback
system for iterating over the loose object directories. This
is used by prune, count-objects, etc.
We did not convert git-fsck at the time because it
implemented an inode-sorting scheme that was not part of the
generic code. Now that the inode-sorting code is gone, we
can reuse the generic code. The result is shorter,
hopefully more readable, and drops some unchecked sprintf
calls.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that fsck has dropped its inode-sorting, there are no
longer any users of this knob, and it can go away.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fsck tries to access loose objects in order of inode number,
with the hope that this would make cold cache access faster
on a spinning disk. This dates back to 7e8c174 (fsck-cache:
sort entries by inode number, 2005-05-02), which predates
the invention of packfiles.
These days, there's not much point in trying to optimize
cold cache for a large number of loose objects. You are much
better off to simply pack the objects, which will reduce the
disk footprint _and_ provide better locality of data access.
So while you can certainly construct pathological cases
where this code might help, it is not worth the trouble
anymore.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
strncpy is known to be a confusing function because of its
termination semantics. These calls are all correct, but it
takes some examination to see why. In particular, every one
of them expects to copy up to the length limit, and then
makes some arrangement for terminating the result.
We can just use memcpy, along with noting explicitly how the
result is terminated (if it is not already obvious). That
should make it more clear to a reader that we are doing the
right thing.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We know that a fanned-out sha1 in a notes tree cannot be
more than "aa/bb/cc/...", and we have an assert() to confirm
that. But let's factor out that length into a constant so we
can be sure it is used consistently. And even though we
assert() earlier, let's replace a strcpy with xsnprintf, so
it is clear to a reader that all cases are covered.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To set up default colors, we sometimes strcpy() from the
default string literals into our color buffers. This isn't a
bug (assuming the destination is COLOR_MAXLEN bytes), but
makes it harder to audit the code for problematic strcpy
calls.
Let's introduce a color_set which copies under the
assumption that there are COLOR_MAXLEN bytes in the
destination (of course you can call it on a smaller buffer,
so this isn't providing a huge amount of safety, but it's
more convenient than calling xsnprintf yourself).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we already know the length of a string (e.g., because
we just malloc'd to fit it), it's nicer to use memcpy than
strcpy, as it makes it more obvious that we are not going to
overflow the buffer (because the size we pass matches the
size in the allocation).
This also eliminates calls to strcpy, which make auditing
the code base harder.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are going to launch "/path/to/konqueror", we instead
rewrite this into "/path/to/kfmclient" by duplicating the
original string and writing over the ending bits. This can
be done more obviously with strip_suffix and xstrfmt.
Note that we also fix a subtle bug with the "filename"
parameter, which is passed as argv[0] to the child. If the
user has configured a program name with no directory
component, we always pass the string "kfmclient", even if
your program is called something else. But if you give a
full path, we give the basename of that path. But more
bizarrely, if we rewrite "konqueror" to "kfmclient", we
still pass "konqueror".
The history of this function doesn't reveal anything
interesting, so it looks like just an oversight from
combining the suffix-munging with the basename-finding.
Let's just call basename on the munged path, which produces
consistent results (if you gave a program, whether a full
path or not, we pass its basename).
Probably this doesn't matter at all in practice, but it
makes the code slightly less confusing to read.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To generate "--keep=receive-pack $pid on $host", we write
progressively into a single buffer, which requires keeping
track of how much we've written so far. But since the result
is destined to go into our argv array, we can simply use
argv_array_pushf.
Unfortunately we still have to have a fixed-size buffer for
the gethostname() call, but at least it now doesn't involve
any extra size computation. And as a bonus, we drop an
sprintf and a strcpy call.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are allocating a struct with a FLEX_ARRAY member, we
generally compute the size of the array and then sprintf or
strcpy into it. Normally we could improve a dynamic allocation
like this by using xstrfmt, but it doesn't work here; we
have to account for the size of the rest of the struct.
But we can improve things a bit by storing the length that
we use for the allocation, and then feeding it to xsnprintf
or memcpy, which makes it more obvious that we are not
writing more than the allocated number of bytes.
It would be nice if we had some kind of helper for
allocating generic flex arrays, but it doesn't work that
well:
- the call signature is a little bit unwieldy:
d = flex_struct(sizeof(*d), offsetof(d, path), fmt, ...);
You need offsetof here instead of just writing to the
end of the base size, because we don't know how the
struct is packed (partially this is because FLEX_ARRAY
might not be zero, though we can account for that; but
the size of the struct may actually be rounded up for
alignment, and we can't know that).
- some sites do clever things, like over-allocating because
they know they will write larger things into the buffer
later (e.g., struct packed_git here).
So we're better off to just write out each allocation (or
add type-specific helpers, though many of these are one-off
allocations anyway).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This saves us some manual computation, and eliminates a call
to strcpy.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our color parsing is designed to never exceed COLOR_MAXLEN
bytes. But the relationship between that hand-computed
number and the parsing code is not at all obvious, and we
merely hope that it has been computed correctly for all
cases.
Let's mark the expected "end" pointer for the destination
buffer and make sure that we do not exceed it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some cases where we strcpy() the result of sha1_to_hex(),
there's no need; the result goes directly into a printf
statement, and we can simply pass the return value from
sha1_to_hex() directly.
When this code was originally written, sha1_to_hex used a
single buffer, and it was not safe to use it twice within a
single expression. That changed as of dcb3450 (sha1_to_hex()
usage cleanup, 2006-05-03), but this code was never updated.
History-dug-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before sha1_to_hex_r() existed, a simple way to get hex
sha1 into a buffer was with:
strcpy(buf, sha1_to_hex(sha1));
This isn't wrong (assuming the buf is 41 characters), but it
makes auditing the code base for bad strcpy() calls harder,
as these become false positives.
Let's convert them to sha1_to_hex_r(), and likewise for
some calls to find_unique_abbrev(). While we're here, we'll
double-check that all of the buffers are correctly sized,
and use the more obvious GIT_SHA1_HEXSZ constant.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This avoids an ugly strcat into a fixed-size buffer. It's
not wrong (the buffer is plenty large enough for an IPv6
address plus some minor formatting), but it takes some
effort to verify that.
Unfortunately we are still stuck with some fixed-size
buffers to hold the output of inet_ntop. But at least we now
pass very easy-to-verify parameters, rather than doing a
manual computation to account for other data in the buffer.
As a side effect, this also fixes the case where we might
pass an uninitialized portbuf buffer through the
environment. This probably couldn't happen in practice, as
it would mean that addr->sa_family was neither AF_INET nor
AF_INET6 (and that is all we are listening on).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In addition to dropping the magic number for the fixed-size
argv, we can also drop a fixed-length buffer and some
strcpy's into it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This drops the magic number for the fixed-size argv arrays,
so we do not have to wonder if we are overflowing it. We can
also drop some confusing sha1_to_hex memory allocation
(which seems to predate the ring of buffers allowing
multiple calls), and get rid of an unchecked sprintf call.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This cleans up a magic number that must be kept in sync with
the rest of the code (the number of argv slots). It also
lets us drop some fixed buffers and an sprintf (since we
can now use argv_array_pushf).
We do still have to keep one fixed buffer for calling
gethostname, but at least now the size computations for it
are much simpler.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We dynamically allocate a buffer and then strcpy and strcat
into it. This isn't buggy, but we'd prefer to avoid these
suspicious functions.
This would be a good candidate for converstion to xstrfmt,
but we need to record the length for dealing with index
entries. A strbuf handles that for us.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When creating a loose object tempfile, we use a fixed
PATH_MAX-sized buffer, and strcpy directly into it. This
isn't buggy, because we do a rough check of the size, but
there's no verification that our guesstimate of the required
space is enough (in fact, it's several bytes too big for the
current naming scheme).
Let's switch to a strbuf, which makes this much easier to
verify. The allocation overhead should be negligible, since
we are replacing a static buffer with a static strbuf, and
we'll only need to allocate on the first call.
While we're here, we can also document a subtle interaction
with mkstemp that would be easy to overlook.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function strcpy's directly into a PATH_MAX-sized
buffer. There's only one caller, which feeds the git_dir into
it, so it's not easy to trigger in practice (even if you fed
a large $GIT_DIR through the environment or .git file, it
would have to actually exist and be accessible on the
filesystem to get to this point). We can fix it by moving to
a strbuf.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use two PATH_MAX-sized buffers to represent the repo
path, and must make sure not to overflow them. We do take
care to check the lengths, but the logic is rather hard to
follow, as we use several magic numbers (e.g., "PATH_MAX -
10"). And in fact you _can_ overflow the buffer if you have
a ".git" file with an extremely long path in it.
By switching to strbufs, these problems all go away. We do,
however, retain the check that the initial input we get is
no larger than PATH_MAX. This function is an entry point for
untrusted repo names from the network, and it's a good idea
to keep a sanity check (both to avoid allocating arbitrary
amounts of memory, and also as a layer of defense against
any downstream users of the names).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This would be a fairly routine use of xstrfmt, except that
we need to remember the length of the result to pass to
cache_name_pos. So just use a strbuf, which makes this
simple.
As a bonus, this gets rid of confusing references to
"pathlen+1". The "1" is for the trailing slash we added, but
that is automatically accounted for in the strbuf's len
parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We generate range strings like "1234abcd...5678efab" for use
in the the fetch and push status tables. We use fixed-size
buffers along with strcat to do so. These aren't buggy, as
our manual size computation is correct, but there's nothing
checking that this is so. Let's switch them to strbufs
instead, which are obviously correct, and make it easier to
audit the code base for problematic calls to strcat().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use manual computation and strcpy to allocate the "root"
variable. This would be much simpler using xstrfmt. But
since we store the length, too, we can just use a strbuf,
which handles that for us.
Note that we stop distinguishing between "no root" and
"empty root" in some cases, but that's OK; the results are
the same (e.g., inserting an empty string is a noop).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The init code predates strbufs, and uses PATH_MAX-sized
buffers along with many manual checks on intermediate sizes
(some of which make magic assumptions, such as that init
will not create a path inside .git longer than 50
characters).
We can simplify this greatly by using strbufs, which drops
some hard-to-verify strcpy calls in favor of git_path_buf.
While we're in the area, let's also convert existing calls
to git_path to the safer git_path_buf (our existing calls
were passed to pretty tame functions, and so were not a
problem, but it's easy to be consistent and safe here).
Note that we had an explicit test that "git init" rejects
long template directories. This comes from 32d1776 (init: Do
not segfault on big GIT_TEMPLATE_DIR environment variable,
2009-04-18). We can drop the test_must_fail here, as we now
accept this and need only confirm that we don't segfault,
which was the original point of the test.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are initializing a .git directory, we may call
probe_utf8_pathname_composition to detect utf8 mangling. We
pass in a path buffer for it to use, and it blindly
strcpy()s into it, not knowing whether the buffer is large
enough to hold the result or not.
In practice this isn't a big deal, because the buffer we
pass in already contains "$GIT_DIR/config", and we append
only a few extra bytes to it. But we can easily do the right
thing just by calling git_path_buf ourselves. Technically
this results in a different pathname (before we appended our
utf8 characters to the "config" path, and now they get their
own files in $GIT_DIR), but that should not matter for our
purposes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The result of iconv is assigned to a variable, but we never
use it (instead, we check errno and whether the function
consumed all bytes). Let's drop the assignment, as it
triggers gcc's -Wunused-but-set-variable.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add example implementation including test cases for the large file
system using Git LFS.
Pushing files to the Git LFS server is not tested.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Perforce repositories can contain large (binary) files. Migrating these
repositories to Git generates very large local clones. External storage
systems such as Git LFS [1], Git Fat [2], Git Media [3], git-annex [4]
try to address this problem.
Add a generic mechanism to detect large files based on extension,
uncompressed size, and/or compressed size.
[1] https://git-lfs.github.com/
[2] https://github.com/jedbrown/git-fat
[3] https://github.com/alebedev/git-media
[4] https://git-annex.branchable.com/
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Conflicts:
Documentation/git-p4.txt
git-p4.py
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-p4 will just halt if there is not enough disk space while
streaming content from P4 to Git. Add a check to ensure at least
4 times (arbitrarily chosen) the size of a streamed file is available.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a file is streamed from P4 to Git then the verbose mode prints
continuously the progress as percentage like this:
//depot/file.bin 20% (10 MB)
Upon completion the progress is overwritten with depot source, local
file and size like this:
//depot/file.bin --> local/file.bin (10 MB)
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a git config reader for integer variables. Please note that the
git config implementation automatically supports k, m, and g suffixes.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The functions "gitConfig" and "gitConfigBool" are almost identical.
Make "gitConfig" more generic by adding an optional type specifier.
Use the type specifier "--bool" with "gitConfig" to implement
"gitConfigBool. This prepares the implementation of other type
specifiers such as "--int".
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Acked-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>