Commit 9472935d81 (add: introduce "--renormalize", 2017-11-16) taught
git-add to pass HASH_RENORMALIZE to add_to_index(), which then passes
the flag along to index_path(). However, the flags taken by
add_to_index() and the ones taken by index_path() are distinct
namespaces. We cannot take HASH_* flags in add_to_index(), because they
overlap with the ADD_CACHE_* flags we already take (in this case,
HASH_RENORMALIZE conflicts with ADD_CACHE_IGNORE_ERRORS).
We can solve this by adding a new ADD_CACHE_RENORMALIZE flag, and using
it to set HASH_RENORMALIZE within add_to_index(). In order to make it
clear that these two flags come from distinct sets, let's also change
the name "newflags" in the function to "hash_flags".
Reported-by: Dmitriy Smirnov <dmitriy.smirnov@jetbrains.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When Git determines whether a file has changed, it looks at the mtime,
at the file size, and to detect changes even if the mtime is the same
(on Windows, the mtime granularity is 100ns, read: if two files are
written within the same 100ns time slot, they have the same mtime) and
even if the file size is the same, Git also looks at the inode/device
numbers.
This design obviously comes from a Linux background, where `lstat()`
calls were designed to be cheap.
On Windows, there is no `lstat()`. It has to be emulated. And while
obtaining the mtime and the file size is not all that expensive (you can
get both with a single `GetFileAttributesW()` call), obtaining the
equivalent of the inode and device numbers is very expensive (it
requires a call to `GetFileInformationByHandle()`, which in turn
requires a file handle, which is *a lot* more expensive than one might
imagine).
As it is very uncommon for developers to modify files within 100ns time
slots, Git for Windows chooses not to fill inode/device numbers
properly, but simply sets them to 0.
However, in t6042 the files file_v1 and file_v2 are typically written
within the same 100ns time slot, and they do not differ in file size. So
the minor modification is not picked up.
Let's work around this issue by avoiding the `git mv` calls in the
'mod6-setup: chains of rename/rename(1to2) and rename/rename(2to1)' test
case. The target files are overwritten anyway, so it is not like we
really rename those files. This fixes the issue because `git add` will
now add the files as new files (as opposed to existing, just renamed
files).
Functionally, we do not change anything because we replace two `git mv
<old> <new>` calls (where `<new>` is completely overwritten and `git
add`ed later anyway) by `git rm <old>` calls (removing other files, too,
that are also completely overwritten and `git add`ed later).
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Define a GIT_TEST_SIDEBAND_ALL environment variable meant to be used
from tests. When set to true, this overrides uploadpack.allowsidebandall
to true, allowing the entire test suite to be run as if this
configuration is in place for all repositories.
As of this patch, all tests pass whether GIT_TEST_SIDEBAND_ALL is unset
or set to 1.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, a response to a fetch request has sideband support only while
the packfile is being sent, meaning that the server cannot send notices
until the start of the packfile.
Extend sideband support in protocol v2 fetch responses to the whole
response. upload-pack will advertise it if the
uploadpack.allowsidebandall configuration variable is set, and
fetch-pack will automatically request it if advertised.
If the sideband is to be used throughout the whole response, upload-pack
will use it to send errors instead of prefixing a PKT-LINE payload with
"ERR ".
This will be tested in a subsequent patch.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A subsequent patch will teach struct packet_reader a new field that, if
set, instructs it to interpret read data as multiplexed. This will
create a dependency from pkt-line to sideband.
To avoid a circular dependency, split recv_sideband() into 2 parts: the
reading loop (left in recv_sideband()) and the processing of the
contents (in demultiplex_sideband()), and move the former into pkt-line.
This reverses the direction of dependency: sideband no longer depends on
pkt-line, and pkt-line now depends on sideband.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GCC 8 introduced the new -Wcast-function-type warning, which is
implied by -Wextra (which, in turn is enabled in our DEVELOPER flags).
When building Git with GCC 8 and this warning enabled on a non-glibc
platform [1], one is greeted with a screenful of compiler
warnings/errors:
compat/obstack.c: In function '_obstack_begin':
compat/obstack.c:162:17: error: cast between incompatible function types from 'void * (*)(long int)' to 'struct _obstack_chunk * (*)(void *, long int)' [-Werror=cast-function-type]
h->chunkfun = (struct _obstack_chunk * (*)(void *, long)) chunkfun;
^
compat/obstack.c:163:16: error: cast between incompatible function types from 'void (*)(void *)' to 'void (*)(void *, struct _obstack_chunk *)' [-Werror=cast-function-type]
h->freefun = (void (*) (void *, struct _obstack_chunk *)) freefun;
^
compat/obstack.c:116:8: error: cast between incompatible function types from 'struct _obstack_chunk * (*)(void *, long int)' to 'struct _obstack_chunk * (*)(long int)' [-Werror=cast-function-type]
: (*(struct _obstack_chunk *(*) (long)) (h)->chunkfun) ((size)))
^
compat/obstack.c:168:22: note: in expansion of macro 'CALL_CHUNKFUN'
chunk = h->chunk = CALL_CHUNKFUN (h, h -> chunk_size);
^~~~~~~~~~~~~
<snip>
'struct obstack' stores pointers to two functions to allocate and free
"chunks", and depending on how obstack is used, these functions take
either one parameter (like standard malloc() and free() do; this is
how we use it in 'kwset.c') or two parameters. Presumably to reduce
memory footprint, a single field is used to store the function pointer
for both signatures, and then it's casted to the appropriate signature
when the function pointer is accessed. These casts between function
pointers with different number of parameters are what trigger those
compiler errors.
Modify 'struct obstack' to use unions to store function pointers with
different signatures, and then use the union member with the
appropriate signature when accessing these function pointers. This
eliminates the need for those casts, and thus avoids this compiler
error.
[1] Compiling 'compat/obstack.c' on a platform with glibc is sort of
a noop, see the comment before '# define ELIDE_CODE', so this is
not an issue on common Linux distros.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It looks like it is a copy-paste error made in 80f2a6097c
(t/helper: add test-ref-store to test ref-store functions,
2017-03-26) to pass "old-sha1" instead of "new-sha1" to
notnull() when we get the new sha1 argument from
const char **argv.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The comment explaining how to build the fuzzers was broken in
927c77e7d4 ("Makefile: use FUZZ_CXXFLAGS for linking fuzzers",
2018-11-14).
When building fuzzers, all .c files must be compiled with coverage
tracing enabled. This is not possible when using only FUZZ_CXXFLAGS, as
that flag is only applied to the fuzzers themselves. Switching back to
CFLAGS fixes the issue.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fuzz-commit-graph identified a case where Git will read past the end of
a buffer containing a commit graph if the graph's header has an
incorrect chunk count. A simple bounds check in parse_commit_graph()
prevents this.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Break load_commit_graph_one() into a new function, parse_commit_graph().
The latter function operates on arbitrary buffers, which makes it
suitable as a fuzzing target. Since parse_commit_graph() is only called
by load_commit_graph_one() (and the fuzzer described below), we omit
error messages that would be duplicated by the caller.
Adds fuzz-commit-graph.c, which provides a fuzzing entry point
compatible with libFuzzer (and possibly other fuzzing engines).
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When communicating with a remote server or a subprocess, use
expanded numbers rather than numbers with scaling suffix in the
object filter spec (e.g. "limit:blob=1k" becomes
"limit:blob=1024").
Update the protocol docs to note that clients should always perform this
expansion, to allow for more compatibility between server
implementations.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a tree has already been recorded as omitted, we don't need to
traverse it again just to collect its omits. Stop traversing trees a
second time when collecting omits.
Signed-off-by: Matthew DeVore <matvore@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement positive values for <depth> in the tree:<depth> filter. The
exact semantics are described in Documentation/rev-list-options.txt.
The long-term goal at the end of this is to allow a partial clone to
eagerly fetch an entire directory of files by fetching a tree and
specifying <depth>=1. This, for instance, would make a build operation
fast and convenient. It is fast because the partial clone does not need
to fetch each file individually, and convenient because the user does
not need to supply a sparse-checkout specification.
Another way of considering this feature is as a way to reduce
round-trips, since the client can get any number of levels of
directories in a single request, rather than wait for each level of tree
objects to come back, whose entries are used to construct a new request.
Signed-off-by: Matthew DeVore <matvore@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A future patch will allow the client to request multiplexing of the
entire fetch response (and not only during packfile transmission), which
in turn allows the server to send progress and keepalive messages at any
time during the response.
It will be convenient for a future patch if writing options
(specifically, whether the written data is to be multiplexed) could be
controlled from a single place, so create struct packet_writer to serve
as that place, and modify upload-pack to use it.
Currently, it only stores the output fd, but a subsequent patch will (as
described above) introduce an option to determine if the written data is
to be multiplexed.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If there is already a .git/ADD_EDIT.patch file, we fail to truncate it
properly, which could result in very funny errors.
Of course, this file should not be left lying around. But at least in
one case, there was a stale copy, larger than the current diff. So the
result was a corrupt diff.
Let's just truncate the file when we write it and not worry about it too
much.
Reported by J Wyman.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are some situations in which we want to store an object ID into
struct object_id without the_hash_algo necessarily being set correctly.
One such case is when cloning a repository, where we must read refs from
the remote side without having a repository from which to read the
preferred algorithm.
In this cases, we may have the_hash_algo set to SHA-1, which is the
default, but read refs into struct object_id that are SHA-256. When
copying these values, we will want to copy them completely, not just the
first 20 bytes. Consequently, make sure that oidcpy copies the maximum
number of bytes at all times, regardless of the setting of
the_hash_algo.
Since oidcpy and hashcpy are no longer functionally identical, remove
the Cocinelle object_id transformations that convert from one into the
other.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing a tree, we read the object ID directly out of the tree
buffer. This is normally fine, but such an object ID cannot be used with
oidcpy, which copies GIT_MAX_RAWSZ bytes, because if we are using SHA-1,
there may not be that many bytes to copy.
Instead, store the object ID in a separate struct member. Since we can
no longer efficiently compute the path length, store that information as
well in struct name_entry. Ensure we only copy the object ID into the
new buffer if the path length is nonzero, as some callers will pass us
an empty path with no object ID following it, and we will not want to
read past the end of the buffer.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we splice trees together, we operate in place on the tree buffer.
If we're using SHA-1 for the hash algorithm, we may not have a full
GIT_MAX_RAWSZ (32) bytes to copy. Consequently, it doesn't logically
make sense for us to use a struct object_id to represent this type,
since it isn't a complete object.
Represent this value as a unsigned char pointer instead and copy it when
necessary.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, the struct object_id pointer returned from tree_entry_extract
lives directly inside the parsed tree buffer. In a future commit, this
will change so that it instead points to a dedicated struct member.
Since in this code path, we want to modify the buffer directly, compute
the buffer offset we want to modify by using the pointer to the path
instead.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future commit, the pointer returned by tree_entry_extract will
point into the struct tree_desc, causing its lifetime to be bound to
that of the struct tree_desc itself. To ensure this code path keeps
working, copy the object_id into a local variable so that it lives long
enough.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git log -G<regex>" looked for a hunk in the "git log -p" patch
output that contained a string that matches the given pattern.
Optimize this code to ignore binary files, which by default will
not show any hunk that would match any pattern (unless textconv or
the --text option is in effect, that is).
* tb/log-G-binary:
log -G: ignore binary files
Lines that begin with a certain keyword that come over the wire, as
well as lines that consist only of one of these keywords, ought to
be painted in color for easier eyeballing, but the latter was
broken ever since the feature was introduced in 2.19, which has
been corrected.
* hn/highlight-sideband-keywords:
sideband: color lines with keyword only
"git checkout [<tree-ish>] path..." learned to report the number of
paths that have been checked out of the index or the tree-ish,
which gives it the same degree of noisy-ness as the case in which
the command checks out a branch.
* nd/checkout-noisy:
t0027: squelch checkout path run outside test_expect_* block
checkout: print something when checking out paths
The traversal over tree objects has learned to honor
":(attr:label)" pathspec match, which has been implemented only for
enumerating paths on the filesystem.
* nd/attr-pathspec-in-tree-walk:
tree-walk: support :(attr) matching
dir.c: move, rename and export match_attrs()
pathspec.h: clean up "extern" in function declarations
tree-walk.c: make tree_entry_interesting() take an index
tree.c: make read_tree*() take 'struct repository *'
"git rev-list --exclude-promisor-objects" had to take an object
that does not exist locally (and is lazily available) from the
command line without barfing, but the code dereferenced NULL.
* md/list-lazy-objects-fix:
list-objects.c: don't segfault for missing cmdline objects
This kills the_index dependency in get_oid_with_context() but for
get_oid() and friends, they still assume the_repository (which also
means the_index).
Unfortunately the widespread use of get_oid() will make it hard to
make the conversion now. We probably will add repo_get_oid() at some
point and limit the use of get_oid() in builtin/ instead of forcing
all get_oid() call sites to carry struct repository.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
read_index() shares the same problem as hold_locked_index(): it
assumes $GIT_DIR/index. Move all call sites to repo_read_index()
instead. read_index_preload() and read_index_unmerged() are also
killed as a consequence.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
hold_locked_index() assumes the index path at $GIT_DIR/index. This is
not good for places that take an arbitrary index_state instead of
the_index, which is basically everywhere except builtin/.
Replace it with repo_hold_locked_index(). hold_locked_index() remains
as a wrapper around repo_hold_locked_index() to reduce changes in builtin/
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>