The split-index mode had a few corner case bugs fixed.
* tg/split-index-fixes:
travis: run tests with GIT_TEST_SPLIT_INDEX
split-index: don't write cache tree with null oid entries
read-cache: fix reading the shared index for other repos
Crash fix for a corner case where an error codepath tried to unlock
what it did not acquire lock on.
* mr/packed-ref-store-fix:
files_initial_transaction_commit(): only unlock if locked
The http tracing code, often used to debug connection issues,
learned to redact potentially sensitive information from its output
so that it can be more safely sharable.
* jt/http-redact-cookies:
http: support omitting data from traces
http: support cookie redaction when tracing
Avoid showing a warning message in the middle of a line of "git
diff" output.
* nd/diff-flush-before-warning:
diff.c: flush stdout before printing rename warnings
Build the executable in 'script' phase in Travis CI integration, to
follow the established practice, rather than during 'before_script'
phase. This allows the CI categorize the failures better ('failed'
is project's fault, 'errored' is build environment's).
* sg/travis-build-during-script-phase:
travis-ci: build Git during the 'script' phase
Fix for a commented-out code to adjust it to a rather old API change.
* ys/bisect-object-id-missing-conversion-fix:
bisect: debug: convert struct object to object_id
When resetting the working tree files recursively, the working tree
of submodules are now also reset to match.
* sb/submodule-update-reset-fix:
submodule: submodule_move_head omits old argument in forced case
unpack-trees: oneway_merge to update submodules
t/lib-submodule-update.sh: fix test ignoring ignored files in submodules
t/lib-submodule-update.sh: clarify test
"git commit --fixup" did not allow "-m<message>" option to be used
at the same time; allow it to annotate resulting commit with more
text.
* ab/commit-m-with-fixup:
commit: add support for --fixup <commit> -m"<extra message>"
commit doc: document that -c, -C, -F and --fixup with -m error
"git status" after moving a path in the working tree (hence making
it appear "removed") and then adding with the -N option (hence
making that appear "added") detected it as a rename, but did not
report the old and new pathnames correctly.
* nd/ita-wt-renames-in-status:
wt-status.c: handle worktree renames
wt-status.c: rename rename-related fields in wt_status_change_data
wt-status.c: catch unhandled diff status codes
wt-status.c: coding style fix
Use DIFF_DETECT_RENAME for detect_rename assignments
t2203: test status output with porcelain v2 format
"git add -p" was taught to ignore local changes to submodules as
they do not interfere with the partial addition of regular changes
anyway.
* nd/add-i-ignore-submodules:
add--interactive: ignore submodule changes except HEAD
"git stash -- <pathspec>" incorrectly blew away untracked files in
the directory that matched the pathspec, which has been corrected.
* tg/stash-with-pathspec-fix:
stash: don't delete untracked files that match pathspec
"git clone $there $here" is allowed even when here directory exists
as long as it is an empty directory, but the command incorrectly
removed it upon a failure of the operation.
* jk/abort-clone-with-existing-dest:
clone: do not clean up directories we didn't create
clone: factor out dir_exists() helper
t5600: modernize style
t5600: fix outdated comment about unborn HEAD
"git merge -Xours/-Xtheirs" learned to use our/their version when
resolving a conflicting updates to a symbolic link.
* jc/merge-symlink-ours-theirs:
merge: teach -Xours/-Xtheirs to symbolic link merge
API clean-up around revision traversal.
* rs/lose-leak-pending:
commit: remove unused function clear_commit_marks_for_object_array()
revision: remove the unused flag leak_pending
checkout: avoid using the rev_info flag leak_pending
bundle: avoid using the rev_info flag leak_pending
bisect: avoid using the rev_info flag leak_pending
object: add clear_commit_marks_all()
ref-filter: use clear_commit_marks_many() in do_merge_filter()
commit: use clear_commit_marks_many() in remove_redundant()
commit: avoid allocation in clear_commit_marks_many()
"git svn dcommit" did not take into account the fact that a
svn+ssh:// URL with a username@ (typically used for pushing) refers
to the same SVN repository without the username@ and failed when
svn.pushmergeinfo option is set.
* jm/svn-pushmergeinfo-fix:
git-svn: fix svn.pushmergeinfo handling of svn+ssh usernames.
An old regression in "git describe --all $annotated_tag^0" has been
fixed.
* dk/describe-all-output-fix:
describe: prepend "tags/" when describing tags with embedded name
When git-daemon gets a pktline request, we strip off any
trailing newline, replacing it with a NUL. Clients prior to
5ad312bede (in git v1.4.0) would send:
git-upload-pack repo.git\n
and we need to strip it off to understand their request.
After 5ad312bede, we send the host attribute but no newline,
like:
git-upload-pack repo.git\0host=example.com\0
Both of these are parsed correctly by git-daemon. But if
some client were to combine the two:
git-upload-pack repo.git\n\0host=example.com\0
we don't parse it correctly. The problem is that we use the
"len" variable to record the position of the NUL separator,
but then decrement it when we strip the newline. So we start
with:
git-upload-pack repo.git\n\0host=example.com\0
^-- len
and end up with:
git-upload-pack repo.git\0\0host=example.com\0
^-- len
This is arguably correct, since "len" tells us the length of
the initial string, but we don't actually use it for that.
What we do use it for is finding the offset of the extended
attributes; they used to be at len+1, but are now at len+2.
We can solve that by just leaving "len" where it is. We
don't have to care about the length of the shortened string,
since we just treat it like a C string.
No version of Git ever produced such a string, but it seems
like the daemon code meant to handle this case (and it seems
like a reasonable thing for somebody to do in a 3rd-party
implementation).
Reported-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of our git-protocol tests rely on invoking the client
and having it make a request of a server. That gives a nice
real-world test of how the two behave together, but it
doesn't leave any room for testing how a server might react
to _other_ clients.
Let's add a few test helper functions which can be used to
manually conduct a git-protocol conversation with a remote
git-daemon:
1. To connect to a remote git-daemon, we need something
like "netcat". But not everybody will have netcat. And
even if they do, the behavior with respect to
half-duplex shutdowns is not portable (openbsd netcat
has "-N", with others you must rely on "-q 1", which is
racy).
Here we provide a "fake_nc" that is capable of doing
a client-side netcat, with sane half-duplex semantics.
It relies on perl's IO::Socket::INET. That's been in
the base distribution since 5.6.0, so it's probably
available everywhere. But just to be on the safe side,
we'll add a prereq.
2. To help tests speak and read pktline, this patch adds
packetize() and depacketize() functions.
I've put fake_nc() into lib-git-daemon.sh, since that's
really the only server where we'd need to use a network
socket. Whereas the pktline helpers may be of more general
use, so I've added them to test-lib-functions.sh. Programs
like upload-pack speak pktline, but can talk directly over
stdio without a network socket.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we receive a request with extended attributes after the
NUL, we try to write those attributes to the log. We do so
with a "%s" format specifier, which will only show
characters up to the first NUL.
That's enough for printing a "host=" specifier. But since
dfe422d04d (daemon: recognize hidden request arguments,
2017-10-16) we may have another NUL, followed by protocol
parameters, and those are not logged at all.
Let's cut out the attempt to show the whole string, and
instead log when we parse individual attributes. We could
leave the "extended attributes (%d bytes) exist" part of the
log, which in theory could alert us to attributes that fail
to parse. But anything we don't parse as a "host=" parameter
gets blindly added to the "protocol" attribute, so we'd see
it in that part of the log.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If receive a request like:
git-upload-pack /foo.git\0host=localhost
we mark the offset of the NUL byte as "len", and then log
the bytes after the NUL with a "%.*s" placeholder, using
"pktlen - len" as the length, and "line + len + 1" as the
start of the string.
This is off-by-one, since the start of the string skips past
the separating NUL byte, but the adjusted length includes
it. Fortunately this doesn't actually read past the end of
the buffer, since "%.*s" will stop when it hits a NUL. And
regardless of what is in the buffer, packet_read() will
always add an extra NUL terminator for safety.
As an aside, the git.git client sends an extra NUL after a
"host" field, too, so we'd generally hit that one first, not
the one added by packet_read(). You can see this in the test
output which reports 15 bytes, even though the string has
only 14 bytes of visible data. But the point is that even a
client sending unusual data could not get us to read past
the end of the buffer, so this is purely a cosmetic fix.
Reported-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we start git-daemon for our tests, we send its stderr
log stream to a named pipe. We synchronously read the first
line to make sure that the daemon started, and then dump the
rest to descriptor 4. This is handy for debugging test
output with "--verbose", but the tests themselves can't
access the log data.
Let's dump the log into a file, as well, so that future
tests can check the log. There are a few subtleties worth
calling out here:
- we'll continue to send output to descriptor 4 for
viewing/debugging, which would imply swapping out "cat"
for "tee". But we want to ensure that there's no
buffering, and "tee" doesn't have a standard way to
ask for that. So we'll use a shell loop around "read"
and "printf" instead. That ensures that after a request
has been served, the matching log entries will have made
it to the file.
- the existing first-line shell loop used read/echo. We'll
switch to consistently using "read -r" and "printf" to
relay data as faithfully as possible.
- we open the logfile for append, rather than just output.
That makes it OK for tests to truncate the logfile
without restarting the daemon (the OS will atomically
seek to the end of the file when outputting each line).
That allows tests to look at the log without worrying
about pollution from earlier tests.
Helped-by: Lucas Werkmeister <mail@lucaswerkmeister.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't actually care about the clone operation here; we
just want to know if we were able to actually contact the
remote repository. Using ls-remote does that more
efficiently, and without us having to worry about managing
the tmp.git directory.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A recently introduced regression caused a segfault at clone time on
case-insensitive filesystems when filenames differing only in case are
present. This bug has already been fixed (repository: pre-initialize
hash algo pointer, 2018-01-18), but it's not the first time similar
problems have arisen. Therefore, introduce a test to catch this case and
protect against future regressions.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are various git subcommands (among them, clone) which don't set up
the repository (that is, they lack RUN_SETUP or RUN_SETUP_GENTLY) but
end up needing to have information about the hash algorithm in use.
Because the hash algorithm is part of struct repository and it's only
initialized in repository setup, we can end up dereferencing a NULL
pointer in some cases if we call one of these subcommands and look up
the empty blob or empty tree values.
A "git clone" of a project that has two paths that differ only in
case suffers from this if it is run on a case insensitive platform.
When the command attempts to check out one of these two paths after
checking out the other one, the checkout codepath needs to see if
the version that is already on the filesystem (which should not
happen if the FS were case sensitive) is dirty, and it needs to
exercise the hashing code at that point.
In the future, we can add a command line option for this or read it
from the configuration, but until we're ready to expose that
functionality to the user, simply initialize the repository
structure to use the current hash algorithm, SHA-1.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running git clone --single-branch --mirror -b TAGNAME previously
triggered the following error message:
fatal: multiple updates for ref 'refs/tags/TAGNAME' not allowed.
This error condition is handled in files_initial_transaction_commit().
42c7f7ff9 ("commit_packed_refs(): remove call to `packed_refs_unlock()`", 2017-06-23)
introduced incorrect unlocking in the error path of this function,
which changes the error message to
fatal: BUG: packed_refs_unlock() called when not locked
Move the call to packed_refs_unlock() above the "cleanup:" label
since the unlocking should only be done in the last error path.
Signed-off-by: Mathias Rav <m@git.strova.dk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GIT_TRACE_CURL provides a way to debug what is being sent and received
over HTTP, with automatic redaction of sensitive information. But it
also logs data transmissions, which significantly increases the log file
size, sometimes unnecessarily. Add an option "GIT_TRACE_CURL_NO_DATA" to
allow the user to omit such data transmissions.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using GIT_TRACE_CURL, Git already redacts the "Authorization:" and
"Proxy-Authorization:" HTTP headers. Extend this redaction to a
user-specified list of cookies, specified through the
"GIT_REDACT_COOKIES" environment variable.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split index mode only has a few dedicated tests, but as the index is
involved in nearly every git operation, this doesn't quite cover all the
ways repositories with split index can break. To use split index mode
throughout the test suite a GIT_TEST_SPLIT_INDEX environment variable
can be set, which makes git split the index at random and thus
excercises the functionality much more thoroughly.
As this is not turned on by default, it is not executed nearly as often
as the test suite is run, so occationally breakages slip through. Try
to counteract that by running the test suite with GIT_TEST_SPLIT_INDEX
mode turned on on travis.
To avoid using too many cycles on travis only run split index mode in
the linux-gcc target only. The Linux build was chosen over the Mac OS
builds because it tends to be much faster to complete.
The linux gcc build was chosen over the linux clang build because the
linux clang build is the fastest build, so it can serve as an early
indicator if something is broken and we want to avoid spending the extra
cycles of running the test suite twice for that.
Helped-by: Lars Schneider <larsxschneider@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a96d3cc3f6 ("cache-tree: reject entries with null sha1", 2017-04-21)
we made sure that broken cache entries do not get propagated to new
trees. Part of that was making sure not to re-use an existing cache
tree that includes a null oid.
It did so by dropping the cache tree in 'do_write_index()' if one of
the entries contains a null oid. In split index mode however, there
are two invocations to 'do_write_index()', one for the shared index
and one for the split index. The cache tree is only written once, to
the split index.
As we only loop through the elements that are effectively being
written by the current invocation, that may not include the entry with
a null oid in the split index (when it is already written to the
shared index), where we write the cache tree. Therefore in split
index mode we may still end up writing the cache tree, even though
there is an entry with a null oid in the index.
Fix this by checking for null oids in prepare_to_write_split_index,
where we loop the entries of the shared index as well as the entries for
the split index.
This fixes t7009 with GIT_TEST_SPLIT_INDEX. Also add a new test that's
more specifically showing the problem.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
read_index_from() takes a path argument for the location of the index
file. For reading the shared index in split index mode however it just
ignores that path argument, and reads it from the gitdir of the current
repository.
This works as long as an index in the_repository is read. Once that
changes, such as when we read the index of a submodule, or of a
different working tree than the current one, the gitdir of
the_repository will no longer contain the appropriate shared index,
and git will fail to read it.
For example t3007-ls-files-recurse-submodules.sh was broken with
GIT_TEST_SPLIT_INDEX set in 188dce131f ("ls-files: use repository
object", 2017-06-22), and t7814-grep-recurse-submodules.sh was also
broken in a similar manner, probably by introducing struct repository
there, although I didn't track down the exact commit for that.
be489d02d2 ("revision.c: --indexed-objects add objects from all
worktrees", 2017-08-23) breaks with split index mode in a similar
manner, not erroring out when it can't read the index, but instead
carrying on with pruning, without taking the index of the worktree into
account.
Fix this by passing an additional gitdir parameter to read_index_from,
to indicate where it should look for and read the shared index from.
read_cache_from() defaults to using the gitdir of the_repository. As it
is mostly a convenience macro, having to pass get_git_dir() for every
call seems overkill, and if necessary users can have more control by
using read_index_from().
Helped-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The diff output is buffered in a FILE object and could still be
partially buffered when we print these warnings (directly to fd 2).
The output is messed up like this
worktree.c | 138 +-
worktree.h warning: inexact rename detection was skipped due to too many files.
| 12 +-
wrapper.c | 83 +-
It gets worse if the warning is printed after color codes for the graph
part are already printed. You'll get a warning in green or red.
Flush stdout first, so we can get something like this instead:
xdiff/xutils.c | 42 +-
xdiff/xutils.h | 4 +-
1033 files changed, 150824 insertions(+), 69395 deletions(-)
warning: inexact rename detection was skipped due to too many files.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For 'add -i' and 'add -p', the only action we can take on a dirty
submodule entry is update the index with a new value from its HEAD. The
content changes inside (from its own index, untracked files...) do not
matter, at least until 'git add -i' learns about launching a new
interactive add session inside a submodule.
Ignore all other submodules changes except HEAD. This reduces the number
of entries the user has to check through in 'git add -i', and the number
of 'no' they have to answer to 'git add -p' when dirty submodules are
present.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>