Because git's object format requires us to specify the
number of bytes in the object in its header, we must know
the size before streaming a blob into the object database.
This is not a problem when adding a regular file, as we can
get the size from stat(). However, when filters are in use
(such as autocrlf, or the ident, filter, or eol
gitattributes), we have no idea what the ultimate size will
be.
The current code just punts on the whole issue and ignores
filter configuration entirely for files larger than
core.bigfilethreshold. This can generate confusing results
if you use filters for large binary files, as the filter
will suddenly stop working as the file goes over a certain
size. Rather than try to handle unknown input sizes with
streaming, this patch just turns off the streaming
optimization when filters are in use.
This has a slight performance regression in a very specific
case: if you have autocrlf on, but no gitattributes, a large
binary file will avoid the streaming code path because we
don't know beforehand whether it will need conversion or
not. But if you are handling large binary files, you should
be marking them as such via attributes (or at least not
using autocrlf, and instead marking your text files as
such). And the flip side is that if you have a large
_non_-binary file, there is a correctness improvement;
before we did not apply the conversion at all.
The first half of the new t1051 script covers these failures
on input. The second half tests the matching output code
paths. These already work correctly, and do not need any
adjustment.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we call convert_to_git in dry-run mode, it may still
want to look at the source buffer, because some CRLF
conversion modes depend on analyzing the source to determine
whether it is in fact convertible CRLF text.
However, the main motivation for convert_to_git's dry-run
mode is that we would decide which method to use to acquire
the blob's data (streaming versus in-core). Requiring this
source analysis creates a chicken-and-egg problem. We are
better off simply guessing that anything we can't analyze
will end up needing conversion.
This patch lets a caller specify a NULL src buffer when
using dry-run mode (and only dry-run mode). A non-zero
return value goes from "we would convert" to "we might
convert"; a zero return value remains "we would definitely
not convert".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some callers may want to know whether convert_to_git will
actually do anything before performing the conversion
itself (e.g., to decide whether to stream or handle blobs
in-core). This patch lets callers specify the dry run mode
by passing a NULL destination buffer. The return value,
instead of indicating whether conversion happened, will
indicate whether conversion would occur.
For readability, we also include a wrapper function which
makes it more obvious we are not actually performing the
conversion.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test did not adhere to the current style on several counts:
. empty lines around the test blocks, but within the test string
. ': > file' or even just '> file' with an extra space
. inconsistent indentation
. hand-rolled commits instead of using test_commit
Fix all of them.
There's a catch to the last point: test_commit creates a tag, which the
original test did not create. We still change it to test_commit, and
explicitly delete the tags, so as to highlight that the test relies on not
having them.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clarify strbuf_getline() documentation, and add the missing documentation
for strbuf_getwholeline() and strbuf_getwholeline_fd().
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was indeed not obvious for new contributors to find this document in
the source tree, since there were no reference to it outside the
Documentation/ directory.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was part of the "branch description" feature in the larger
"help people communicate better during their pull based workflow"
topic, but was never documented.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When git spawns a remote helper program (like git-remote-http),
the last thing we do before closing the pipe to the child
process is to send a blank line, telling the helper that we
are done issuing commands. However, the helper may already
have exited, in which case the parent git process will
receive SIGPIPE and die.
In particular, this can happen with the remote-curl helper
when it encounters errors during a push. The helper reports
individual errors for each ref back to git-push, and then
exits with a non-zero exit code. Depending on the exact
timing of the write, the parent process may or may not
receive SIGPIPE.
This causes intermittent test failure in t5541.8, and is a
side effect of 5238cbf (remote-curl: Fix push status report
when all branches fail). Before that commit, remote-curl
would not send the final blank line to indicate that the
list of status lines was complete; it would just exit,
closing the pipe. The parent git-push would notice the
closed pipe while reading the status report and exit
immediately itself, propagating the failing exit code. But
post-5238cbf, remote-curl completes the status list before
exiting, git-push actually runs to completion, and then it
tries to cleanly disconnect the helper, leading to the
SIGPIPE race above.
This patch drops all error-checking when sending the final
"we are about to hang up" blank line to helpers. There is
nothing useful for the parent process to do about errors at
that point anyway, and certainly failing to send our "we are
done with commands" line to a helper that has already exited
is not a problem.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first part of the bundle header contains the boundary commits, and
could be approximated by
# v2 git bundle
$(git rev-list --pretty=oneline --boundary <ARGS> | grep ^-)
git-bundle actually spawns exactly this rev-list invocation, and does
the grepping internally.
There was a subtle bug in the latter step: it used fgets() with a
1024-byte buffer. If the user has sufficiently long subjects (e.g.,
by not adhering to the git oneline-subject convention in the first
place), the 'oneline' format can easily overflow the buffer. fgets()
then returns the rest of the line in the next call(s). If one of
these remaining parts started with '-', git-bundle would mistakenly
insert it into the bundle thinking it was a boundary commit.
Fix it by using strbuf_getwholeline() instead, which handles arbitrary
line lengths correctly.
Note that on the receiving side in parse_bundle_header() we were
already using strbuf_getwholeline_fd(), so that part is safe.
Reported-by: Jannis Pohlmann <jannis.pohlmann@codethink.co.uk>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The comment even said that it should eventually go there. While at
it, match the calling convention and name of the function to the
strbuf_get*line family. So it now is strbuf_getwholeline_fd.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --ff-only option was not described next to --ff and --no-ff options in
"git merge" documentation, even though these three are logically together,
describing how to choose one of three possibilities.
Also the description for '--ff' and '--no-ff' discussed what '--ff' means,
and mentioned '--no-ff' as if it were a side-note to '--ff'.
Make them into three top-level entries and list them together. This way,
it would be more clear that the user can choose one from these three.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
CodingGuidlines confidently declares "We use tabs for indentation."
It would be a shame if it were caught lying.
Signed-off-by: Philip Jägenstedt <philip@foolip.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was out-of-sync with the reality of who works on this
script. Defer (silently) to Documentation/SubmittingPatches
like all other code.
Signed-off-by: Philip Jägenstedt <philip@foolip.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/grep-binary-attribute:
grep: pre-load userdiff drivers when threaded
grep: load file data after checking binary-ness
grep: respect diff attributes for binary-ness
grep: cache userdiff_driver in grep_source
grep: drop grep_buffer's "name" parameter
convert git-grep to use grep_source interface
grep: refactor the concept of "grep source" into an object
grep: move sha1-reading mutex into low-level code
grep: make locking flag global
* nd/find-pack-entry-recent-cache-invalidation:
find_pack_entry(): do not keep packed_git pointer locally
sha1_file.c: move the core logic of find_pack_entry() into fill_pack_entry()
* fc/zsh-completion:
completion: simplify __gitcomp and __gitcomp_nl implementations
completion: use ls -1 instead of rolling a loop to do that ourselves
completion: work around zsh option propagation bug
If a filter is not defined or if it fails, git should behave as if the
filter is a no-op passthru.
However, if the filter exits before reading all the content, depending on
the timing, git could be killed with SIGPIPE when it tries to write to the
pipe connected to the filter.
Ignore SIGPIPE while processing the filter to give us a chance to check
the return value from a failed write, in order to detect and act on this
mode of failure in a more controlled way.
Signed-off-by: Jehan Bing <jehan@orb.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the argument for `__git_ps1` begins with a dash, `printf` tries to
interpret it as an option which results in an error message.
The problem is solved by adding '--' before the argument to tell
`printf` to not interpret the following argument as an option.
Adding '--' directly to the argument does not help because the argument
is enclosed by double quotes.
Signed-off-by: Christian Hammerl <info@christian-hammerl.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current configure script uses -lintl if gettext is not found in the C
library, but does so before checking if there is libintl.h available in
the first place, in which case we would later define NO_GETTEXT.
Instead, check for the existence of libintl.h first. Only when libintl.h
exists and libintl is not in libc, ask for -lintl.
Signed-off-by: John Szakmeister <john@szakmeister.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The canonical order of command line arguments is always to have dashed
commands before other parameters, but the "git remote set-branches"
subcommand was described to take "name" before an optional "--add".
Signed-off-by: Philip Jägenstedt <philip@foolip.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit ff7f218 (gitweb: Fix file links in "grep" search, 2012-01-05),
added $file_href variable, to reduce duplication and have the fix
applied in single place.
Unfortunately it made variable defined inside the loop, not taking into
account the fact that $file_href was set only if file changed.
Therefore for files with multiple matches $file_href was undefined for
second and subsequent matches.
Fix this bug by moving $file_href declaration outside loop.
Adds tests for almost all forms of sarch in gitweb, which were missing
from testuite. Note that it only tests if there are no warnings, and
it doesn't check that gitweb finds what it should find.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running "git add --refresh <pathspec>", we incorrectly showed the
path that is unmerged even if it is outside the specified pathspec, even
though we did honor pathspec and refreshed only the paths that matched.
Note that this cange does not affect "git update-index --refresh"; for
hysterical raisins, it does not take a pathspec (it takes real paths) and
more importantly itss command line options are parsed and executed one by
one as they are encountered, so "git update-index --refresh foo" means
"first refresh the index, and then update the entry 'foo' by hashing the
contents in file 'foo'", not "refresh only entry 'foo'".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a repository whose HEAD points to an unborn branch with no commits,
"heads" view and "summary" view (which shows what is shown in "heads"
view) compared the object names of commits at the tip of branches with the
output from "git rev-parse HEAD", which caused comparison of a string with
undef and resulted in a warning in the server log.
This can happen if non-bare repository (with default 'master' branch)
is updated not via committing but by other means like push to it, or
Gerrit. It can happen also just after running "git checkout --orphan
<new branch>" but before creating any new commit on this branch.
Rewrite the comparison so that it also works when $head points at nothing;
in such a case, no branch can be "the current branch", add a test for it.
While at it, rename local variable $head to $head_at, as it points to
current commit rather than current branch name (HEAD contents).
The code still incorrectly shows all branches that point at the same
commit as what HEAD points as "the current branch", even when HEAD is
detached. Fixing this bug is outside the scope of this patch.
Reported-by: Rajesh Boyapati
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The line was extended in 2dd8c3 ('git: add --info-path and --man-path
options'), and the formatted man output stopped fitting into the 80
column window.
Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When commit 3ed74e6 (diff --stat: ensure at least one '-' for deletions,
and one '+' for additions, 2006-09-28) improved the output for files with
tiny modifications, we accidentally broke the logic to ensure that two
equal sized changes are shown with the bars of the same length, even when
rounding errors exist.
Compute the length of the graph bars, using the same "non-zero changes is
shown with at least one column" scaling logic, but by scaling the sum of
additions and deletions to come up with the total length of the bar (this
ensures that two equal sized changes result in bars of the same length),
and then scaling the smaller of the additions or deletions. The other side
is computed as the difference between the two.
This makes the apportioning between additions and deletions less accurate
due to rounding errors, but it is much less noticeable than two files with
the same amount of change showing bars of different length.
Signed-off-by: Junio C Hamano <gitster@pobox.com>