It can sometimes be useful to examine all objects in the
repository. Normally this is done with "git rev-list --all
--objects", but:
1. That shows only reachable objects. You may want to look
at all available objects.
2. It's slow. We actually open each object to walk the
graph. If your operation is OK with seeing unreachable
objects, it's an order of magnitude faster to just
enumerate the loose directories and pack indices.
You can do this yourself using "ls" and "git show-index",
but it's non-obvious. This patch adds an option to
"cat-file --batch-check" to operate on all available
objects (rather than reading names from stdin).
This is based on a proposal by Charles Bailey to provide a
separate "git list-all-objects" command. That is more
orthogonal, as it splits enumerating the objects from
getting information about them. However, in practice you
will either:
a. Feed the list of objects directly into cat-file anyway,
so you can find out information about them. Keeping it
in a single process is more efficient.
b. Ask the listing process to start telling you more
information about the objects, in which case you will
reinvent cat-file's batch-check formatter.
Adding a cat-file option is simple and efficient. And if you
really do want just the object names, you can always do:
git cat-file --batch-check='%(objectname)' --batch-all-objects
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use a direct write() to output the results of --batch and
--batch-check. This is good for processes feeding the input
and reading the output interactively, but it introduces
measurable overhead if you do not want this feature. For
example, on linux.git:
$ git rev-list --objects --all | cut -d' ' -f1 >objects
$ time git cat-file --batch-check='%(objectsize)' \
<objects >/dev/null
real 0m5.440s
user 0m5.060s
sys 0m0.384s
This patch adds an option to use regular stdio buffering:
$ time git cat-file --batch-check='%(objectsize)' \
--buffer <objects >/dev/null
real 0m4.975s
user 0m4.888s
sys 0m0.092s
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git cat-file --batch(-check)" learned the "--follow-symlinks"
option that follows an in-tree symbolic link when asked about an
object via extended SHA-1 syntax, e.g. HEAD:RelNotes that points at
Documentation/RelNotes/2.5.0.txt. With the new option, the command
behaves as if HEAD:Documentation/RelNotes/2.5.0.txt was given as
input instead.
* dt/cat-file-follow-symlinks:
cat-file: add --follow-symlinks to --batch
sha1_name: get_sha1_with_context learns to follow symlinks
tree-walk: learn get_tree_entry_follow_symlinks
Communication between the HTTP server and http_backend process can
lead to a dead-lock when relaying a large ref negotiation request.
Diagnose the situation better, and mitigate it by reading such a
request first into core (to a reasonable limit).
* jk/http-backend-deadlock:
http-backend: spool ref negotiation requests to buffer
t5551: factor out tag creation
http-backend: fix die recursion with custom handler
"hash-object --literally" introduced in v2.2 was not prepared to
take a really long object type name.
* jc/hash-object:
write_sha1_file(): do not use a separate sha1[] array
t1007: add hash-object --literally tests
hash-object --literally: fix buffer overrun with extra-long object type
git-hash-object.txt: document --literally option
Teach the index to optionally remember already seen untracked files
to speed up "git status" in a working tree with tons of cruft.
* nd/untracked-cache: (24 commits)
git-status.txt: advertisement for untracked cache
untracked cache: guard and disable on system changes
mingw32: add uname()
t7063: tests for untracked cache
update-index: test the system before enabling untracked cache
update-index: manually enable or disable untracked cache
status: enable untracked cache
untracked-cache: temporarily disable with $GIT_DISABLE_UNTRACKED_CACHE
untracked cache: mark index dirty if untracked cache is updated
untracked cache: print stats with $GIT_TRACE_UNTRACKED_STATS
untracked cache: avoid racy timestamps
read-cache.c: split racy stat test to a separate function
untracked cache: invalidate at index addition or removal
untracked cache: load from UNTR index extension
untracked cache: save to an index extension
ewah: add convenient wrapper ewah_serialize_strbuf()
untracked cache: don't open non-existent .gitignore
untracked cache: mark what dirs should be recursed/saved
untracked cache: record/validate dir mtime and reuse cached output
untracked cache: make a wrapper around {open,read,close}dir()
...
The pull.ff configuration was supposed to override the merge.ff
configuration, but it didn't.
* pt/pull-ff-vs-merge-ff:
pull: parse pull.ff as a bool or string
pull: make pull.ff=true override merge.ff
* jk/http-backend-deadlock-2.3:
http-backend: spool ref negotiation requests to buffer
t5551: factor out tag creation
http-backend: fix die recursion with custom handler
* jk/http-backend-deadlock-2.2:
http-backend: spool ref negotiation requests to buffer
t5551: factor out tag creation
http-backend: fix die recursion with custom handler
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
[1] http://article.gmane.org/gmane.comp.version-control.git/269020
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying immediately upon failing to obtain a lock, retry
after a short while with backoff.
* mh/lockfile-retry:
lock_packed_refs(): allow retries when acquiring the packed-refs lock
lockfile: allow file locking to be retried with a timeout
Various documentation mark-up fixes to make the output more
consistent in general and also make AsciiDoctor (an alternative
formatter) happier.
* jk/asciidoc-markup-fix:
doc: convert AsciiDoc {?foo} to ifdef::foo[]
doc: put example URLs and emails inside literal backticks
doc: drop backslash quoting of some curly braces
doc: convert \--option to --option
doc/add: reformat `--edit` option
doc: fix length of underlined section-title
doc: fix hanging "+"-continuation
doc: fix unquoted use of "{type}"
doc: fix misrendering due to `single quote'
A literal block in the tutorial had lines with unequal lengths to
delimit it from the rest of the document, which choke GitHub's
AsciiDoc renderer.
* jk/stripspace-asciidoctor-fix:
doc: fix unmatched code fences in git-stripspace
A literal block in the tutorial had lines with unequal lengths to
delimit it from the rest of the document, which choke GitHub's
AsciiDoc renderer.
* ja/tutorial-asciidoctor-fix:
doc: fix unmatched code fences
Introduce http.<url>.SSLCipherList configuration variable to tweak
the list of cipher suite to be used with libcURL when talking with
https:// sites.
* ls/http-ssl-cipher-list:
http: add support for specifying an SSL cipher list
Fix remaining instances where "pack-file" is used instead of
"packfile". Some places remain where we still use "pack-file",
This is the case when we explicitly refer to a file with a
".pack" extension as opposed to a data source providing a pack
data stream.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This wires the in-repo-symlink following code through to the cat-file
builtin. In the event of an out-of-repo link, cat-file will print
the link in a new format.
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the "--allow-unknown-type" option to "cat-file" to allow
inspecting loose objects of an experimental or a broken type.
* kn/cat-file-literally:
t1006: add tests for git cat-file --allow-unknown-type
cat-file: teach cat-file a '--allow-unknown-type' option
cat-file: make the options mutually exclusive
sha1_file: support reading from a loose object of unknown type
"git merge FETCH_HEAD" learned that the previous "git fetch" could
be to create an Octopus merge, i.e. recording multiple branches
that are not marked as "not-for-merge"; this allows us to lose an
old style invocation "git merge <msg> HEAD $commits..." in the
implementation of "git pull" script; the old style syntax can now
be deprecated.
* jc/merge:
merge: deprecate 'git merge <message> HEAD <commit>' syntax
merge: handle FETCH_HEAD internally
merge: decide if we auto-generate the message early in collect_parents()
merge: make collect_parents() auto-generate the merge message
merge: extract prepare_merge_message() logic out
merge: narrow scope of merge_names
merge: split reduce_parents() out of collect_parents()
merge: clarify collect_parents() logic
merge: small leakfix and code simplification
merge: do not check argc to determine number of remote heads
merge: clarify "pulling into void" special case
t5520: test pulling an octopus into an unborn branch
t5520: style fixes
merge: simplify code flow
merge: test the top-level merge driver
Since b814da8 (pull: add pull.ff configuration, 2014-01-15), running
git-pull with the configuration pull.ff=false or pull.ff=only is
equivalent to passing --no-ff and --ff-only to git-merge. However, if
pull.ff=true, no switch is passed to git-merge. This leads to the
confusing behavior where pull.ff=false or pull.ff=only is able to
override merge.ff, while pull.ff=true is unable to.
Fix this by adding the --ff switch if pull.ff=true, and add a test to
catch future regressions.
Furthermore, clarify in the documentation that pull.ff overrides
merge.ff.
Signed-off-by: Paul Tan <pyokagan@gmail.com>
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, there is only one attempt to acquire any lockfile, and if
the lock is held by another process, the locking attempt fails
immediately.
This is not such a limitation for loose reference files. First, they
don't take long to rewrite. Second, most reference updates have a
known "old" value, so if another process is updating a reference at
the same moment that we are trying to lock it, then probably the
expected "old" value will not longer be valid, and the update will
fail anyway.
But these arguments do not hold for packed-refs:
* The packed-refs file can be large and take significant time to
rewrite.
* Many references are stored in a single packed-refs file, so it could
be that the other process was changing a different reference than
the one that we are interested in.
Therefore, it is much more likely for there to be spurious lock
conflicts in connection to the packed-refs file, resulting in
unnecessary command failures.
So, if the first attempt to lock the packed-refs file fails, continue
retrying for a configurable length of time before giving up. The
default timeout is 1 second.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The former seems to just be syntactic sugar for the latter.
And as it's sugar that AsciiDoctor doesn't understand, it
would be nice to avoid it. Since there are only two spots,
and the resulting source is not significantly harder to
read, it's worth doing.
Note that this does slightly affect the generated HTML (it
has an extra newline), but the rendered result for both HTML
and docbook should be the same (since the newline is not
syntactically significant there).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes sure that AsciiDoc does not turn them into links.
Regular AsciiDoc does not catch these cases, but AsciiDoctor
does treat them as links.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Text like "{foo}" triggers an AsciiDoc attribute; we have to
write "\{foo}" to suppress this. But when the "foo" is not a
syntactically valid attribute, we can skip the quoting. This
makes the source nicer to read, and looks better under
Asciidoctor. With AsciiDoc itself, this patch produces no
changes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Older versions of AsciiDoc would convert the "--" in
"--option" into an emdash. According to 565e135
(Documentation: quote double-dash for AsciiDoc, 2011-06-29),
this is fixed in AsciiDoc 8.3.0. According to bf17126, we
don't support anything older than 8.4.1 anyway, so we no
longer need to worry about quoting.
Even though this does not change the output at all, there
are a few good reasons to drop the quoting:
1. It makes the source prettier to read.
2. We don't quote consistently, which may be confusing when
reading the source.
3. Asciidoctor does not like the quoting, and renders a
literal backslash.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the other options in the list put short and long as
two separate headings.
We can also drop the backslashing of `--`. It isn't used
elsewhere and is unnecessary for modern asciidoc (plus it
confuses asciidoctor).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In AsciiDoc, it is OK to say:
this is my title
-------------------------
but AsciiDoctor is more strict. Let's match the underline to
the title (which also makes the source prettier to read).
The output from AsciiDoc is the same either way.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In list content that wants to continue to a second
paragraph, the "+" continuation and subsequent paragraph
need to be left-aligned. Otherwise AsciiDoc seems to insert
only a linebreak.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Curly braces open an "attribute" in AsciiDoc; if there's no
such attribute, strange things may happen. In this case, the
unquoted "{type}" causes AsciiDoc to omit an entire line of
text from the output. We can fix it by putting the whole
phrase inside literal backticks (which also lets us get rid
of ugly backslash escaping).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
AsciiDoc misparses some text that contains a `literal`
word followed by a fancy `single quote' word, and treats
everything from the start of the literal to the end of the
quote as a single-quoted phrase.
We can work around this by switching the latter to be a
literal, as well. In the first case, this is perhaps what
was intended anyway, as it makes us consistent with the the
earlier literals in the same paragraph. In the second, the
output is arguably better, as we will format our commit
references as <code> blocks.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The asciidoctor renderer is more picky than classic asciidoc,
and insists that the start and end of a code fence be the
same size.
Found with this hacky perl script:
foreach my $fn (@ARGV) {
open(my $fh, '<', $fn);
my ($fence, $fence_lineno, $prev);
while (<$fh>) {
chomp;
if (/^----+$/) {
if ($fence_lineno) {
if ($_ ne $fence) {
print "$fn:$fence_lineno:mismatched fence: ",
length($fence), " != ", length($_), "\n";
}
$fence_lineno = undef;
}
# hacky check to avoid title-underlining
elsif ($prev eq '' || $prev eq '+') {
$fence = $_;
$fence_lineno = $.;
}
}
$prev = $_;
}
}
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>