Commit Graph

28 Commits

Author SHA1 Message Date
Jonathan Tan
ffb2c0fe5c index-pack: support checking objects but not links
The index-pack command currently supports the
--check-self-contained-and-connected argument, for internal use only,
that instructs it to only check for broken links and not broken objects.
For partial clones, we need the inverse, so add a --fsck-objects
argument that checks for broken objects and not broken links, also for
internal use only.

This will be used by fetch-pack in a subsequent patch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15 10:16:24 -07:00
Jeff King
411481be6f index-pack: add --max-input-size=<size> option
When receiving a pack-file, it can be useful to abort the
`git index-pack`, if the pack-file is too big.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-24 12:31:05 -07:00
Nguyễn Thái Ngọc Duy
c6807a40dc clone: open a shortcut for connectivity check
In order to make sure the cloned repository is good, we run "rev-list
--objects --not --all $new_refs" on the repository. This is expensive
on large repositories. This patch attempts to mitigate the impact in
this special case.

In the "good" clone case, we only have one pack. If all of the
following are met, we can be sure that all objects reachable from the
new refs exist, which is the intention of running "rev-list ...":

 - all refs point to an object in the pack
 - there are no dangling pointers in any object in the pack
 - no objects in the pack point to objects outside the pack

The second and third checks can be done with the help of index-pack as
a slight variation of --strict check (which introduces a new condition
for the shortcut: pack transfer must be used and the number of objects
large enough to call index-pack). The first is checked in
check_everything_connected after we get an "ok" from index-pack.

"index-pack + new checks" is still faster than the current "index-pack
+ rev-list", which is the whole point of this patch. If any of the
conditions fail, we fall back to the good old but expensive "rev-list
..". In that case it's even more expensive because we have to pay for
the new checks in index-pack. But that should only happen when the
other side is either buggy or malicious.

Cloning linux-2.6 over file://

        before         after
real    3m25.693s      2m53.050s
user    5m2.037s       4m42.396s
sys     0m13.750s      0m16.574s

A more realistic test with ssh:// over wireless

        before         after
real    11m26.629s     10m4.213s
user    5m43.196s      5m19.444s
sys     0m35.812s      0m37.630s

This shortcut is not applied to shallow clones, partly because shallow
clones should have no more objects than a usual fetch and the cost of
rev-list is acceptable, partly to avoid dealing with corner cases when
grafting is involved.

This shortcut does not apply to unpack-objects code path either
because the number of objects must be small in order to trigger that
code path.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-28 08:07:20 -07:00
Thomas Ackermann
d5fa1f1a69 The name of the hash function is "SHA-1", not "SHA1"
Use "SHA-1" instead of "SHA1" whenever we talk about the hash function.
When used as a programming symbol, we keep "SHA1".

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 11:08:37 -07:00
Thomas Ackermann
2de9b71138 Documentation: the name of the system is 'Git', not 'git'
Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-01 13:53:33 -08:00
Nguyễn Thái Ngọc Duy
b8a2486f15 index-pack: support multithreaded delta resolving
This puts delta resolving on each base on a separate thread, one base
cache per thread. Per-thread data is grouped in struct thread_local.
When running with nr_threads == 1, no pthreads calls are made. The
system essentially runs in non-thread mode.

An experiment on a Xeon 24 core machine with git.git shows that
performance does not increase proportional to the number of cores. So
by default, we use maximum 3 cores. Some numbers with --threads from 1
to 16:

1..4
real    0m8.003s  0m5.307s  0m4.321s  0m3.830s
user    0m7.720s  0m8.009s  0m8.133s  0m8.305s
sys     0m0.224s  0m0.372s  0m0.360s  0m0.360s

5..8
real    0m3.727s  0m3.604s  0m3.332s  0m3.369s
user    0m9.361s  0m9.817s  0m9.525s  0m9.769s
sys     0m0.584s  0m0.624s  0m0.540s  0m0.560s

9..12
real    0m3.036s  0m3.139s  0m3.177s  0m2.961s
user    0m8.977s  0m10.205s 0m9.737s  0m10.073s
sys     0m0.596s  0m0.680s  0m0.684s  0m0.680s

13..16
real    0m2.985s  0m2.894s  0m2.975s  0m2.971s
user    0m9.825s  0m10.573s 0m10.833s 0m11.361s
sys     0m0.788s  0m0.732s  0m0.904s  0m1.016s

On an Intel dual core and linux-2.6.git

1..4
real    2m37.789s 2m7.963s  2m0.920s  1m58.213s
user    2m28.415s 2m52.325s 2m50.176s 2m41.187s
sys     0m7.808s  0m11.181s 0m11.224s 0m10.731s

Thanks Ramsay Jones for troubleshooting and support on MinGW platform.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-07 15:48:15 -07:00
Jeff King
48bb914ed6 doc: drop author/documentation sections from most pages
The point of these sections is generally to:

  1. Give credit where it is due.

  2. Give the reader an idea of where to ask questions or
     file bug reports.

But they don't do a good job of either case. For (1), they
are out of date and incomplete. A much more accurate answer
can be gotten through shortlog or blame.  For (2), the
correct contact point is generally git@vger, and even if you
wanted to cc the contact point, the out-of-date and
incomplete fields mean you're likely sending to somebody
useless.

So let's drop the fields entirely from all manpages except
git(1) itself. We already point people to the mailing list
for bug reports there, and we can update the Authors section
to give credit to the major contributors and point to
shortlog and blame for more information.

Each page has a "This is part of git" footer, so people can
follow that to the main git manpage.
2011-03-11 10:59:16 -05:00
Štěpán Němec
62b4698e55 Use angles for placeholders consistently
Signed-off-by: Štěpán Němec <stepnem@gmail.com>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-08 12:29:52 -07:00
Stephen Boyd
738820a913 Documentation: describe --thin more accurately
The description for --thin was misleading and downright wrong. Correct
it with some inspiration from the description of index-pack's --fix-thin
and some background information from Nicolas Pitre <nico@fluxnic.net>.

Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-18 17:13:18 -08:00
Thomas Rast
0b444cdb19 Documentation: spell 'git cmd' without dash throughout
The documentation was quite inconsistent when spelling 'git cmd' if it
only refers to the program, not to some specific invocation syntax:
both 'git-cmd' and 'git cmd' spellings exist.

The current trend goes towards dashless forms, and there is precedent
in 647ac70 (git-svn.txt: stop using dash-form of commands.,
2009-07-07) to actively eliminate the dashed variants.

Replace 'git-cmd' with 'git cmd' throughout, except where git-shell,
git-cvsserver, git-upload-pack, git-receive-pack, and
git-upload-archive are concerned, because those really live in the
$PATH.
2010-01-10 13:01:28 +01:00
Jonathan Nieder
ba020ef5eb manpages: italicize git command names (which were in teletype font)
The names of git commands are not meant to be entered at the
commandline; they are just names. So we render them in italics,
as is usual for command names in manpages.

Using

	doit () {
	  perl -e 'for (<>) { s/\`(git-[^\`.]*)\`/'\''\1'\''/g; print }'
	}
	for i in git*.txt config.txt diff*.txt blame*.txt fetch*.txt i18n.txt \
	        merge*.txt pretty*.txt pull*.txt rev*.txt urls*.txt
	do
	  doit <"$i" >"$i+" && mv "$i+" "$i"
	done
	git diff

.

Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-05 11:24:40 -07:00
Jonathan Nieder
483bc4f045 Documentation formatting and cleanup
Following what appears to be the predominant style, format
names of commands and commandlines both as `teletype text`.

While we're at it, add articles ("a" and "the") in some
places, italicize the name of the command in the manual page
synopsis line, and add a comma or two where it seems appropriate.

Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-01 17:20:16 -07:00
Jonathan Nieder
b1889c36d8 Documentation: be consistent about "git-" versus "git "
Since the git-* commands are not installed in $(bindir), using
"git-command <parameters>" in examples in the documentation is
not a good idea. On the other hand, it is nice to be able to
refer to each command using one hyphenated word. (There is no
escaping it, anyway: man page names cannot have spaces in them.)

This patch retains the dash in naming an operation, command,
program, process, or action. Complete command lines that can
be entered at a shell (i.e., without options omitted) are
made to use the dashless form.

The changes consist only of replacing some spaces with hyphens
and vice versa. After a "s/ /-/g", the unpatched and patched
versions are identical.

Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-01 17:20:15 -07:00
Christian Couder
9e1f0a85c6 documentation: move git(7) to git(1)
As the "git" man page describes the "git" command at the end-user
level, it seems better to move it to man section 1.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-06 11:18:28 -07:00
Martin Koegler
0153be05ae index-pack: introduce checking mode
Adds strict option, which bails out if the pack would
introduces broken object or links in the repository.

Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-28 21:56:02 -08:00
Dan McGee
5162e69732 Documentation: rename gitlink macro to linkgit
Between AsciiDoc 8.2.2 and 8.2.3, the following change was made to the stock
Asciidoc configuration:

@@ -149,7 +153,10 @@
 # Inline macros.
 # Backslash prefix required for escape processing.
 # (?s) re flag for line spanning.
-(?su)[\\]?(?P<name>\w(\w|-)*?):(?P<target>\S*?)(\[(?P<attrlist>.*?)\])=
+
+# Explicit so they can be nested.
+(?su)[\\]?(?P<name>(http|https|ftp|file|mailto|callto|image|link)):(?P<target>\S*?)(\[(?P<attrlist>.*?)\])=
+
 # Anchor: [[[id]]]. Bibliographic anchor.
 (?su)[\\]?\[\[\[(?P<attrlist>[\w][\w-]*?)\]\]\]=anchor3
 # Anchor: [[id,xreflabel]]

This default regex now matches explicit values, and unfortunately in this
case gitlink was being matched by just 'link', causing the wrong inline
macro template to be applied. By renaming the macro, we can avoid being
matched by the wrong regex.

Signed-off-by: Dan McGee <dpmcgee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-06 18:41:44 -08:00
Junio C Hamano
5f8bee5859 Documentation: fix "gitlink::foobar[s]"
They should be spelled with a single colon.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-06 18:41:43 -08:00
Ralf Wildenhues
06ada1529c Fix some typos, punctuation, missing words, minor markup.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-10-15 22:11:02 -04:00
Junio C Hamano
a6080a0a44 War on whitespace
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time.  There are a few files that need
to have trailing whitespaces (most notably, test vectors).  The results
still passes the test, and build result in Documentation/ area is unchanged.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-07 00:04:01 -07:00
Junio C Hamano
404fdef22f Merge branch 'maint'
* maint:
  Documentation: Reformatted SYNOPSIS for several commands
  Documentation: Added [verse] to SYNOPSIS where necessary
2007-05-18 21:50:56 -07:00
Matthias Kestenholz
97925fde00 Documentation: Reformatted SYNOPSIS for several commands
Signed-off-by: Matthias Kestenholz <matthias@spinlock.ch>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-18 21:47:45 -07:00
Nicolas Pitre
be18c1fe12 document --index-version for index-pack and pack-objects
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-19 20:38:00 -07:00
Nicolas Pitre
576162a45f remove .keep pack lock files when done with refs update
This makes both git-fetch and git-push (fetch-pack and receive-pack)
safe against a possible race with aparallel git-repack -a -d that could
prune the new pack while it is not yet referenced, and remove the .keep
file after refs have been updated.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-03 00:24:07 -08:00
Shawn Pearce
b807770924 Teach git-index-pack how to keep a pack file.
To prevent a race condition between `index-pack --stdin` and
`repack -a -d` where the repack deletes the newly created pack
file before any refs are updated to reference objects contained
within it we mark the pack file as one that should be kept.  This
removes it from the list of packs that `repack -a -d` will consider
for removal.

Callers such as `receive-pack` which want to invoke `index-pack`
should use this new --keep option to prevent the newly created pack
and index file pair from being deleted before they have finished any
related ref updates.  Only after all ref updates have been finished
should the associated .keep file be removed.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-29 13:45:03 -08:00
Nicolas Pitre
3c9af36646 add progress status to index-pack
This is more interesting to look at when performing a big fetch.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-26 00:20:13 -07:00
Nicolas Pitre
636171cb80 make index-pack able to complete thin packs.
A new flag, --fix-thin, instructs git-index-pack to append any missing
objects to a thin pack to make it self contained and indexable. Of course
objects missing from the pack must be present elsewhere in the local
repository.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-26 00:20:07 -07:00
Nicolas Pitre
e42797f5b6 enable index-pack streaming capability
A new flag, --stdin, allows for a pack to be received over a stream.
When this flag is provided, the pack content is written to either
the named pack file or directly to the object repository under the
same name as produced by git-repack.  The pack index is written as
well with the corresponding base name, unless the index name is
overriden with -o.

With this patch, git-index-pack could be used instead of
git-unpack-objects when fetching remote objects but only with
non "thin" packs for now.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-25 14:39:07 -07:00
Sergey Vlasov
9cf6d3357a Add git-index-pack utility
git-index-pack builds a pack index file for an existing packed
archive.  With this utility a packed archive which was transferred
without the corresponding pack index can be added to objects/pack/
without repacking.

Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-12 18:32:02 -07:00