Commit Graph

38 Commits

Author SHA1 Message Date
Junio C Hamano
ea327760d3 Merge branch 'jk/fsck-doc'
"git fsck --connectivity-only" omits computation necessary to sift
the objects that are not reachable from any of the refs into
unreachable and dangling.  This is now enabled when dangling
objects are requested (which is done by default, but can be
overridden with the "--no-dangling" option).

* jk/fsck-doc:
  fsck: always compute USED flags for unreachable objects
  doc/fsck: clarify --connectivity-only behavior
2019-03-20 15:16:06 +09:00
Jeff King
8d8c2a5aef fsck: always compute USED flags for unreachable objects
The --connectivity-only option avoids opening every object, and instead
just marks reachable objects with a flag and compares this to the set
of all objects. This strategy is discussed in more detail in 3e3f8bd608
(fsck: prepare dummy objects for --connectivity-check, 2017-01-17).

This means that we report _every_ unreachable object as dangling.
Whereas in a full fsck, we'd have actually opened and parsed each of
those unreachable objects, marking their child objects with the USED
flag, to mean "this was mentioned by another object". And thus we can
report only the tip of an unreachable segment of the object graph as
dangling.

You can see this difference with a trivial example:

  tree=$(git hash-object -t tree -w /dev/null)
  one=$(echo one | git commit-tree $tree)
  two=$(echo two | git commit-tree -p $one $tree)

Running `git fsck` will report only $two as dangling, but with
--connectivity-only, both commits (and the tree) are reported. Likewise,
using --lost-found would write all three objects.

We can make --connectivity-only work like the normal case by taking a
separate pass over the unreachable objects, parsing them and marking
objects they refer to as USED. That still avoids parsing any blobs,
though we do pay the cost to access any unreachable commits and trees
(which may or may not be noticeable, depending on how many you have).

If neither --dangling nor --lost-found is in effect, then we can skip
this step entirely, just like we do now. That makes "--connectivity-only
--no-dangling" just as fast as the current "--connectivity-only". I.e.,
we do the correct thing always, but you can still tweak the options to
make it faster if you don't care about dangling objects.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-05 22:55:57 +09:00
Jeff King
df805ed6cf doc/fsck: clarify --connectivity-only behavior
On reading this again, there are two things that were not immediately
clear to me:

  - we do still check links to blobs, even though we don't open the
    blobs themselves

  - we do not do the normal fsck checks, even for non-blob objects we do
    open

Let's reword it to make these points a little more clear.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-05 22:53:52 +09:00
Jeff King
01f8d5948a prefer "hash mismatch" to "sha1 mismatch"
To future-proof ourselves against a change in the hash, let's use the
more generic "hash mismatch" to refer to integrity problems. Note that
we do advertise this exact string in git-fsck(1). However, the message
itself is marked for translation, meaning we do not expect it to be
machine-readable.

While we're touching that documentation, let's also update it for
grammar and clarity.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-08 09:41:06 -08:00
Derrick Stolee
e0fd51e1d7 fsck: verify commit-graph
If core.commitGraph is true, verify the contents of the commit-graph
during 'git fsck' using the 'git commit-graph verify' subcommand. Run
this check on all alternates, as well.

We use a new process for two reasons:

1. The subcommand decouples the details of loading and verifying a
   commit-graph file from the other fsck details.

2. The commit-graph verification requires the commits to be loaded
   in a specific order to guarantee we parse from the commit-graph
   file for some objects and from the object database for others.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-27 10:29:10 -07:00
Johannes Schindelin
90cf590f53 fsck: optionally show more helpful info for broken links
When reporting broken links between commits/trees/blobs, it would be
quite helpful at times if the user would be told how the object is
supposed to be reachable.

With the new --name-objects option, git-fsck will try to do exactly
that: name the objects in a way that shows how they are reachable.

For example, when some reflog got corrupted and a blob is missing that
should not be, the user might want to remove the corresponding reflog
entry. This option helps them find that entry: `git fsck` will now
report something like this:

	broken link from    tree b5eb6ff...  (refs/stash@{<date>}~37:)
	              to    blob ec5cf80...

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 15:15:59 -07:00
Matthieu Moy
bcf9626a71 doc: typeset long command-line options as literal
Similarly to the previous commit, use backquotes instead of
forward-quotes, for long options.

This was obtained with:

  perl -pi -e "s/'(--[a-z][a-z=<>-]*)'/\`\$1\`/g" *.txt

and manual tweak to remove false positive in ascii-art (o'--o'--o' to
describe rewritten history).

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-28 08:36:45 -07:00
Johannes Schindelin
02976bf856 fsck: introduce git fsck --connectivity-only
This option avoids unpacking each and all blob objects, and just
verifies the connectivity. In particular with large repositories, this
speeds up the operation, at the expense of missing corrupt blobs,
ignoring unreachable objects and other fsck issues, if any.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-23 14:27:37 -07:00
Felipe Contreras
0460ed2c93 documentation: trivial style cleanups
White-spaces, missing braces, standardize --[no-]foo.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-17 12:09:21 -07:00
Thomas Ackermann
d5fa1f1a69 The name of the hash function is "SHA-1", not "SHA1"
Use "SHA-1" instead of "SHA1" whenever we talk about the hash function.
When used as a programming symbol, we keep "SHA1".

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 11:08:37 -07:00
Thomas Ackermann
2de9b71138 Documentation: the name of the system is 'Git', not 'git'
Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-01 13:53:33 -08:00
Junio C Hamano
7e0c2036b4 Merge branch 'jc/tag-doc'
Our documentation used to assume having files in .git/refs/*
directories was the only to have branches and tags, but that is not
true for quite some time.

* jc/tag-doc:
  Documentation: do not mention .git/refs/* directories
2012-08-22 11:52:55 -07:00
Junio C Hamano
831e61f80f Documentation: do not mention .git/refs/* directories
It is an implementation detail that a new tag is created by adding a
file in the .git/refs/tags directory.  The only thing the user needs
to know is that a "git tag" creates a ref in the refs/tags namespace,
and without "-f", it does not overwrite an existing tag.

Inspired by a report from 乙酸鋰 <ch3cooli@gmail.com>; I think I
caught all the existing mention in Documentation/ directory in the
tip of 1.7.9.X maintenance track, but we may have added new ones
since then.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-06 14:04:16 -07:00
Junio C Hamano
a8747a1098 fsck doc: a minor typofix
Reword the misspelled "squelch" noticed by Hermann Gaustere to say
"omit", which would sit better anyway.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-05 11:32:19 -08:00
Junio C Hamano
c6a13b2c86 fsck: --no-dangling omits "dangling object" information
The default output from "fsck" is often overwhelmed by informational
message on dangling objects, especially if you do not repack often, and a
real error can easily be buried.

Add "--no-dangling" option to omit them, and update the user manual to
demonstrate its use.

Based on a patch by Clemens Buchacher, but reverted the part to change
the default to --no-dangling, which is unsuitable for the first patch.
The usual three-step procedure to break the backward compatibility over
time needs to happen on top of this, if we were to go in that direction.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-28 14:55:39 -08:00
Junio C Hamano
a4c628d71d Merge branch 'jk/doc-fsck'
* jk/doc-fsck:
  docs: brush up obsolete bits of git-fsck manpage

Conflicts:
	Documentation/git-fsck.txt
2011-12-22 11:27:27 -08:00
Jeff King
2830308260 docs: brush up obsolete bits of git-fsck manpage
After the description and options, the fsck manpage contains
some discussion about what it does. Over time, this
discussion has become somewhat obsolete, both in content and
formatting. In particular:

  1. There are many options now, so starting the discussion
     with "It tests..." makes it unclear whether we are
     talking about the last option, or about the tool in
     general. Let's start a new "discussion" section and
     make our antecedent more clear.

  2. It gave an example for --unreachable using for-each-ref
     to mention all of the heads, saying that it will do "a
     _lot_ of verification". This is hopelessly out-of-date,
     as giving no arguments will check much more (reflogs,
     the index, non-head refs).

  3. It goes on to mention tests "to be added" (like tree
     object sorting). We now have these tests.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-16 16:24:35 -08:00
Nguyễn Thái Ngọc Duy
1e49f22f07 fsck: print progress
fsck is usually a long process and it would be nice if it prints
progress from time to time.

Progress meter is not printed when --verbose is given because
--verbose prints a lot, there's no need for "alive" indicator.
Progress meter may provide "% complete" information but it would
be lost anyway in the flood of text.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-11-06 20:31:28 -08:00
Jim Meyering
43d532e6ba Documentation/git-fsck.txt: fix typo: unreadable -> unreachable
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-16 16:35:08 -07:00
Jeff King
48bb914ed6 doc: drop author/documentation sections from most pages
The point of these sections is generally to:

  1. Give credit where it is due.

  2. Give the reader an idea of where to ask questions or
     file bug reports.

But they don't do a good job of either case. For (1), they
are out of date and incomplete. A much more accurate answer
can be gotten through shortlog or blame.  For (2), the
correct contact point is generally git@vger, and even if you
wanted to cc the contact point, the out-of-date and
incomplete fields mean you're likely sending to somebody
useless.

So let's drop the fields entirely from all manpages except
git(1) itself. We already point people to the mailing list
for bug reports there, and we can update the Authors section
to give credit to the major contributors and point to
shortlog and blame for more information.

Each page has a "This is part of git" footer, so people can
follow that to the main git manpage.
2011-03-11 10:59:16 -05:00
Mark Lodato
0c806a06a3 fsck docs: remove outdated and useless diagnostic
In git-fsck(1), there was a reference to the warning "<tree> has full
pathnames in it".  This exact wording has not been used since 2005
(commit f1f0d0889e), when the wording was changed slightly.  More
importantly, the description of that warning was useless, and there were
many other similar warning messages which were not document at all.
Since all these warnings are fairly obvious, there is no need for them
to be in the man page.

Signed-off-by: Mark Lodato <lodatom@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-18 22:58:40 -08:00
Thomas Rast
0b444cdb19 Documentation: spell 'git cmd' without dash throughout
The documentation was quite inconsistent when spelling 'git cmd' if it
only refers to the program, not to some specific invocation syntax:
both 'git-cmd' and 'git cmd' spellings exist.

The current trend goes towards dashless forms, and there is precedent
in 647ac70 (git-svn.txt: stop using dash-form of commands.,
2009-07-07) to actively eliminate the dashed variants.

Replace 'git-cmd' with 'git cmd' throughout, except where git-shell,
git-cvsserver, git-upload-pack, git-receive-pack, and
git-upload-archive are concerned, because those really live in the
$PATH.
2010-01-10 13:01:28 +01:00
Junio C Hamano
f29cd3938d fsck: default to "git fsck --full"
Linus and other git developers from the early days trained their fingers
to type the command, every once in a while even without thinking, to check
the consistency of the repository back when the lower core part of the git
was still being developed.  Developers who wanted to make sure that git
correctly dealt with packfiles could deliberately trigger their creation
and checked them after they were created carefully, but loose objects are
the ones that are written by various commands from random codepaths.  It
made some technical sense to have a mode that checked only loose objects
from the debugging point of view for that reason.

Even for git developers, there no longer is any reason to type "git fsck"
every five minutes these days, worried that some newly created objects
might be corrupt due to recent change to git.

The reason we did not make "--full" the default is probably we trust our
filesystems a bit too much.  At least, we trusted filesystems more than we
trusted the lower core part of git that was under development.

Once a packfile is created and we always use it read-only, there didn't
seem to be much point in suspecting that the underlying filesystems or
disks may corrupt them in such a way that is not caught by the SHA-1
checksum over the entire packfile and per object checksum.  That trust in
the filesystems might have been a good tradeoff between fsck performance
and reliability on platforms git was initially developed on and for, but
it may not be true anymore as we run on many more platforms these days.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-20 12:11:39 -07:00
Markus Heidelberg
27c03aafdf doc/git-fsck: change the way for getting heads' SHA1s
The straightforward way with using 'cat .git/refs/heads/*' doesn't work
with packed refs as well as branches of the form topic/topic1. So let's
use git-for-each-ref for getting the heads' SHA1s in this example.

Signed-off-by: Markus Heidelberg <markus.heidelberg@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-21 12:04:12 -08:00
Jonathan Nieder
2fd02c92db manpages: italicize nongit command names (if they are in teletype font)
Some manual pages use teletype font to set command names. We
change them to use italics, instead.  This creates a visual
distinction between names of commands and command lines that
can be typed at the command line. It is also more consistent
with other man pages outside Git.

In this patch, the commands named are non-git commands like bash.

Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-05 11:24:40 -07:00
Jonathan Nieder
ba020ef5eb manpages: italicize git command names (which were in teletype font)
The names of git commands are not meant to be entered at the
commandline; they are just names. So we render them in italics,
as is usual for command names in manpages.

Using

	doit () {
	  perl -e 'for (<>) { s/\`(git-[^\`.]*)\`/'\''\1'\''/g; print }'
	}
	for i in git*.txt config.txt diff*.txt blame*.txt fetch*.txt i18n.txt \
	        merge*.txt pretty*.txt pull*.txt rev*.txt urls*.txt
	do
	  doit <"$i" >"$i+" && mv "$i+" "$i"
	done
	git diff

.

Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-05 11:24:40 -07:00
Jonathan Nieder
483bc4f045 Documentation formatting and cleanup
Following what appears to be the predominant style, format
names of commands and commandlines both as `teletype text`.

While we're at it, add articles ("a" and "the") in some
places, italicize the name of the command in the manual page
synopsis line, and add a comma or two where it seems appropriate.

Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-01 17:20:16 -07:00
Jonathan Nieder
b1889c36d8 Documentation: be consistent about "git-" versus "git "
Since the git-* commands are not installed in $(bindir), using
"git-command <parameters>" in examples in the documentation is
not a good idea. On the other hand, it is nice to be able to
refer to each command using one hyphenated word. (There is no
escaping it, anyway: man page names cannot have spaces in them.)

This patch retains the dash in naming an operation, command,
program, process, or action. Complete command lines that can
be entered at a shell (i.e., without options omitted) are
made to use the dashless form.

The changes consist only of replacing some spaces with hyphens
and vice versa. After a "s/ /-/g", the unpatched and patched
versions are identical.

Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-01 17:20:15 -07:00
Christian Couder
9e1f0a85c6 documentation: move git(7) to git(1)
As the "git" man page describes the "git" command at the end-user
level, it seems better to move it to man section 1.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-06 11:18:28 -07:00
Jeff King
8d308b3540 Documentation: point git-prune users to git-gc
Most users should be using git-gc instead of directly
calling prune. For those who really do want more information
on pruning, let's point them at git-fsck, which goes into
slightly more detail on reachability.

And since we're pointing users there, let's make sure
reflogs are mentioned in git-fsck(1).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-29 23:11:57 -07:00
Dan McGee
5162e69732 Documentation: rename gitlink macro to linkgit
Between AsciiDoc 8.2.2 and 8.2.3, the following change was made to the stock
Asciidoc configuration:

@@ -149,7 +153,10 @@
 # Inline macros.
 # Backslash prefix required for escape processing.
 # (?s) re flag for line spanning.
-(?su)[\\]?(?P<name>\w(\w|-)*?):(?P<target>\S*?)(\[(?P<attrlist>.*?)\])=
+
+# Explicit so they can be nested.
+(?su)[\\]?(?P<name>(http|https|ftp|file|mailto|callto|image|link)):(?P<target>\S*?)(\[(?P<attrlist>.*?)\])=
+
 # Anchor: [[[id]]]. Bibliographic anchor.
 (?su)[\\]?\[\[\[(?P<attrlist>[\w][\w-]*?)\]\]\]=anchor3
 # Anchor: [[id,xreflabel]]

This default regex now matches explicit values, and unfortunately in this
case gitlink was being matched by just 'link', causing the wrong inline
macro template to be applied. By renaming the macro, we can avoid being
matched by the wrong regex.

Signed-off-by: Dan McGee <dpmcgee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-06 18:41:44 -08:00
Johannes Schindelin
16a7fcfe5e fsck --lost-found: write blob's contents, not their SHA-1
When looking for a lost blob, it is much nicer to be able to grep
through .git/lost-found/other/* than to write an inefficient loop
over the file names.  So write the contents of the dangling blobs,
not their object names.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-22 15:59:27 -07:00
Jonas Fonseca
e4465f0e71 fsck --lost-found writes to subdirectories in .git/lost-found/
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-03 19:08:58 -07:00
Johannes Schindelin
68f6c019fd git-fsck: add --lost-found option
With this option, dangling objects are not only reported, but also
written to .git/lost-found/commit/ or .git/lost-found/other/. This
option implies '--full' and '--no-reflogs'.

'git fsck --lost-found' is meant as a replacement for git-lost-found.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-02 21:34:12 -07:00
Junio C Hamano
a6080a0a44 War on whitespace
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time.  There are a few files that need
to have trailing whitespaces (most notably, test vectors).  The results
still passes the test, and build result in Documentation/ area is unchanged.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-07 00:04:01 -07:00
Johannes Schindelin
20f1eb6b46 git-fsck: learn about --verbose
With --verbose, it gets really chatty now.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-04 22:42:49 -07:00
Shawn O. Pearce
566842f62b Fix lost-found to show commits only referenced by reflogs
Prior to 1.5.0 the git-lost-found utility was useful to locate
commits that were not referenced by any ref.  These were often
amends, or resets, or tips of branches that had been deleted.
Being able to locate a 'lost' commit and recover it by creating a
new branch was a useful feature in those days.

Unfortunately 1.5.0 added the reflogs to the reachability analysis
performed by git-fsck, which means that most commits users would
consider to be lost are still reachable through a reflog.  So most
(or all!) commits are reachable, and nothing gets output from
git-lost-found.

Now git-fsck can be told to ignore reflogs during its reachability
analysis, making git-lost-found useful again to locate commits
that are no longer referenced by a ref itself, but may still be
referenced by a reflog.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-05 15:00:03 -07:00
Junio C Hamano
df391b192d git-fsck-objects is now synonym to git-fsck
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-28 16:33:58 -08:00