60 Commits

Author SHA1 Message Date
Junio C Hamano
fb6a3d8621 Document --strict flag to the fsck-cache command.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-07-27 18:57:14 -07:00
Linus Torvalds
64071805ed git-fsck-cache: be stricter about "tree" objects
In particular, warn about things like zero-padding of the mode bits,
which is a big no-no, since it makes otherwise identical trees have
different representations (and thus different SHA1 numbers).

Also make the warnings more regular.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 18:57:14 -07:00
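The zero-padding concern in 64071805ed can be illustrated with a short Python sketch. It assumes the tree layout of "<mode> <name>\0<20-byte SHA1>" entries behind a "tree <size>\0" header; tree_object_sha1() and the README entry are hypothetical names used only for illustration, not git code.

```python
import hashlib

def tree_object_sha1(entries):
    """Hash a tree built from (mode_string, name, raw_sha1) entries.

    Assumed layout: "<mode> <name>\\0<20 raw bytes>" per entry,
    prefixed by a "tree <size>\\0" header.
    """
    body = b"".join(
        mode.encode() + b" " + name.encode() + b"\0" + raw_sha1
        for mode, name, raw_sha1 in entries
    )
    header = b"tree " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

# Placeholder 20-byte object name; any 20 bytes would do here.
blob = hashlib.sha1(b"example").digest()

canonical = tree_object_sha1([("100644", "README", blob)])
padded = tree_object_sha1([("0100644", "README", blob)])

# Same file, same mode value, but a different byte representation.
print(canonical != padded)
```

Because the mode is stored as text, the extra leading zero changes the bytes being hashed, so two logically identical trees end up with different names.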
Linus Torvalds
de2eb7f694 git-fsck-cache.c: check commit objects more carefully
We historically used to be very careful in fsck-cache, but when it was
re-written to use "parse_object()" instead of parsing everything by
hand, it lost a bit of the checks.  This, together with the previous
commit, should make it do more proper commit object syntax checks.

Also add a "--strict" flag, which warns about the old-style "0664" file
mode bits, which shouldn't exist in modern trees, but that happened
early on in git trees and that the default git-fsck-cache thus silently
accepts.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 18:57:03 -07:00
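A rough Python sketch of the "--strict" behaviour described in de2eb7f694 follows. The set of modes accepted unconditionally is an assumption made for illustration; mode_problem() is a hypothetical helper, not the function used in git-fsck-cache.c.

```python
import stat

# Assumption: modes that are always accepted in this sketch.
ALWAYS_OK = {
    stat.S_IFREG | 0o644,  # regular file
    stat.S_IFREG | 0o755,  # executable file
    stat.S_IFLNK,          # symlink
    stat.S_IFDIR,          # subdirectory (tree)
}
OLD_STYLE = stat.S_IFREG | 0o664  # the old-style "0664" file mode

def mode_problem(mode, strict=False):
    """Return a warning string for a tree entry mode, or None if it is fine."""
    if mode in ALWAYS_OK:
        return None
    if mode == OLD_STYLE:
        # Silently accepted by default, reported only in strict mode.
        return "old-style 0664 mode" if strict else None
    return "bad mode"
```

With this shape, mode_problem(0o100664) stays quiet while mode_problem(0o100664, strict=True) reports the old-style bits.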
Junio C Hamano
a3eb250f99 [PATCH] alternate object store and fsck
The location alt_odb[j].name[0..] is filled with ??/?{38} to form a sha1
filename to try, but I was too lazy to allocate a copy, so while
fsck_object_dir() is running for the directory, the filenames ??/?{38}
are filled after NUL (usually and always the location should have '/'),
making them "not found".

This should fix it.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-10 16:16:34 -07:00
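The ??/?{38} layout mentioned in a3eb250f99 can be sketched in Python as below. loose_object_path() is a hypothetical helper; it builds a fresh string on every call, which sidesteps the shared-buffer problem the commit describes rather than reproducing it.

```python
import os

def loose_object_path(object_dir, sha1_hex):
    """Map a 40-character hex SHA1 to <object_dir>/??/?{38}."""
    if len(sha1_hex) != 40:
        raise ValueError("expected a 40-character hex SHA1")
    # First two hex digits name the directory, the other 38 the file.
    return os.path.join(object_dir, sha1_hex[:2], sha1_hex[2:])
```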
Linus Torvalds
659cacf5a9 git-fsck-cache: don't complain about lacking references when they are all in packs.
We used to not count them at all, which then made us complain that there
were no refs.
2005-07-07 17:05:41 -07:00
Linus Torvalds
c33303839c Make git-fsck-cache check HEAD integrity
In particular, check that it's a symlink, and points to refs/heads/.  We
depend on that these days not only for "git checkout", but also because
fsck and others only check for references in the .git/refs/
subdirectory, not things like HEAD itself.
2005-07-03 10:40:38 -07:00
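The HEAD check from c33303839c might look roughly like this in Python. head_problem() and its messages are hypothetical; the sketch only covers the two conditions the commit message names (HEAD is a symlink, and it points into refs/heads/).

```python
import os

def head_problem(git_dir):
    """Return a description of what is wrong with HEAD, or None if it looks sane."""
    head = os.path.join(git_dir, "HEAD")
    if not os.path.islink(head):
        return "HEAD is not a symlink"
    target = os.readlink(head)
    if not target.startswith("refs/heads/"):
        return "HEAD points outside refs/heads/"
    return None
```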
Linus Torvalds
944d858969 Fix up "for_each_ref()" to be more usable, and use it in git-fsck-cache
It needed to take the GIT_DIR information into account, something that
the original receive-pack usage just never cared about.
2005-07-03 10:01:38 -07:00
Junio C Hamano
f3bf922409 [PATCH] verify-pack updates.
Nico pointed out that having verify_pack.c and verify-pack.c was
confusing.  Rename verify_pack.c to pack-check.c as suggested,
and enhance the verification done quite a bit.

 - Built-in sha1_file unpacking knows that a base object of a
   deltified object _must_ be in the same pack, and takes
   advantage of that fact.

 - The earlier verify-pack command only checked the SHA1 sum for the
   entire pack file and did not look into its contents.  It now
   checks that everything the idx file claims to have unpacks correctly.

 - It now has a hook to give more detailed information for
   objects contained in the pack under -v flag.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 22:33:47 -07:00
Junio C Hamano
f9253394a2 [PATCH] Add git-verify-pack command.
Given a list of <pack>.idx files, this command validates the
index file and the corresponding .pack file for consistency.

This patch also uses the same validation mechanism in fsck-cache
when the --full flag is used.

During normal operation, sha1_file.c verifies that a given .idx
file matches the .pack file by comparing the SHA1 checksum
stored in .idx file and .pack file as a minimum sanity check.
We may further want to check the pack signature and version when
we map the pack, but that would be a separate patch.

Earlier, an error mapping a pack file was not flagged as fatal but led
to a random fatal error later.  This version explicitly die()s
when such an error is detected.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-29 09:11:39 -07:00
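The minimum sanity check described in f9253394a2 can be sketched in Python as follows. The sketch assumes a pack ends with a 20-byte SHA1 of everything before it, and that the idx ends with that same pack checksum followed by its own 20-byte checksum; both function names are hypothetical and the inputs are plain byte strings, not mapped files.

```python
import hashlib

def pack_trailer_ok(pack_bytes):
    """Check that the last 20 bytes are the SHA1 of everything before them."""
    if len(pack_bytes) < 20:
        return False
    body, trailer = pack_bytes[:-20], pack_bytes[-20:]
    return hashlib.sha1(body).digest() == trailer

def idx_matches_pack(idx_bytes, pack_bytes):
    """Compare the pack checksum recorded in the idx with the pack's own trailer.

    Assumption: the idx ends with <pack SHA1><idx SHA1>, 20 bytes each.
    """
    if len(idx_bytes) < 40 or len(pack_bytes) < 20:
        return False
    return idx_bytes[-40:-20] == pack_bytes[-20:]
```

This only covers the checksum comparison; it does not unpack any object, which is the deeper verification the later verify-pack update adds.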
Junio C Hamano
8a498a05c3 [PATCH] Update fsck-cache (take 2)
The fsck-cache complains if objects referred to by files in .git/refs/
or objects stored in files under .git/objects/??/ are not found as
stand-alone SHA1 files (i.e.  found in alternate object pools
GIT_ALTERNATE_OBJECT_DIRECTORIES or packed archives stored under
.git/objects/pack).

Although these are good semantics for keeping a single .git/objects
directory consistent as a self-contained set of objects, it is
sometimes useful to consider it OK as long as these "outside" objects
are available.

This commit introduces a new flag, --standalone, to git-fsck-cache.
When it is not specified, connectivity checks and .git/refs pointer
checks are taught that it is OK when expected objects do not exist under
.git/objects/?? hierarchy but are available from a packed archive or in
an alternate object pool.

Another new flag, --full, makes git-fsck-cache check not only the
current GIT_OBJECT_DIRECTORY but also objects found in alternate object
pools and packed GIT archives.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28 15:17:12 -07:00
Junio C Hamano
c4584ae3fd [PATCH] Remove "delta" object representation.
Packed delta files created by git-pack-objects seem to be the
way to go, and existing "delta" object handling code has exposed
the object representation details to too many places.  Remove it
while we refactor code to come up with a proper interface in
sha1_file.c.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:27:51 -07:00
Linus Torvalds
e4bcaac17e Don't ignore reachability of tag objects in fsck
We used to ignore unreachable tags, which just causes problems: it makes
"git prune" leave them around, but since we'll have pruned everything
that tag points to, the tag object really should be removed too.

So remove the code that made us think tags were always reachable.
2005-06-22 19:06:34 -07:00
Linus Torvalds
477606f57d git-fsck-cache: complain if no default references found
2005-06-05 09:55:27 -07:00
Linus Torvalds
bd1e17e245 Make "parse_object()" also fill in commit message buffer data.
And teach fsck to free it to save memory.
2005-05-25 19:26:28 -07:00
Nicolas Pitre
d1af002dc6 [PATCH] delta check
This adds knowledge of delta objects to fsck-cache and various object
parsing code.  A new switch to git-fsck-cache is provided to display the
maximum delta depth found in a repository.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-20 15:41:45 -07:00
Linus Torvalds
7c4d07c7cc fsck-cache: fix segfault on nonexistent referenced object
Noted by Frank Sorenson and Petr Baudis, patch rewritten by me.
2005-05-20 07:49:17 -07:00
Alexey Nezhdanov
667bb59b2d [PATCH] cleanup of in-code names
Fixes all in-code names that were left over from the "big name change".

Signed-off-by: Alexey Nezhdanov <snake@penza-gsm.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-19 10:52:00 -07:00
Linus Torvalds
e7bd907db6 fsck-cache: read the default reference information even when
not doing reachability analysis.

This avoids the dangling head problem, and means that just a
plain "git-fsck-cache" with no parameters will DTRT.
2005-05-18 10:19:59 -07:00
Linus Torvalds
1024932f01 fsck-cache: walk the 'refs' directory if the user doesn't give any
explicit references for reachability analysis.

We already had that as separate logic in git-prune-script, so this
is not a new special case - it's an old special case moved into
fsck, making normal usage be much simpler.
2005-05-18 10:16:14 -07:00
Junio C Hamano
a4f35a2dc0 Notice tree objects with duplicate entries.
This is a follow-up fix to the earlier "Notice index that has
path and path/file and refuse to write such a tree" patch.
With this fix, git-fsck-cache complains if a tree object stores
more than one entry with the same name.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-05-07 14:43:32 -07:00
Junio C Hamano
ace1534d6f Introduce SHA1_FILE_DIRECTORIES to support multiple object databases.
SHA1_FILE_DIRECTORIES environment variable is a colon-separated list of paths
used when looking for SHA1 files not found in the usual place for
reading.  Creating a new SHA1 file does not use this alternate object
database location mechanism.  This is useful to archive older, rarely
used objects into separate directories.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-05-07 00:38:04 -07:00
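The read-side lookup described in ace1534d6f could be sketched in Python like this. find_object() is a hypothetical helper and the ??/?{38} file layout is assumed; the colon-separated string stands in for the value of SHA1_FILE_DIRECTORIES.

```python
import os

def find_object(sha1_hex, object_dir, alternates=""):
    """Look for a SHA1 file in the usual place first, then in each alternate.

    `alternates` is a colon-separated list of directories; empty items are skipped.
    """
    candidates = [object_dir] + [p for p in alternates.split(":") if p]
    for directory in candidates:
        path = os.path.join(directory, sha1_hex[:2], sha1_hex[2:])
        if os.path.exists(path):
            return path
    return None
```

A caller would pass something like os.environ.get("SHA1_FILE_DIRECTORIES", "") as the third argument. Writing new objects is deliberately left out, matching the commit's note that creation does not use the alternate locations.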
Linus Torvalds
42ea9cb286 Be more careful about tree entry modes.
The tree object parsing used to get the executable bit wrong,
and didn't know about symlinks. Also, fsck really wants the
full mode value so that it can verify the other bits for sanity,
so save it all in struct tree_entry.
2005-05-05 16:18:48 -07:00
Linus Torvalds
d0d0d0bd3c Merge http://members.cox.net/junkio/git-jc.git/
2005-05-04 18:18:40 -07:00
Linus Torvalds
770896e548 Teach fsck-cache to accept non-commits for reachability analysis.
In particular, teach it about tags. Also, to make reachability actually
work for tags, we need to add the ref to the tagged object.
2005-05-04 17:03:09 -07:00
Junio C Hamano
ae7c0c92c0 Git-prune-script loses blobs referenced from an uncommitted cache.
(updated from the version posted to GIT mailing list).

When a new blob is registered with update-cache, and before the cache
is written as a tree and committed, git-fsck-cache will find the blob
unreachable.  This patch adds a new flag, "--cache" to git-fsck-cache,
with which it keeps such blobs from being considered "unreachable".

The git-prune-script is updated to use this new flag.  At the same time
it adds .git/refs/*/* to the set of default locations to look for heads,
which should be consistent with expectations from Cogito users.

Without this fix, "diff-cache -p --cached" after git-prune-script has
pruned the blob object will fail mysteriously and git-write-tree would
also fail.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-05-04 01:33:33 -07:00
Linus Torvalds
92d4c85d24 fsck-cache: fix SIGSEGV on bad tag object
fsck_tag() fails to notice that the parsing of the tag may
have failed in the parse_object() call on the object that it
is tagging.

Noticed by Junio.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-03 07:57:56 -07:00
Linus Torvalds
aa03413467 fsck-cache: report broken links correctly
We reported the type of the missing object incorrectly: we reported it as
the type of the referrer object, not the object that was referred to.
2005-05-02 21:10:54 -07:00
Linus Torvalds
8500349208 Make fsck-cache do better tree checking.
We check the ordering of the entries, and we verify that none
of the entries has a slash in it (this allows us to remove the
hacky "has_full_path" member from the tree structure, since we
now just test it by walking the tree entries instead).
2005-05-02 16:13:18 -07:00
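The two tree checks named in 8500349208 can be sketched in Python as below. tree_problems() is a hypothetical helper working on a plain list of entry names, and it assumes a simple byte-order comparison; any special ordering rule for directories is deliberately ignored here.

```python
def tree_problems(names):
    """Report entries containing a slash and entries that are out of order."""
    problems = []
    for name in names:
        if "/" in name:
            problems.append(f"entry {name!r} contains a slash")
    # Walk adjacent pairs; each entry must not sort before its predecessor.
    for prev, cur in zip(names, names[1:]):
        if prev.encode() > cur.encode():
            problems.append(f"entry {cur!r} is out of order after {prev!r}")
    return problems
```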
Linus Torvalds
7e8c174a97 fsck-cache: sort entries by inode number
This improves the cold-cache behaviour on most filesystems,
since it makes the fsck access patterns more regular on
the disk, rather than seeking back and forth.

Note the "most". Not all filesystems have any relationship
between inode number and location on disk.
2005-05-02 09:06:33 -07:00
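The idea in 7e8c174a97 can be sketched in Python as follows: collect the directory entries first, then visit them in inode order. entries_by_inode() is a hypothetical helper, and whether inode order helps at all depends on the filesystem, as the commit message notes.

```python
import os

def entries_by_inode(directory):
    """Return the file names in `directory`, ordered by inode number."""
    with os.scandir(directory) as it:
        entries = [(entry.inode(), entry.name) for entry in it]
    entries.sort()  # sorts on the inode number first
    return [name for _, name in entries]
```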
Linus Torvalds
3c249c9506 Add "get_sha1()" helper function.
This allows the programs to use various simplified versions of
the SHA1 names, eg just say "HEAD" for the SHA1 pointed to by
the .git/HEAD file etc.

For example, this commit has been done with

	git-commit-tree $(git-write-tree) -p HEAD

instead of the traditional "$(cat .git/HEAD)" syntax.
2005-05-01 16:36:56 -07:00
Linus Torvalds
3a6a23e67d Make git-fsck-cache error printouts a bit more informative.
Show the types of objects involved in broken links, and don't bother
warning about unreachable tag files (if somebody cares about tags,
they'll use the --tags flag to see them).
2005-04-30 11:22:26 -07:00
Linus Torvalds
4b18242190 Fix up d_type handling - we need to include <dirent.h> before
we play with the d_type compatibility macros.
2005-04-30 09:59:31 -07:00
Jonas Fonseca
e1a1388d85 [PATCH] git-fsck-cache: Gracefully handle non-commit IDs
Gracefully handle non-commit IDs instead of segfaulting.

Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-29 20:00:40 -07:00
Daniel Barkalow
c418eda493 [PATCH] Rework fsck-cache to use parse_object()
With support for parse_object() and tags in the core, fsck_cache can just
call them, and can be simplified a bit.

Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-28 07:46:33 -07:00
Linus Torvalds
ab7df1874d fsck-cache: show root objects only with "--root"
This makes the default fsck behaviour be quiet for a repository
that doesn't have any problems. Which is good.
2005-04-25 16:34:13 -07:00
Linus Torvalds
889262eacf fsck-cache: only show tags if asked to do so with "--tags"
Normally we don't care, we just check them for being valid tag
objects.
2005-04-25 16:31:13 -07:00
Linus Torvalds
56ce69f7af Make "fsck" also show what the name of the tag object is, not just
the name of the object it tags.

You need this if you actually want to build up a list of tags.
2005-04-25 15:21:49 -07:00
Linus Torvalds
ec4465adb3 Add "tag" objects that can be used to sign other objects.
You use "git-mktag" to create them, and fsck-cache knows how to parse them.
2005-04-25 12:07:44 -07:00
Linus Torvalds
e6948b6d88 fsck-cache: warn about missing commit dates
Now that we have hopefully converted all old archives, we
can consider it an error.
2005-04-24 16:20:53 -07:00
Linus Torvalds
4728b861ac fsck-cache: notice missing "blob" objects.
We should _not_ mark a blob object "parsed" just because we
looked it up: it gets marked that way only once we've actually
seen it. Otherwise we can never notice a missing blob.
2005-04-24 14:10:55 -07:00
Linus Torvalds
d98b46f8d9 Do SHA1 hash _before_ compression.
And add a "convert-cache" program to convert from old-style
to new-style.
2005-04-20 01:10:46 -07:00
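What d98b46f8d9 buys can be shown with a small Python sketch: if the SHA1 is taken over the uncompressed data, the object name no longer depends on how zlib happened to compress it. The "<type> <size>\0" header and both function names are assumptions for illustration.

```python
import hashlib
import zlib

def object_name(obj_type, payload):
    """SHA1 over the uncompressed "<type> <size>\\0<payload>" bytes."""
    header = f"{obj_type} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).hexdigest()

def store(obj_type, payload, level):
    """Return (name, compressed bytes); the name is computed before compressing."""
    header = f"{obj_type} {len(payload)}".encode() + b"\0"
    return object_name(obj_type, payload), zlib.compress(header + payload, level)

data = b"hello\n" * 100
name_fast, packed_fast = store("blob", data, 1)
name_best, packed_best = store("blob", data, 9)
print(name_fast == name_best)
```

Hashing the compressed bytes instead would tie the name to the compression level and zlib version, which is why existing archives needed the "convert-cache" step.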
Linus Torvalds
f43b8abc6f Make fsck-cache print the object type for unreachable objects.
This got lost when I updated to Daniel's new object model.
2005-04-18 17:35:31 -07:00
Linus Torvalds
b51ad43140 Merge the new object model thing from Daniel Barkalow
This was a real git merge with conflicts. I'll commit the scripts I used
to do the merge next.

Not pretty, but it's half-way functional.
2005-04-18 12:12:00 -07:00
Daniel Barkalow
ff5ebe39b0 [PATCH] Port fsck-cache to use parsing functions
This ports fsck-cache to use parsing functions. Note that performance
could be improved here by only reading each object once, but this requires
somewhat more complicated flow control.

Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-18 11:39:48 -07:00
Linus Torvalds
08ca0b04ba Make the revision tracking track the object types too.
This allows fsck to verify not just that an object exists, but
also that it has the type it was expected to have.
2005-04-17 16:19:32 -07:00
Linus Torvalds
01796b0e91 Make "revision.h" slightly better to use.
 - mark_reachable() can be more generic, marking the reachable revisions
   with an arbitrary mask.
 - date parsing will parse to a date of 0 rather than ULONG_MAX for the
   bad old case, sorting the dates correctly.
2005-04-17 12:07:00 -07:00
Linus Torvalds
458754a9fe Use common "revision.h" header for both fsck and rev-tree.
It's really a very generic thing: the notion of one sha1 revision
referring to another one. "fsck" uses it for all nodes, and "rev-tree"
only tracks commit-node relationships, but the code was already
the same - now we just make that explicit by moving it to a common
header file.
2005-04-13 21:37:59 -07:00
Linus Torvalds
bcee6fd8e7 Make 'fsck' able to take an arbitrary number of parents on the
command line.

"arbitrary" is a bit wrong, since it is limited by the argument
size limit (128kB or so), but let's see if anybody ever cares.
Arguably you should prune your tree before you have a few thousand
dangling heads in your archive.

We can fix it by passing in a file listing if we ever care.
2005-04-13 16:42:09 -07:00
Linus Torvalds
2845dbe4a4 Make fsck reachability avoid doing unnecessary work for
parents that we reach multiple ways.

This doesn't matter right now. It _will_ matter once we have
complex revision graphs.
2005-04-13 12:35:08 -07:00
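The saving described in 2845dbe4a4 amounts to not re-walking a parent that has already been marked. A minimal Python sketch, assuming the revision graph is given as a dict from each revision to its parents (mark_reachable() here is a hypothetical stand-in, not the C function):

```python
def mark_reachable(parents, heads):
    """Return the set of revisions reachable from `heads`.

    `parents` maps a revision name to an iterable of its parent names.
    """
    reachable = set()
    stack = list(heads)
    while stack:
        node = stack.pop()
        if node in reachable:
            continue  # already reached another way; skip the repeated work
        reachable.add(node)
        stack.extend(parents.get(node, ()))
    return reachable
```

In a diamond-shaped history the shared ancestor is expanded once rather than once per path, and anything in the graph but outside the returned set is a candidate for an "unreachable" report.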
Linus Torvalds
d9839e0305 Make "fsck-cache" use the same revision tracking structure as "rev-tree".
This makes things a lot more efficient, and makes it trivial to do things
like reachability analysis.

Add command line flags to tell what the head is, and whether to warn
about unreachable objects.
2005-04-13 09:57:30 -07:00