command line.
"arbitrary" is a bit wrong, since it is limited by the argument
size limit (128kB or so), but let's see if anybody ever cares.
Arguably you should prune your tree before you have a few thousand
dangling heads in your archive.
We can fix it by passing in a file listing if we ever care.
This makes things a lot more efficient, and makes it trivial to do things
like reachability analysis.
Add command line flags to tell what the head is, and whether to warn
about unreachable objects.
Changes diff-tree output format so that fields are separated by tabs instead of
spaces (readability, parseability), and tree entry type is listed along with the
entry (avoids having to figure that out from the mode in the scripts).
This is what my scripts expect.
Signed-off-by: Petr Baudis <pasky@ucw.cz>
It seems like the nsec portability is limited; in particular, older
glibcs (<=2.2.4 at least) don't seem to like it. So access the nsec
fields in struct stat only when compiled with -DNSEC.
Signed-off-by: Petr Baudis <pasky@ucw.cz>
The ancient cat-file command used to leave temp_git_file_* files behind,
and there was support for removing them in the clean target of the
Makefile. I do not think it is needed anymore.
From: Junio C Hamano <junkio@cox.net>
Signed-off-by: Petr Baudis <pasky@ucw.cz>
Now there is error() for "library" errors and die() for fatal "application"
errors. usage() is now used strictly for usage errors.
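A minimal sketch of how the three helpers could be split. The message
prefixes, the -1 return value and the exit status are assumptions for
illustration, not taken from the patch.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Shared formatter: prints "<prefix>: <message>\n" to stderr. */
static void report(const char *prefix, const char *fmt, va_list ap)
{
	fprintf(stderr, "%s: ", prefix);
	vfprintf(stderr, fmt, ap);
	fputc('\n', stderr);
}

/* "Library" error: report it and return -1 (assumed) so the caller decides. */
int error(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	report("error", fmt, ap);
	va_end(ap);
	return -1;
}

/* Fatal "application" error: report it and exit (status 1 is assumed). */
void die(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	report("fatal", fmt, ap);
	va_end(ap);
	exit(1);
}

/* Usage errors only: print the usage string and exit. */
void usage(const char *msg)
{
	fprintf(stderr, "usage: %s\n", msg);
	exit(1);
}
```

With this split a library routine can do "return error(...)" and leave
the decision to bail out to the application.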
Signed-off-by: Petr Baudis <pasky@ucw.cz>
The nsec field of ctime/mtime is now checked only with -DNSEC defined during
compilation. nsec acts broken since it is stored in the icache but apparently
just gets reset to zero when flushed to a filesystem not supporting it (e.g.
ext3), creating illusions of false changes. At least that's my impression.
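Roughly, the comparison could look like the sketch below. The struct and
function names are hypothetical stand-ins, not the real cache entry
layout; only the -DNSEC guard comes from this change.

```c
#include <stdbool.h>

/* Hypothetical, simplified cache entry: only the time fields matter here. */
struct cached_time {
	unsigned int sec;
	unsigned int nsec;
};

struct cached_entry {
	struct cached_time ctime;
	struct cached_time mtime;
};

/* Returns true if the on-disk times differ from the cached ones. */
bool times_changed(const struct cached_entry *cached,
		   const struct cached_entry *on_disk)
{
	if (cached->mtime.sec != on_disk->mtime.sec ||
	    cached->ctime.sec != on_disk->ctime.sec)
		return true;
#ifdef NSEC
	/* Nanoseconds are compared only when built with -DNSEC. */
	if (cached->mtime.nsec != on_disk->mtime.nsec ||
	    cached->ctime.nsec != on_disk->ctime.nsec)
		return true;
#endif
	return false;
}
```

Without -DNSEC, an entry whose nsec got zeroed by the filesystem still
compares as unchanged.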
Signed-off-by: Petr Baudis <pasky@ucw.cz>
When compiled with -DCOLLISION_CHECK, we will check against SHA1
collisions when writing to the object database.
From: Christopher Li <chrislgit@chrisli.org>
Signed-off-by: Petr Baudis <pasky@ucw.cz>
The ls-tree tool just provides a way to export the binary tree objects
to a usable text format. This is bound to be useful in a variety
of scripts, although none of those I have currently uses it.
But e.g. the simple script I've sent to HPA for purging the object
database uses it.
Signed-off-by: Petr Baudis <pasky@ucw.cz>
My convention is that, unlike files trimmed to zero size,
deleted files always go to /dev/null. This patch makes show-diff
abide by this.
Signed-off-by: Petr Baudis <pasky@ucw.cz>
This patch adds a -s flag for show-diff, which will suppress the
actual diffing. This is useful for my scripts when they just want
to see what needs to be updated in the cache.
Signed-off-by: Petr Baudis <pasky@ucw.cz>
Also, add "date" information to the output so that you can do something
like this:
	rev-tree `cat .git/HEAD` | sort -nr | cut -d' ' -f2 | while read i; do cat-file commit $i; done
which basically becomes a "git log" (aka "git changes") where things are
sorted by time.
To do the automated commit-mailing I need to be able to answer the
question "which commits are here today but weren't yesterday"... i.e.
given two commit-ids $HEAD and $YESTERDAY I want to be able to do:
	rev-tree $HEAD ^$YESTERDAY
to list those commits which are in the tree now but weren't
ancestors of yesterday's head.
Yes, I could probably do this with
	rev-tree $HEAD $YESTERDAY | egrep -v ^[a-z0-9]*:3
but I prefer not to.
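The egrep filter drops entries whose reachability is 3, i.e. commits
reachable from both arguments. A sketch of the stricter test that
"$HEAD ^$YESTERDAY" asks for, assuming rev-tree's <sha1>:<reachability>
bitmask output with bit 0 for the first argument and bit 1 for the
second (the function name is hypothetical):

```c
#include <stdbool.h>

/*
 * Assumed bit numbering: bit 0 = first commit on the command line
 * ($HEAD), bit 1 = second commit ($YESTERDAY).
 */
bool new_since(unsigned int reach)
{
	/* Reachable from $HEAD, but not an ancestor of $YESTERDAY. */
	return (reach & 1u) && !(reach & 2u);
}
```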
The idea is to use show-diff to generate the patch, including
deleted and new files (in the next patch), so we don't have to
do the temporary-new-file diff dance in the script.
The cache index now contains enough information to generate
the whole patch, so the GIT SCM doesn't need a separate command
to check out a file for editing or deletion. Just do the edit
and the remove, and GIT will generate the correct patch.
You still have to tell GIT about new files.
I looked a bit at my old BK tools for the same thing, but they were
just so horrid in many ways that I largely rewrote it all and these
tools do things a bit differently. Instead of aggressively piping
data from one process to another (which was clever but very hard
to follow), this first just splits out the mbox into many smaller
email files, and then runs some scripts on these temporary files.
change. Promise.
It now always outputs all the revisions as <sha1>:<reachability>, where the
reachability is the bitmask of how that revision was reachable from the
commits in the argument list.
Trivially, if there is only one commit, the reachability will always be
(1 << 0) == 1 for all reachable revisions, and there won't be any edges
(so the "--edges" flag only makes sense with multiple commit keys).
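A sketch of how such a bitmask can be computed, assuming a hypothetical
in-memory commit array where parents are stored as indices. The struct
layout and function names are illustrative only; the real tool walks
commit objects on disk.

```c
#include <stddef.h>

#define MAX_PARENTS 2

/* Hypothetical commit: parents are array indices, -1 means "none". */
struct commit {
	int parent[MAX_PARENTS];
	unsigned int reach;	/* bit n set => reachable from argument n */
};

/* Mark commit `idx` and all of its ancestors with `mask`. */
void mark_reachable(struct commit *commits, int idx, unsigned int mask)
{
	int i;

	if (idx < 0 || (commits[idx].reach & mask) == mask)
		return;		/* no such commit, or already marked */
	commits[idx].reach |= mask;
	for (i = 0; i < MAX_PARENTS; i++)
		mark_reachable(commits, commits[idx].parent[i], mask);
}

/* Walk from each commit in the argument list, using bit (1 << n). */
void compute_reachability(struct commit *commits, const int *heads,
			  size_t nr_heads)
{
	size_t n;

	for (n = 0; n < nr_heads; n++)
		mark_reachable(commits, heads[n], 1u << n);
}
```

With two heads sharing history, the shared ancestors end up with
reachability 3 and each head's private commits with 1 or 2.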
or more commit points.
This is important both to know what the difference between two commit
points is, and to figure out where to try to merge from.
Slight change of output format: it now lists all parents on the same line.
This allows it to work on initial commits too (which have no parents), and
also makes the output format a lot more intuitive.
tree graph.
It's quite fast when the commit-objects are cached, but since
it has to walk every single commit-object, it also allows you
to cache an old state and just add on top of that.
After all, if you want to not allow others to read your
stuff, set your "umask" appropriately or make sure the
parent directories aren't readable/executable.
refresh the "stat" information.
We need this after having done a "read-tree", for example, when the
stat information does not match the checked-out tree, and we want to
start getting efficient cache matching against the parts of the tree
that are already up-to-date.
No, this doesn't make them easy to use, but makes diff-tree use
the "-r" flag for "recursive" (not "-R") and makes commit-tree
use AUTHOR_xxx environment variables (not COMMITTER_xxx) to match what
it actually does.
properly clear the reference count at init time. It happened to work
for me by pure luck.
Until it broke, and my unreferenced commit suddenly looked referenced
again. Fixed.
Which made fsck very quiet about objects it hadn't found. So add
it.
We'll need to make things like these optional, because it's
perfectly ok to have partial history if you don't want it,
and don't want to go backwards. But for development, it's best
to always complain about missing sha1 object files that are
referenced from somewhere else.
This shows that I've lost track of one commit already. Most likely
because I forgot to update the .dircache/HEAD file when doing a
commit, so that the next commit referenced not the top-of-tree, but
the one older commit.
Having dangling commits is fine (in fact, you should always have
at least _one_ dangling commit in the top-of-tree). But it's
good to know about them.
Also make the return value of "cache_name_pos()" be sane: positive
or zero if we found it (it's the index into the cache array), and
"-pos-1" to indicate where it should go if we didn't.
And, perhaps more importantly, fix the fact that if a filename changed from a
directory to a file (or vice versa), we must consider it a delete and an add,
not a "filechange".
During original development I had different name-bases for source and
destination, so that I could make the output show how it got removed
from "tree a" and added to "tree b", but we don't want that. We only
do recursive diffs on anything where the bases are exactly the same,
so we might as well just work with a single base.
Also, make the output for "changed" be a single line, since people
hated the separate '<' / '>' format. They were right. It sucked.
It now requires the "--add" flag before you add any new files, and
a "--remove" flag if you want to mark files for removal. And giving
it the "--refresh" flag makes it just update all the files that it
already knows about.
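A simplified sketch of parsing these three flags. It only handles flags
that come before the first path, and the struct and function names are
hypothetical; the real argument loop may differ.

```c
#include <string.h>

struct update_opts {
	int allow_add;		/* "--add" seen */
	int allow_remove;	/* "--remove" seen */
	int refresh;		/* "--refresh" seen */
};

/*
 * Parse leading "--" flags. Returns the index of the first non-flag
 * argument (a path), or -1 on an unknown flag.
 */
int parse_flags(int argc, const char **argv, struct update_opts *opts)
{
	int i;

	memset(opts, 0, sizeof(*opts));
	for (i = 1; i < argc; i++) {
		const char *arg = argv[i];

		if (strncmp(arg, "--", 2))
			break;
		if (!strcmp(arg, "--add"))
			opts->allow_add = 1;
		else if (!strcmp(arg, "--remove"))
			opts->allow_remove = 1;
		else if (!strcmp(arg, "--refresh"))
			opts->refresh = 1;
		else
			return -1;
	}
	return i;
}
```

A caller would then refuse to add an unknown file unless allow_add is
set, and likewise for removals.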
It's got some debugging printouts etc still in it, but testing on the
kernel seems to show that it does indeed fix the issue with huge tree
files for each commit.
This is totally untested, since we can't actually _write_ things that
way yet, but I'll get to that next, I hope. That should fix the
huge wasted space for kernel-sized tree objects.