Minor documentation improvements, as suggested on the Git mailing
list by Horst H. von Brand and Karl Hasselström.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
I wrote this documentation with asciidoc 7.1.2, but apparently
asciidoc 8 assumes ^ means superscript. The solution was already
documented in rev-parse's manpage and is to use {caret} instead.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
There has been some informative lessons learned in the gfi user
community, and these really should be written down and documented
for future generations of frontend developers.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Apparently fast-import used to die a horrible death if we
were unable to open the marks file for output. This is
slightly less than ideal, especially now that we dump
the marks as part of the `checkpoint` command.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If the frontend asks us to checkpoint (via the explicit checkpoint
command) its probably because they are afraid the current import
will crash/fail/whatever and want to make sure they can pickup from
the last checkpoint. To do that sort of recovery, we will need the
current tip of every branch and tag available at the next startup.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Often users will be running fast-import from within a larger frontend
process, and this may be a frequent periodic tool such as a future
edition of `git-svn fetch`. We don't want to bombard users with our
large stats output if they won't be interested in it, so `--quiet`
is now an option to make gfi be more silent.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Some frontends may not be able to (easily) keep track of which files
are included in the branch, and which aren't. Performing this
tracking can be tedious and error prone for the frontend to do,
especially if its foreign data source cannot supply the changed
path list on a per-commit basis.
fast-import now allows a frontend to request that a branch's tree
be wiped clean (reset to the empty tree) at the start of a commit,
allowing the frontend to feed in all paths which belong on the branch.
This is ideal for a tar-file importer frontend, for example, as
the frontend just needs to reformat the tar data stream into a gfi
data stream, which may be something a few Perl regexps can take
care of. :)
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
As discussed on the mailing list, the documentation used here was
not quite accurate. Improve upon it.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If fast-import is being used to update an existing branch of
a repository, the user may not want to lose commits if another
process updates the same ref at the same time. For example, the
user might be using fast-import to make just one or two commits
against a live branch.
We now perform a fast-forward check during the ref updating process.
If updating a branch would cause commits in that branch to be lost,
we skip over it and display the new SHA1 to standard error.
This new default behavior can be overridden with `--force`, like
git-push and git-fetch.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Since some frontends may be working with source material where
the dates are only readily available as RFC 2822 strings, it is
more friendly if fast-import exposes Git's parse_date() function
to handle the conversion. This way the frontend doesn't need
to perform the parsing itself.
The new --date-format option to fast-import can be used by a
frontend to select which format it will supply date strings in.
The default is the standard `raw` Git format, which fast-import
has always supported. Format rfc2822 can be used to activate the
parse_date() function instead.
Because fast-import could also be useful for creating new, current
commits, the format `now` is also supported to generate the current
system timestamp. The implementation of `now` is a trivial call
to datestamp(), but is actually a whole whopping 3 lines so that
fast-import can verify the frontend really meant `now`.
As part of this change I have added validation of the `raw` date
format. Prior to this change fast-import would accept anything
in a `committer` command, even if it was seriously malformed.
Now fast-import requires the '> ' near the end of the string and
verifies the timestamp is formatted properly.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Corrected a couple of header markup lines which were shorter than the
actual header, and made the `data` commands two formats into a named
list, which matches how we document the two formats of the `M` command
within a commit.
Also tried to simplify the language about our decimal integer format;
Linus pointed out I was probably being too specific at the cost of
reduced readability.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
There is no need to check for a NULL pointer before invoking free(),
the runtime library automatically performs this check anyway and
does nothing if a NULL pointer is supplied.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Andy Parkins and Linus Torvalds both noticed that the description
of the timezone was incorrect. Its not expressed in minutes.
Its more like "hhmm", where "hh" is the number of hours and "mm"
is the number of minutes shifted from GMT/UTC.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Junio noticed that I was using a different style in fast-import
for returned pointers than the rest of Git. Before merging this
code into the main git.git tree I'd like to make it consistent,
as this style variation was not intentional.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Junio noticed these warnings/errors in fast-import when compiling
with `-Werror -ansi -pedantic`. A few changes are to reduce compiler
warnings, while one (in cmd_merge) is a bug fix.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The --branch-log option and its associated code hasn't been used in
several months, as its not really very useful for debugging fast-import
or a frontend. I don't plan on supporting it in this state long-term,
so I'm killing it now before it gets distributed to a wider audience.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This is a first pass at the manpage for git-fast-import.
I have tried to cover the input format in extreme detail, creating a
reference which is more detailed than the BNF grammar appearing in
the header of fast-import.c. I have also covered some details about
gfi's performance and memory utilization, as well as the average
learning curve required to create a gfi frontend application (as it
is far lower than it might appear on first glance).
The documentation still lacks real example input streams, which may
turn out to be difficult to format in asciidoc due to the blank lines
which carry meaning within the format.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The current implementation of shell-style quoted refnames and
SHA-1 expressions within fast-import contains a bad memory leak.
We leak the unquoted strings used by the `from` and `merge`
commands, maybe others. Its also just muddling up the docs.
Since Git refnames cannot contain LF, and that is our delimiter
for the end of the refname, and we accept any other character
as-is, there is no reason for these strings to support quoting,
except to be nice to frontends. But frontends shouldn't be
expecting to use funny refs in Git, and its just as simple to
never quote them as it is to always pass them through the same
quoting filter as pathnames. So frontends should never quote
refs, or ref expressions.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Some structs are allocated rather frequently, but were using integer
types which were far larger than required to actually store their
full value range.
As packfiles are limited to 4 GiB we don't need more than 32 bits to
store the offset of an object within that packfile, an `unsigned long`
on a 64 bit system is likely a 64 bit unsigned value. Saving 4 bytes
per object on a 64 bit system can add up fast on any sizable import.
As atom strings are strictly single components in a path name these
are probably limited to just 255 bytes by the underlying OS. Going
to that short of a string is probably too restrictive, but certainly
`unsigned int` is far too large for their lengths. `unsigned short`
is a reasonable limit.
Modes within a tree really only need two bytes to store their whole
value; using `unsigned int` here is vast overkill. Saving 4 bytes
per file entry in an active branch can add up quickly on a project
with a large number of files.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This command isn't encouraged (as its slow) but it does exist and
is accepted, so it still should be covered in the BNF.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
git-fast-import requires use of inttypes.h, but the master branch has
added it to git-compat-util differently than git-fast-import originally
had used it. This merge back of master to the fast-import topic is to
get (and use) inttypes.h the way master is using it.
This is a partially evil merge to remove the call to setup_ident(),
as the master branch now contains a change which makes this implicit
and therefore removed the function declaration. (commit 01754769).
Conflicts:
git-compat-util.h
Fix blameview to use git-cat-file to read the file content.
This make sure we show the right content when we have modified
file in the working directory which is not committed.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
... with:
$ git fetch ${remote} HEAD
Also
$ git fetch ${remote} :${localref}
worked, but
$ git fetch ${remote} HEAD:{localref}
didn't. Now both are equivalent.
Signed-off-by: Santi Béjar <sbejar@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
rfc2047 unquoter spitted out an annoying "- unquoted" which was
added during debugging but I forgot to remove.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The earlier change df391b192 to rename fsck-objects to fsck broke
fsck-objects. This should fix it again.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
We used to use lock_any_ref_for_update() because the command
needs to also update HEAD (which is not under refs/, so
lock_ref_sha1() cannot be used). The function however did not
check for refs with illegal characters in them.
Use check_ref_format() to catch malformed refs. For this check,
we specifically do not want to say having less than two levels
in the name is illegal to allow HEAD (and perhaps other special
refs in the future).
Signed-off-by: Junio C Hamano <junkio@cox.net>
I know it's only an example, but having this might save someone else the
trouble of writing an enhanced version for themselves.
It basically does the same job as the old update hook, but with these
differences:
* The recipients list is read from the repository config file from
hooks.mailinglist
* Updating unannotated tags can be allowed by setting
hooks.allowunannotated
* Announcement emails (via annotated tag creation) can be sent to a
different mailing list by setting hooks.announcelist
* Output email is more verbose and generates specific content depending
on whether the ref is a tag, an annotated tag, a branch, or a
tracking branch
* The email is easier to filter; the subject line is prefixed with
[SCM] and a project description pulled from the "description" file
* It catches (and displays differently) branch updates that are
performed with a --force
Obviously, it's nothing that clever - it's the update hook I use on my
repositories but I've tried to keep it general, and tried to make the
output always relevant to the type of update.
Signed-off-by: Andy Parkins <andyparkins@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
I got bitten because in the UK (where one would expect 1970-01-01 00:00
to be UTC 0) some politicians decided to mess around with daylight
savings time from 1968 to 1971; it was permanently BST (+0100). That
means that on my computer the following is true:
$ date --date="1970-01-01 00:00" +"%F %T %z (%Z)"
1970-01-01 00:00:00 +0100 (BST)
This of course means that the --date argument to date is specified in
local time, not UTC. So when the hooks--update script does this:
date=$(date --date="1970-01-01 00:00:00 $ts seconds")
It's actually saying (in my timezone) "1970-01-01 01:00:00 UTC" + $ts.
Clearly this is wrong. The UNIX epoch started at midnight UTC not 1am
UTC.
This leads to the tagged time in hooks--update being shown as one hour
earlier than the true tagged time (in my timezone). The problem would
be worse for other timezones. For a +1300 timezone on 1970-01-01, the
tagged time would be 13 hours earlier. Oops.
The solution is to force the reference time to UTC, which is what this
patch does. In my timezone:
$ date --date="1970-01-01 00:00 +0000" +"%F %T %z (%Z)"
1970-01-01 01:00:00 +0100 (BST)
Much better.
Signed-off-by: Andy Parkins <andyparkins@gmail.com>
Love it or hate it, some people actually still program in Tcl. Some
of those programs are meant for interfacing with Git. Programs such as
gitk and git-gui. It may be useful to have Tcl-safe output available
from for-each-ref, just like shell, Perl and Python already enjoy.
Thanks to Sergey Vlasov for pointing out the horrible flaws in the
first and second version of this patch, and steering me in the right
direction for Tcl value quoting.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This allows pushing over the git:// protocol, and while it's not
authenticated, it could make sense from within a firewalled
setup where nobody but trusted internal people can reach the git
port. git-daemon is possibly easier and faster to set up in the
kind of situation where you set up git instead of CVS inside a
company.
"git-receive-pack" is disabled by default, so you need to enable it
explicitly by starting git-daemon with the "--enable=receive-pack"
command line argument, or by having your config enable it automatically.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* The description of valid colour specifications was rather
incomplete, so fix it so that it actually describes colour specs as
accepted by color_parse().
* The list of colour items allowed in color.diff.BLAH was missing the
`commit' and `whitespace' entries.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Starting a pager defeats the purpose of the incremental output
mode. This changes git-blame to only paginate if --incremental
was not given.
git -p blame --incremental still starts the pager, though.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
My prior change to git-describe attempts to print the distance
between the input commit and the best matching tag, but this distance
was usually only an estimate as we always aborted revision walking
as soon as we overflowed the configured limit on the number of
possible tags (as set by --candidates).
Displaying an estimated distance is not very useful and can just be
downright confusing. Most users (heck, most Git developers) don't
immediately understand why this distance differs from the output
of common tools such as `git rev-list | wc -l`. Even worse, the
estimated distance could change in the future (including decreasing
despite no rebase occuring) if we find more possible tags earlier
on during traversal. (This could happen if more tags are merged
into the branch between queries.) These factors basically make an
estimated distance useless.
Fortunately we are usually most of the way through an accurate
distance computation by the time we abort (due to reaching the
current --candidates limit). This means we can simply finish
counting out the revisions still in our commit queue to present
the accurate distance at the end. The number of commits remaining
in the commit queue is probably less than the number of commits
already traversed, so finishing out the count is not likely to take
very long. This final distance will then always match the output of
`git rev-list | wc -l`.
We can easily reduce the total number of commits that need to be
walked at the end by stopping as soon as all of the commits in the
commit queue are flagged as being merged into the already selected
best possible tag. If that's true then there are no remaining
unseen commits which can contribute to our best possible tag's
depth counter, so further traversal is useless.
Basic testing on my Mac OS X system shows there is no noticable
performance difference between this accurate distance counting
version of git-describe and the prior version of git-describe,
at least when run on git.git.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
If you get two different describes at different
times from a non-rewinding branch and they both come up with the same
tag name, you can tell which is the 'newer' one by distance. This is
rather common in practice, so its incredibly useful.
[jc: still needs documentation and fixups when traversal gives up
early.]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Otherwise a pathname that has funny characters such as LF would
screw up the parsing programs of the output.
Strictly speaking, this is not backward compatible, but the
current output for pathnames that have embedded LF and such
cannot be sanely parsed anyway, and pathnames that only use
characters from the portable pathname character set won't be
affected.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds --incremental option to help GUI porcelains to show
the result from git-blame incrementally. The output gives the
origin information in the same format as the porcelain format.
The first line has commit object name, the line number of the
first line in the group in the original file, the line number of
that file in the final image, and number of lines in the group.
Then subsequent lines show the metainformation for the commit
when the commit is shown for the first time, except the filename
information is always shown (we cannot even make it conditional
to -C option as blame always follows the renaming of the file
wholesale).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Back when only handful commands that created commit and tag were
the only users of committer identity information, it made sense
to explicitly call setup_ident() to pre-fill the default value
from the gecos information. But it is much simpler for programs
to make the call automatic when get_ident() is called these days,
since many more programs want to use the information when updating
the reflog.
Signed-off-by: Junio C Hamano <junkio@cox.net>
In the context of reflog output the reflog message is more useful than
the commit message's first line. When relevant the reflog message
will contain that line anyway.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>