When using git fast-export and git fast-import to rewrite the history
of a repository with large binary files, almost all of the time is
spent dealing with blobs. This is extremely inefficient if all we want
to do is rewrite the commits and tree structure. --no-data skips the
output of blobs and writes SHA-1s instead of marks, which provides a
massive speedup.
Signed-off-by: Geoffrey Irving <irving@naml.us>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* en/fast-export:
fast-export: Document the fact that git-rev-list arguments are accepted
Add new fast-export testcases
fast-export: Add a --tag-of-filtered-object option for newly dangling tags
fast-export: Do parent rewriting to avoid dropping relevant commits
fast-export: Make sure we show actual ref names instead of "(null)"
fast-export: Omit tags that tag trees
fast-export: Set revs.topo_order before calling setup_revisions
* maint:
SunOS grep does not understand -C<n> nor -e
Fix export_marks() error handling.
git branch: clean up detached branch handling
git branch: avoid unnecessary object lookups
git branch: fix performance problem
do_one_ref(): null_sha1 check is not about broken ref
Conflicts:
Makefile
- Don't leak one FILE * on error per export_marks() call. Found with
cppcheck and reported by Martin Ettl.
- Abort the potentially long for(;idnums.size;) loop on write errors.
- Record error if fprintf() fails for reasons not required to set the
stream error indicator, such as ENOMEM.
- Add a trailing full-stop to error message when fopen() fails.
Signed-off-by: Matthias Andree <matthias.andree@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When providing a list of paths to limit what is exported, the object that
a tag points to can be filtered out entirely. This new switch allows
the user to specify what should happen to the tag in such a case. The
default action, 'abort' will exit with an error message. With 'drop', the
tag will simply be omitted from the output. With 'rewrite', if the object
tagged was a commit, the tag will be modified to tag an alternate commit.
The alternate commit is determined by treating the original commit as the
"parent" of the tag and then using the parent rewriting algorithm of the
revision traversal machinery (related to the "--parents" option of "git
rev-list")
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When specifying paths to export, parent rewriting must be turned on for
fast-export to output anything at all.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code expects a ref name to be provided in commit->util. While there
was some code to set commit->util, it only worked in cases where there was
an unbroken chain of revisions from a ref to the relevant commit. In
cases such as running
git fast-export --parents master -- COPYING
commit->util would fail to be set. The old method of setting commit->util
has been removed in favor of requesting show_source from the revision
traversal machinery (related to the "--source" option of "git log" family
of commands.)
However, this change does not fix cases like
git fast export master~1
or
git fast export :/arguments
since in such cases commit->util will be "master~1" or ":/arguments" while
we need the actual ref (e.g. "refs/heads/master")
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit c0582c53bc introduced logic to just
omit tags that point to tree objects. However, these objects were still
being output and were pointing at "mark :0", which caused fast-import to
crash. This patch makes sure such tags (including deeper nestings such
as tags of tags of trees), are omitted.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
setup_revisions sets a variety of flags based on the setting of other
flags, such as setting the limited flag when topo_order is set. To avoid
circumventing any invariants created by setup_revisions, we set
revs.topo_order before calling it rather than after.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lots of die() calls did not actually report the kind of error, which
can leave the user confused as to the real problem. Use die_errno()
where we check a system/library call that sets errno on failure, or
one of the following that wrap such calls:
Function Passes on error from
-------- --------------------
odb_pack_keep open
read_ancestry fopen
read_in_full xread
strbuf_read xread
strbuf_read_file open or strbuf_read_file
strbuf_readlink readlink
write_in_full xwrite
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change calls to die(..., strerror(errno)) to use the new die_errno().
In the process, also make slight style adjustments: at least state
_something_ about the function that failed (instead of just printing
the pathname), and put paths in single quotes.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To give OPT_FILENAME the prefix, we pass the prefix to parse_options()
which passes the prefix to parse_options_start() which sets the prefix
member of parse_opts_ctx accordingly. If there isn't a prefix in the
calling context, passing NULL will suffice.
Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ef/fast-export:
builtin-fast-export.c: handle nested tags
builtin-fast-export.c: fix crash on tagged trees
builtin-fast-export.c: turn error into warning
test-suite: adding a test for fast-export with tag variants
When tags that points to tags are passed to fast-export, an error is given,
saying "Tag [TAGNAME] points nowhere?". This fix calls parse_object() on the
object before referencing it's tag, to ensure the tag-info is fully initialized.
In addition, it inserts a comment to point out where nested tags are handled.
This is consistent with the comment for signed tags.
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a tag object points to a tree (or another unhandled type), the commit-
pointer is left uninitialized and later dereferenced. This patch adds a
default case to the switch that issues a warning and skips the object.
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-import doesn't have a syntax to support tree-objects (and some other
object-types), so fast-export shouldn't handle them. However, aborting the
operation is a bit drastic. This patch turns the error into a warning instead.
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When exporting a subset of commits on a branch that do not go back to a
root commit (e.g. master~2..master), we still want each exported commit to
have the same files in the exported tree as in the original tree.
Previously, when given such a range, we would omit master~2 as a parent of
master~1, but we would still diff against master~2 when selecting the list
of files to include in master~1. This would result in only files that
had changed in the given range showing up in the resulting export. In such
cases, we should diff master~1 against the root instead (i.e. use
diff_root_tree_sha1 instead of diff_tree_sha1).
There's a special case to consider here: incremental exports (i.e. exports
where the --import-marks flag is specified). If master~2 is an imported
mark, then we still want to diff master~1 against master~2 when selecting
the list of files to include.
We can handle all cases, including the special case, by just checking
whether master~2 corresponds to a known object mark when deciding what to
diff against.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-1.6.0:
Make repack less likely to corrupt repository
fast-export: ensure we traverse commits in topological order
Clear the delta base cache if a pack is rebuilt
fast-export will only list as parents those commits which have already
been traversed (making it appear as if merges have been squashed if not
all parents have been traversed). To avoid this silent squashing of
merge commits, we request commits in topological order.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When no tagger was found (old Git produced tags like this),
no "tagger" line is printed (but this is incompatible with the current
git fast-import).
Alternatively, you can pass the option --fake-missing-tagger, forcing
fast-export to fake a tagger
Unspecified Tagger <no-tagger>
with a tag date of the beginning of (Unix) time in the case of a missing
tagger, so that fast-import is still able to import the result.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The list extra_refs contains tags and the objects referenced by them,
so that they can be handled at the end. When a tag references a
commit, that commit is added to the list using the same name.
Also, the function handle_tags_and_duplicates() relies on the order
the items were added to extra_refs, so clearly we do not want to
use a sorted list here.
Noticed by Miklos Vajna.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Tested-by: Miklos Vajna <vmiklos@frugalware.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although it does not matter for Git itself, tools that
export to systems that explicitly track copies and
renames can benefit from such information.
This patch makes fast-export output correct action
logs when -M or -C are enabled.
Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The name path_list was correct for the first usage of that data structure,
but it really is a general-purpose string list.
$ perl -i -pe 's/path-list/string-list/g' $(git grep -l path-list)
$ perl -i -pe 's/path_list/string_list/g' $(git grep -l path_list)
$ git mv path-list.h string-list.h
$ git mv path-list.c string-list.c
$ perl -i -pe 's/has_path/has_string/g' $(git grep -l has_path)
$ perl -i -pe 's/path/string/g' string-list.[ch]
$ git mv Documentation/technical/api-path-list.txt \
Documentation/technical/api-string-list.txt
$ perl -i -pe 's/strdup_paths/strdup_strings/g' $(git grep -l strdup_paths)
... and then fix all users of string-list to access the member "string"
instead of "path".
Documentation/technical/api-string-list.txt needed some rewrapping, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently fast-import/export cannot be used for
repositories with submodules. This patch extends
the relevant programs to make them correctly
process gitlinks.
Links can be represented by two forms of the
Modify command:
M 160000 SHA1 some/path
which sets the link target explicitly, or
M 160000 :mark some/path
where the mark refers to a commit. The latter
form can be used by importing tools to build
all submodules simultaneously in one physical
repository, and then simply fetch them apart.
Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When you misuse a git command, you are shown the usage string.
But this is currently shown in the dashed form. So if you just
copy what you see, it will not work, when the dashed form
is no longer supported.
This patch makes git commands show the dash-less version.
For shell scripts that do not specify OPTIONS_SPEC, git-sh-setup.sh
generates a dash-less usage string now.
Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When printing valuds of type uint32_t, we should use PRIu32, and should
not assume that it is unsigned int. On 32-bit platforms, it could be
defined as unsigned long. The same caution applies to ntohl().
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The export_marks() function iterated over a (potentially sparsely
populated) hashtable, but it accessed it starting from offset 1 and one
element beyond the end.
Noticed by SungHyun Nam.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds the --import-marks and --export-marks to fast-export. These import
and export the marks used to for all revisions exported in a similar fashion
to what fast-import does. The format is the same as fast-import, so you can
create a bidirectional importer / exporter by using the same marks file on
both sides.
Signed-off-by: Pieter de Bie <pdebie@ai.rug.nl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are exporting a commit which has no parents we may be doing
it to a branch that already exists, causing fast-import to assume
the branch's current revision should be the sole parent of the
new commit. This can cause `git fast-export | git fast-import`
to produce an incorrect graph for:
A-------M----o------o refs/heads/master
/
B-+
In this graph A and B are initial commits (no parents) but if A was
output first to refs/heads/master and then B is output fast-import
would assume the graph was this instead:
A-------M----o------o refs/heads/master
\ /
+-B-+
Which would cause B, M, and all later commits to have a different
SHA-1, and obviously be quite a different graph.
Sending a reset command prior to B informs fast-import to clear
the implied parent of A, allowing B to remain an initial commit.
Reported-by: Ben Lynn <benlynn@gmail.com>
Deemed-obviously-correct-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to the git-fast-import man-page, you can only put a single
committish per merge: line, like this:
merge :10
merge :11
However, git-fast-export puts all parents on a single line, like this:
merge :10 :11
This changes fast-export to output a single parent per line. Otherwise
neither git-fast-import nor bzr-fast-import can read its output.
[jc: fix-up to remove excess LF in the output that makes fast-import barf]
Signed-off-by: Pieter de Bie <pdebie@ai.rug.nl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git_config() only had a function parameter, but no callback data
parameter. This assumes that all callback functions only modify
global variables.
With this patch, every callback gets a void * parameter, and it is hoped
that this will help the libification effort.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Documentation/git-am.txt: Pass -r in the example invocation of rm -f .dotest
timezone_names[]: fixed the tz offset for New Zealand.
filter-branch documentation: non-zero exit status in command abort the filter
rev-parse: fix potential bus error with --parseopt option spec handling
Use a single implementation and API for copy_file()
Documentation/git-filter-branch: add a new msg-filter example
Correct fast-export file mode strings to match fast-import standard
The fast-import file format does not expect leading '0' in front
of a file mode; that is we want '100644' and '0100644'.
Thanks to Ian Clatworthy of the Bazaar project for noticing the
difference in output/input.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This change removes all obvious useless if-before-free tests.
E.g., it replaces code like this:
if (some_expression)
free (some_expression);
with the now-equivalent:
free (some_expression);
It is equivalent not just because POSIX has required free(NULL)
to work for a long time, but simply because it has worked for
so long that no reasonable porting target fails the test.
Here's some evidence from nearly 1.5 years ago:
http://www.winehq.org/pipermail/wine-patches/2006-October/031544.html
FYI, the change below was prepared by running the following:
git ls-files -z | xargs -0 \
perl -0x3b -pi -e \
's/\bif\s*\(\s*(\S+?)(?:\s*!=\s*NULL)?\s*\)\s+(free\s*\(\s*\1\s*\))/$2/s'
Note however, that it doesn't handle brace-enclosed blocks like
"if (x) { free (x); }". But that's ok, since there were none like
that in git sources.
Beware: if you do use the above snippet, note that it can
produce syntactically invalid C code. That happens when the
affected "if"-statement has a matching "else".
E.g., it would transform this
if (x)
free (x);
else
foo ();
into this:
free (x);
else
foo ();
There were none of those here, either.
If you're interested in automating detection of the useless
tests, you might like the useless-if-before-free script in gnulib:
[it *does* detect brace-enclosed free statements, and has a --name=S
option to make it detect free-like functions with different names]
http://git.sv.gnu.org/gitweb/?p=gnulib.git;a=blob;f=build-aux/useless-if-before-free
Addendum:
Remove one more (in imap-send.c), spotted by Jean-Luc Herren <jlh@gmx.ch>.
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A failure in prepare_revision_walk can be caused by
a not parseable object.
Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Writing 1 elements of size 0-byte successfully will cause fwrite(3) to
return 0, and flagging it as error is a mistake.
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The name 'verbatim' describes much better what this mode does with
signed tags. While at it, fix the documentation what it actually
does.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This program dumps (parts of) a git repository in the format that
fast-import understands.
For clarity's sake, it does not use the 'inline' method of specifying
blobs in the commits, but builds the blobs before building the commits.
Since signed tags' signatures will not necessarily be valid (think
transformations after the export, or excluding revisions, changing
the history), there are 4 modes to handle them: abort (default),
ignore, warn and strip. The latter just turns the tags into
unsigned ones.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>