filter-branch suffers from a deluge of disguised dangers that disfigure
history rewrites (i.e. deviate from the deliberate changes). Many of
these problems are unobtrusive and can easily go undiscovered until the
new repository is in use. This can result in problems ranging from an
even messier history than what led folks to filter-branch in the first
place, to data loss or corruption. These issues cannot be backward
compatibly fixed, so add a warning to both filter-branch and its manpage
recommending that another tool (such as filter-repo) be used instead.
Also, update other manpages that referenced filter-branch. Several of
these needed updates even if we could continue recommending
filter-branch, either due to implying that something was unique to
filter-branch when it applied more generally to all history rewriting
tools (e.g. BFG, reposurgeon, fast-import, filter-repo), or because
something about filter-branch was used as an example despite other more
commonly known examples now existing. Reword these sections to fix
these issues and to avoid recommending filter-branch.
Finally, remove the section explaining BFG Repo Cleaner as an
alternative to filter-branch. I feel somewhat bad about this,
especially since I feel like I learned so much from BFG that I put to
good use in filter-repo (which is much more than I can say for
filter-branch), but keeping that section presented a few problems:
* In order to recommend that people quit using filter-branch, we need
to provide them a recomendation for something else to use that
can handle all the same types of rewrites. To my knowledge,
filter-repo is the only such tool. So it needs to be mentioned.
* I don't want to give conflicting recommendations to users
* If we recommend two tools, we shouldn't expect users to learn both
and pick which one to use; we should explain which problems one
can solve that the other can't or when one is much faster than
the other.
* BFG and filter-repo have similar performance
* All filtering types that BFG can do, filter-repo can also do. In
fact, filter-repo comes with a reimplementation of BFG named
bfg-ish which provides the same user-interface as BFG but with
several bugfixes and new features that are hard to implement in
BFG due to its technical underpinnings.
While I could still mention both tools, it seems like I would need to
provide some kind of comparison and I would ultimately just say that
filter-repo can do everything BFG can, so ultimately it seems that it
is just better to remove that section altogether.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Inspired by 21416f0a07 ("restore: fix typo in docs", 2019-08-03), I ran
"git grep -E '(\b[a-zA-Z]+) \1\b' -- Documentation/" to find other cases
where words were duplicated, e.g. "the the", and in most cases removed
one of the repeated words.
There were many false positives by this grep command, including
deliberate repeated words like "really really" or valid uses of "that
that" which I left alone, of course.
I also did not correct any of the legitimate, accidentally repeated
words in old RelNotes.
Signed-off-by: Mark Rushakoff <mark.rushakoff@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Automatic re-encoding of commit messages (and dropping of the encoding
header) hurts attempts to do reversible history rewrites (e.g. sha1sum
<-> sha256sum transitions, some subtree rewrites), and seems
inconsistent with the general principle followed elsewhere in
fast-export of requiring explicit user requests to modify the output
(e.g. --signed-tags=strip, --tag-of-filtered-object=rewrite). Add a
--reencode flag that the user can use to specify, and like other
fast-export flags, default it to 'abort'.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Knowing the original names (hashes) of commits can sometimes enable
post-filtering that would otherwise be difficult or impossible. In
particular, the desire to rewrite commit messages which refer to other
prior commits (on top of whatever other filtering is being done) is
very difficult without knowing the original names of each commit.
In addition, knowing the original names (hashes) of blobs can allow
filtering by blob-id without requiring re-hashing the content of the
blob, and is thus useful as a small optimization.
Once we add original ids for both commits and blobs, we may as well
add them for tags too for completeness. Perhaps someone will have a
use for them.
This commit teaches a new --show-original-ids option to fast-export
which will make it add a 'original-oid <hash>' line to blob, commits,
and tags. It also teaches fast-import to parse (and ignore) such
lines.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git filter-branch has a nifty feature allowing you to rewrite, e.g. just
the last 8 commits of a linear history
git filter-branch $OPTIONS HEAD~8..HEAD
If you try the same with git fast-export, you instead get a history of
only 8 commits, with HEAD~7 being rewritten into a root commit. There
are two alternatives:
1) Don't use the negative revision specification, and when you're
filtering the output to make modifications to the last 8 commits,
just be careful to not modify any earlier commits somehow.
2) First run 'git fast-export --export-marks=somefile HEAD~8', then
run 'git fast-export --import-marks=somefile HEAD~8..HEAD'.
Both are more error prone than I'd like (the first for obvious reasons;
with the second option I have sometimes accidentally included too many
revisions in the first command and then found that the corresponding
extra revisions were not exported by the second command and thus were
not modified as I expected). Also, both are poor from a performance
perspective.
Add a new --reference-excluded-parents option which will cause
fast-export to refer to commits outside the specified rev-list-args
range by their sha1sum. Such a stream will only be useful in a
repository which already contains the necessary commits (much like the
restriction imposed when using --no-data).
Note from Peff:
I think we might be able to do a little more optimization here. If
we're exporting HEAD^..HEAD and there's an object in HEAD^ which is
unchanged in HEAD, I think we'd still print it (because it would not
be marked SHOWN), but we could omit it (by walking the tree of the
boundary commits and marking them shown). I don't think it's a
blocker for what you're doing here, but just a possible future
optimization.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The standard for command documentation synopses appears to be:
[...] means optional
<...> means replaceable
[<...>] means both optional and replaceable
So fix a number of doc pages that use incorrect variations of the
above.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When formatted as a man page, 1st section header is always in upper
case even if we write it otherwise. Make all 1st section headers
uppercase to keep it close to the final output.
This does affect html since case is kept there, but I still think it's
a good idea to maintain a consistent style for 1st section headers.
Some sections perhaps should become second sections instead, where
case is kept, and for better organization. I will update if anyone has
suggestions about this.
While at there I also make some header more consistent (e.g. examples
vs example) and fix a couple minor things here and there.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Older versions of AsciiDoc would convert the "--" in
"--option" into an emdash. According to 565e135
(Documentation: quote double-dash for AsciiDoc, 2011-06-29),
this is fixed in AsciiDoc 8.3.0. According to bf17126, we
don't support anything older than 8.4.1 anyway, so we no
longer need to worry about quoting.
Even though this does not change the output at all, there
are a few good reasons to drop the quoting:
1. It makes the source prettier to read.
2. We don't quote consistently, which may be confusing when
reading the source.
3. Asciidoctor does not like the quoting, and renders a
literal backslash.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In particular, git-fast-import and -export link to each
other, and gitremote-helpers links to existing remote
helpers, and vice versa. Also link to fast-import from the
remote helper spec, as this is relevant for remote helpers
using the fast-import format.
Signed-off-by: Max Horn <max@quendi.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The original commit made mention of this option, but not why
one might want it or how they might use it. Let's try to be
a little more thorough, and also explain how to confirm that
the output really is anonymous.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes users want to report a bug they experience on
their repository, but they are not at liberty to share the
contents of the repository. It would be useful if they could
produce a repository that has a similar shape to its history
and tree, but without leaking any information. This
"anonymized" repository could then be shared with developers
(assuming it still replicates the original problem).
This patch implements an "--anonymize" option to
fast-export, which generates a stream that can recreate such
a repository. Producing a single stream makes it easy for
the caller to verify that they are not leaking any useful
information. You can get an overview of what will be shared
by running a command like:
git fast-export --anonymize --all |
perl -pe 's/\d+/X/g' |
sort -u |
less
which will show every unique line we generate, modulo any
numbers (each anonymized token is assigned a number, like
"User 0", and we replace it consistently in the output).
In addition to anonymizing, this produces test cases that
are relatively small (compared to the original repository)
and fast to generate (compared to using filter-branch, or
modifying the output of fast-export yourself). Here are
numbers for git.git:
$ time git fast-export --anonymize --all \
--tag-of-filtered-object=drop >output
real 0m2.883s
user 0m2.828s
sys 0m0.052s
$ gzip output
$ ls -lh output.gz | awk '{print $5}'
2.9M
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
So that we can convert the exported ref names.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 3.x tree has been out for a while now. The -2.6 repository name
survived the initial release [1], but kernel.org now only lists
'linux.git' (for aegl as well as torvalds) [2].
[1]: http://article.gmane.org/gmane.linux.kernel/1147422
On 2011-05-30 01:47:57 GMT, Linus Torvalds wrote:
> ... yes, that means that my git tree is still called
> "linux-2.6.git" on kernel.org.
[2]: http://git.kernel.org/cgit/
Signed-off-by: W. Trevor King <wking@tremily.us>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
White-spaces, missing braces, standardize --[no-]foo.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This issues a warning while stripping signatures from signed tags, which
allows us to use it as default behaviour for remote helpers which cannot
specify how to handle signed tags.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-export can fail because of some pruned-reference when importing a
mark file.
The problem happens in the following scenario:
$ git fast-export --export-marks=MARKS master
(rewrite master)
$ git prune
$ git fast-export --import-marks=MARKS master
This might fail if some references have been removed by prune
because some marks will refer to no longer existing commits.
git-fast-export will not need these objects anyway as they were no
longer reachable.
We still need to update last_numid so we don't change the mapping
between marks and objects for remote-helpers.
Unfortunately, the mark file should not be rewritten without lost marks
if no new objects has been exported, as we could lose track of the last
last_numid.
Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Reviewed-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In asciidoc 7, backticks like `foo` produced a typographic
effect, but did not otherwise affect the syntax. In asciidoc
8, backticks introduce an "inline literal" inside which markup
is not interpreted. To keep compatibility with existing
documents, asciidoc 8 has a "no-inline-literal" attribute to
keep the old behavior. We enabled this so that the
documentation could be built on either version.
It has been several years now, and asciidoc 7 is no longer
in wide use. We can now decide whether or not we want
inline literals on their own merits, which are:
1. The source is much easier to read when the literal
contains punctuation. You can use `master~1` instead
of `master{tilde}1`.
2. They are less error-prone. Because of point (1), we
tend to make mistakes and forget the extra layer of
quoting.
This patch removes the no-inline-literal attribute from the
Makefile and converts every use of backticks in the
documentation to an inline literal (they must be cleaned up,
or the example above would literally show "{tilde}" in the
output).
Problematic sites were found by grepping for '`.*[{\\]' and
examined and fixed manually. The results were then verified
by comparing the output of "html2text" on the set of
generated html pages. Doing so revealed that in addition to
making the source more readable, this patch fixes several
formatting bugs:
- HTML rendering used the ellipsis character instead of
literal "..." in code examples (like "git log A...B")
- some code examples used the right-arrow character
instead of '->' because they failed to quote
- api-config.txt did not quote tilde, and the resulting
HTML contained a bogus snippet like:
<tt><sub></tt> foo <tt></sub>bar</tt>
which caused some parsers to choke and omit whole
sections of the page.
- git-commit.txt confused ``foo`` (backticks inside a
literal) with ``foo'' (matched double-quotes)
- mentions of `A U Thor <author@example.com>` used to
erroneously auto-generate a mailto footnote for
author@example.com
- the description of --word-diff=plain incorrectly showed
the output as "[-removed-] and {added}", not "{+added+}".
- using "prime" notation like:
commit `C` and its replacement `C'`
confused asciidoc into thinking that everything between
the first backtick and the final apostrophe were meant
to be inside matched quotes
- asciidoc got confused by the escaping of some of our
asterisks. In particular,
`credential.\*` and `credential.<url>.\*`
properly escaped the asterisk in the first case, but
literally passed through the backslash in the second
case.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* sr/transport-helper-fix: (21 commits)
transport-helper: die early on encountering deleted refs
transport-helper: implement marks location as capability
transport-helper: Use capname for refspec capability too
transport-helper: change import semantics
transport-helper: update ref status after push with export
transport-helper: use the new done feature where possible
transport-helper: check status code of finish_command
transport-helper: factor out push_update_refs_status
fast-export: support done feature
fast-import: introduce 'done' command
git-remote-testgit: fix error handling
git-remote-testgit: only push for non-local repositories
remote-curl: accept empty line as terminator
remote-helpers: export GIT_DIR variable to helpers
git_remote_helpers: push all refs during a non-local export
transport-helper: don't feed bogus refs to export push
git-remote-testgit: import non-HEAD refs
t5800: document some non-functional parts of remote helpers
t5800: use skip_all instead of prereq
t5800: factor out some ref tests
...
If fast-export is being used to generate a fast-import stream that
will be used afterwards it is desirable to indicate the end of the
stream with the new 'done' command.
Add a flag that causes fast-export to end with 'done'.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The SYNOPSIS sections of most commands that span several lines already
use [verse] to retain line breaks. Most commands that don't span
several lines seem not to use [verse]. In the HTML output, [verse]
does not only preserve line breaks, but also makes the section
indented, which causes a slight inconsistency between commands that
use [verse] and those that don't. Use [verse] in all SYNOPSIS sections
for consistency.
Also remove the blank lines from git-fetch.txt and git-rebase.txt to
align with the other man pages. In the case of git-rebase.txt, which
already uses [verse], the blank line makes the [verse] not apply to
the last line, so removing the blank line also makes the formatting
within the document more consistent.
While at it, add single quotes to 'git cvsimport' for consistency with
other commands.
Signed-off-by: Martin von Zweigbergk <martin.von.zweigbergk@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The point of these sections is generally to:
1. Give credit where it is due.
2. Give the reader an idea of where to ask questions or
file bug reports.
But they don't do a good job of either case. For (1), they
are out of date and incomplete. A much more accurate answer
can be gotten through shortlog or blame. For (2), the
correct contact point is generally git@vger, and even if you
wanted to cc the contact point, the out-of-date and
incomplete fields mean you're likely sending to somebody
useless.
So let's drop the fields entirely from all manpages except
git(1) itself. We already point people to the mailing list
for bug reports there, and we can update the Authors section
to give credit to the major contributors and point to
shortlog and blame for more information.
Each page has a "This is part of git" footer, so people can
follow that to the main git manpage.
Use the {tilde} entity to get a literal tilde without fuss.
With \~, asciidoc 8.5.2 (and probably earlier versions) keeps the
backslash in the output.
Reported-by: Frédéric Brière <fbriere@fbriere.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This option adds symmetry with fast-import, enabling it to also work with
complete trees instead of just incremental changes. It works by issuing a
'deleteall' directive with each commit and then listing the full set of
files that make up that commit, rather than just showing the list of files
that have changed since the (first) parent commit. Note that this
functionality is automatically turned on when using --import-marks together
with path limiting in order to avoid dropping important but unchanged
files.
This functionality is desired when using hand-written filters along with
'fast-export | some-filter | fast-import' as it can be easier to write
<some-filter> in terms of complete trees than incremental changes.
We could avoid the need to add this option by simply always turning it on.
While the end result would be identical, it would slow things down slightly
by printing many more filenames per commit which goes somewhat against the
'fast' in 'fast-export'.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation was quite inconsistent when spelling 'git cmd' if it
only refers to the program, not to some specific invocation syntax:
both 'git-cmd' and 'git cmd' spellings exist.
The current trend goes towards dashless forms, and there is precedent
in 647ac70 (git-svn.txt: stop using dash-form of commands.,
2009-07-07) to actively eliminate the dashed variants.
Replace 'git-cmd' with 'git cmd' throughout, except where git-shell,
git-cvsserver, git-upload-pack, git-receive-pack, and
git-upload-archive are concerned, because those really live in the
$PATH.
When using git fast-export and git fast-import to rewrite the history
of a repository with large binary files, almost all of the time is
spent dealing with blobs. This is extremely inefficient if all we want
to do is rewrite the commits and tree structure. --no-data skips the
output of blobs and writes SHA-1s instead of marks, which provides a
massive speedup.
Signed-off-by: Geoffrey Irving <irving@naml.us>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When providing a list of paths to limit what is exported, the object that
a tag points to can be filtered out entirely. This new switch allows
the user to specify what should happen to the tag in such a case. The
default action, 'abort' will exit with an error message. With 'drop', the
tag will simply be omitted from the output. With 'rewrite', if the object
tagged was a commit, the tag will be modified to tag an alternate commit.
The alternate commit is determined by treating the original commit as the
"parent" of the tag and then using the parent rewriting algorithm of the
revision traversal machinery (related to the "--parents" option of "git
rev-list")
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When no tagger was found (old Git produced tags like this),
no "tagger" line is printed (but this is incompatible with the current
git fast-import).
Alternatively, you can pass the option --fake-missing-tagger, forcing
fast-export to fake a tagger
Unspecified Tagger <no-tagger>
with a tag date of the beginning of (Unix) time in the case of a missing
tagger, so that fast-import is still able to import the result.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve some minor language and format issues like hyphenation,
phrases, spacing, word order, comma, attributes.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although it does not matter for Git itself, tools that
export to systems that explicitly track copies and
renames can benefit from such information.
This patch makes fast-export output correct action
logs when -M or -C are enabled.
Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The names of git commands are not meant to be entered at the
commandline; they are just names. So we render them in italics,
as is usual for command names in manpages.
Using
doit () {
perl -e 'for (<>) { s/\`(git-[^\`.]*)\`/'\''\1'\''/g; print }'
}
for i in git*.txt config.txt diff*.txt blame*.txt fetch*.txt i18n.txt \
merge*.txt pretty*.txt pull*.txt rev*.txt urls*.txt
do
doit <"$i" >"$i+" && mv "$i+" "$i"
done
git diff
.
Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Following what appears to be the predominant style, format
names of commands and commandlines both as `teletype text`.
While we're at it, add articles ("a" and "the") in some
places, italicize the name of the command in the manual page
synopsis line, and add a comma or two where it seems appropriate.
Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the git-* commands are not installed in $(bindir), using
"git-command <parameters>" in examples in the documentation is
not a good idea. On the other hand, it is nice to be able to
refer to each command using one hyphenated word. (There is no
escaping it, anyway: man page names cannot have spaces in them.)
This patch retains the dash in naming an operation, command,
program, process, or action. Complete command lines that can
be entered at a shell (i.e., without options omitted) are
made to use the dashless form.
The changes consist only of replacing some spaces with hyphens
and vice versa. After a "s/ /-/g", the unpatched and patched
versions are identical.
Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds the --import-marks and --export-marks to fast-export. These import
and export the marks used to for all revisions exported in a similar fashion
to what fast-import does. The format is the same as fast-import, so you can
create a bidirectional importer / exporter by using the same marks file on
both sides.
Signed-off-by: Pieter de Bie <pdebie@ai.rug.nl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As the "git" man page describes the "git" command at the end-user
level, it seems better to move it to man section 1.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Between AsciiDoc 8.2.2 and 8.2.3, the following change was made to the stock
Asciidoc configuration:
@@ -149,7 +153,10 @@
# Inline macros.
# Backslash prefix required for escape processing.
# (?s) re flag for line spanning.
-(?su)[\\]?(?P<name>\w(\w|-)*?):(?P<target>\S*?)(\[(?P<attrlist>.*?)\])=
+
+# Explicit so they can be nested.
+(?su)[\\]?(?P<name>(http|https|ftp|file|mailto|callto|image|link)):(?P<target>\S*?)(\[(?P<attrlist>.*?)\])=
+
# Anchor: [[[id]]]. Bibliographic anchor.
(?su)[\\]?\[\[\[(?P<attrlist>[\w][\w-]*?)\]\]\]=anchor3
# Anchor: [[id,xreflabel]]
This default regex now matches explicit values, and unfortunately in this
case gitlink was being matched by just 'link', causing the wrong inline
macro template to be applied. By renaming the macro, we can avoid being
matched by the wrong regex.
Signed-off-by: Dan McGee <dpmcgee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The name 'verbatim' describes much better what this mode does with
signed tags. While at it, fix the documentation what it actually
does.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This program dumps (parts of) a git repository in the format that
fast-import understands.
For clarity's sake, it does not use the 'inline' method of specifying
blobs in the commits, but builds the blobs before building the commits.
Since signed tags' signatures will not necessarily be valid (think
transformations after the export, or excluding revisions, changing
the history), there are 4 modes to handle them: abort (default),
ignore, warn and strip. The latter just turns the tags into
unsigned ones.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>