The git-repack command always passes `--honor-pack-keep`
to pack-objects. This has traditionally been a good thing,
as we do not want to duplicate those objects in a new pack,
and we are not going to delete the old pack.
However, when bitmaps are in use, it is important for a full
repack to include all reachable objects, even if they may be
duplicated in a .keep pack. Otherwise, we cannot generate
the bitmaps, as the on-disk format requires the set of
objects in the pack to be fully closed.
Even if the repository does not generally have .keep files,
a simultaneous push could cause a race condition in which a
.keep file exists at the moment of a repack. The repack may
try to include those objects in one of two situations:
1. The pushed .keep pack contains objects that were
already in the repository (e.g., blobs due to a revert of
an old commit).
2. Receive-pack updates the refs, making the objects
reachable, but before it removes the .keep file, the
repack runs.
In either case, we may prefer to duplicate some objects in
the new, full pack, and let the next repack (after the .keep
file is cleaned up) take care of removing them.
This patch introduces both a command-line and config option
to disable the `--honor-pack-keep` option. By default, it
is triggered when pack.writeBitmaps (or `--write-bitmap-index`
is turned on), but specifying it explicitly can override the
behavior (e.g., in cases where you prefer .keep files to
bitmaps, but only when they are present).
Note that this option just disables the pack-objects
behavior. We still leave packs with a .keep in place, as we
do not necessarily know that we have duplicated all of their
objects.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Borrow the bitmap index into packfiles from JGit to speed up
enumeration of objects involved in a commit range without having to
fully traverse the history.
* jk/pack-bitmap: (26 commits)
ewah: unconditionally ntohll ewah data
ewah: support platforms that require aligned reads
read-cache: use get_be32 instead of hand-rolled ntoh_l
block-sha1: factor out get_be and put_be wrappers
do not discard revindex when re-preparing packfiles
pack-bitmap: implement optional name_hash cache
t/perf: add tests for pack bitmaps
t: add basic bitmap functionality tests
count-objects: recognize .bitmap in garbage-checking
repack: consider bitmaps when performing repacks
repack: handle optional files created by pack-objects
repack: turn exts array into array-of-struct
repack: stop using magic number for ARRAY_SIZE(exts)
pack-objects: implement bitmap writing
rev-list: add bitmap mode to speed up object lists
pack-objects: use bitmaps when packing objects
pack-objects: split add_object_entry
pack-bitmap: add support for bitmap indexes
documentation: add documentation for the bitmap format
ewah: compressed bitmap implementation
...
Since `pack-objects` will write a `.bitmap` file next to the `.pack` and
`.idx` files, this commit teaches `git-repack` to consider the new
bitmap indexes (if they exist) when performing repack operations.
This implies moving old bitmap indexes out of the way if we are
repacking a repository that already has them, and moving the newly
generated bitmap indexes into the `objects/pack` directory, next to
their corresponding packfiles.
Since `git repack` is now capable of handling these `.bitmap` files,
a normal `git gc` run on a repository that has `pack.writebitmaps` set
to true in its config file will generate bitmap indexes as part of the
garbage collection process.
Alternatively, `git repack` can be called with the `-b` switch to
explicitly generate bitmap indexes if you are experimenting
and don't want them on all the time.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This updates the documentation regarding the changes introduced
by a1bbc6c01 (2013-09-15, repack: rewrite the shell script in C).
Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The default output from "fsck" is often overwhelmed by informational
message on dangling objects, especially if you do not repack often, and a
real error can easily be buried.
Add "--no-dangling" option to omit them, and update the user manual to
demonstrate its use.
Based on a patch by Clemens Buchacher, but reverted the part to change
the default to --no-dangling, which is unsuitable for the first patch.
The usual three-step procedure to break the backward compatibility over
time needs to happen on top of this, if we were to go in that direction.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The SYNOPSIS sections of most commands that span several lines already
use [verse] to retain line breaks. Most commands that don't span
several lines seem not to use [verse]. In the HTML output, [verse]
does not only preserve line breaks, but also makes the section
indented, which causes a slight inconsistency between commands that
use [verse] and those that don't. Use [verse] in all SYNOPSIS sections
for consistency.
Also remove the blank lines from git-fetch.txt and git-rebase.txt to
align with the other man pages. In the case of git-rebase.txt, which
already uses [verse], the blank line makes the [verse] not apply to
the last line, so removing the blank line also makes the formatting
within the document more consistent.
While at it, add single quotes to 'git cvsimport' for consistency with
other commands.
Signed-off-by: Martin von Zweigbergk <martin.von.zweigbergk@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The point of these sections is generally to:
1. Give credit where it is due.
2. Give the reader an idea of where to ask questions or
file bug reports.
But they don't do a good job of either case. For (1), they
are out of date and incomplete. A much more accurate answer
can be gotten through shortlog or blame. For (2), the
correct contact point is generally git@vger, and even if you
wanted to cc the contact point, the out-of-date and
incomplete fields mean you're likely sending to somebody
useless.
So let's drop the fields entirely from all manpages except
git(1) itself. We already point people to the mailing list
for bug reports there, and we can update the Authors section
to give credit to the major contributors and point to
shortlog and blame for more information.
Each page has a "This is part of git" footer, so people can
follow that to the main git manpage.
In 479b56ba ('make "repack -f" imply "pack-objects --no-reuse-object"'),
git repack -f was changed to include recompressing all objects on the
zlib level on the assumption that if the user wants to spend that much
time already, some more time won't hurt (and recompressing is useful if
the user changed the zlib compression level).
However, "some more time" can be quite long with very big repositories,
so some users are going to appreciate being able to choose. If we are
going to give them the choice, --no-reuse-object will probably be
interesting a lot less frequently than --no-reuse-delta. Hence, this
reverts -f to the old behaviour (--no-reuse-delta) and adds a new -F
option that replaces the current -f.
Measurements taken using this patch on a current clone of git.git
indicate a 17% decrease in time being made available to users:
git repack -Adf 34.84s user 0.56s system 145% cpu 24.388 total
git repack -AdF 38.79s user 0.56s system 133% cpu 29.394 total
Signed-off-by: Jan Krüger <jk@jk.gs>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This default for repack.UseDeltaBaseOffset has been "true" since
Git v1.6.0.
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The value passed to --max-pack-size used to count in MiB which was
inconsistent with the corresponding configuration variable as well as
other command arguments which are defined to count in bytes with an
optional unit suffix. This brings --max-pack-size in line with the
rest of Git.
Also, in order not to cause havoc with people used to the previous
megabyte scale, and because this is a sane thing to do anyway, a
minimum size of 1 MiB is enforced to avoid an explosion of pack files.
Adjust and extend test suite accordingly.
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation was quite inconsistent when spelling 'git cmd' if it
only refers to the program, not to some specific invocation syntax:
both 'git-cmd' and 'git cmd' spellings exist.
The current trend goes towards dashless forms, and there is precedent
in 647ac70 (git-svn.txt: stop using dash-form of commands.,
2009-07-07) to actively eliminate the dashed variants.
Replace 'git-cmd' with 'git cmd' throughout, except where git-shell,
git-cvsserver, git-upload-pack, git-receive-pack, and
git-upload-archive are concerned, because those really live in the
$PATH.
The current text makes some users feel uneasy, worrying whether
'-a' could lead to corrupt repositories. Clarify that '-a'
may lead to performance issues only for dumb protocols.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Helped-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The -A option calls pack-objects with the --unpack-unreachable option so
that the unreachable objects in local packs are left in the local object
store loose. But if the -d option to repack was _not_ used, then these
unpacked loose objects are redundant and unnecessary.
Update tests in t7701.
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The names of git commands are not meant to be entered at the
commandline; they are just names. So we render them in italics,
as is usual for command names in manpages.
Using
doit () {
perl -e 'for (<>) { s/\`(git-[^\`.]*)\`/'\''\1'\''/g; print }'
}
for i in git*.txt config.txt diff*.txt blame*.txt fetch*.txt i18n.txt \
merge*.txt pretty*.txt pull*.txt rev*.txt urls*.txt
do
doit <"$i" >"$i+" && mv "$i+" "$i"
done
git diff
.
Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Following what appears to be the predominant style, format
names of commands and commandlines both as `teletype text`.
While we're at it, add articles ("a" and "the") in some
places, italicize the name of the command in the manual page
synopsis line, and add a comma or two where it seems appropriate.
Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the git-* commands are not installed in $(bindir), using
"git-command <parameters>" in examples in the documentation is
not a good idea. On the other hand, it is nice to be able to
refer to each command using one hyphenated word. (There is no
escaping it, anyway: man page names cannot have spaces in them.)
This patch retains the dash in naming an operation, command,
program, process, or action. Complete command lines that can
be entered at a shell (i.e., without options omitted) are
made to use the dashless form.
The changes consist only of replacing some spaces with hyphens
and vice versa. After a "s/ /-/g", the unpatched and patched
versions are identical.
Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change leading spaces to tabs to match the rest of the file.
Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The OPTIONS section of a documentation file contains a list
of the options a git command accepts.
Currently there are several variants to describe the case that
different options (almost) do the same in the OPTIONS section.
Some are:
-f, --foo::
-f|--foo::
-f | --foo::
But AsciiDoc has the special form:
-f::
--foo::
This patch applies this form to the documentation of the whole git suite,
and removes useless em-dash prevention, so \--foo becomes --foo.
Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As the "git" man page describes the "git" command at the end-user
level, it seems better to move it to man section 1.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* bc/repack:
Documentation/git-repack.txt: document new -A behaviour
let pack-objects do the writing of unreachable objects as loose objects
add a force_object_loose() function
builtin-gc.c: deprecate --prune, it now really has no effect
git-gc: always use -A when manually repacking
repack: modify behavior of -A option to leave unreferenced objects unpacked
Conflicts:
builtin-pack-objects.c
While repacking a local repository a coworker thought the -n option
was necessary to git-repack to keep it from updating some unknown
file on the central server we all share. Explaining further what
the option is (not) doing helps to make it clear the option does
not impact any remote repositories the user may have configured.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add paragraph for the -A option, and describe the new behaviour
that makes unreachable objects loose.
Signed-off-by: Chris Frey <cdfrey@foursquare.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Between AsciiDoc 8.2.2 and 8.2.3, the following change was made to the stock
Asciidoc configuration:
@@ -149,7 +153,10 @@
# Inline macros.
# Backslash prefix required for escape processing.
# (?s) re flag for line spanning.
-(?su)[\\]?(?P<name>\w(\w|-)*?):(?P<target>\S*?)(\[(?P<attrlist>.*?)\])=
+
+# Explicit so they can be nested.
+(?su)[\\]?(?P<name>(http|https|ftp|file|mailto|callto|image|link)):(?P<target>\S*?)(\[(?P<attrlist>.*?)\])=
+
# Anchor: [[[id]]]. Bibliographic anchor.
(?su)[\\]?\[\[\[(?P<attrlist>[\w][\w-]*?)\]\]\]=anchor3
# Anchor: [[id,xreflabel]]
This default regex now matches explicit values, and unfortunately in this
case gitlink was being matched by just 'link', causing the wrong inline
macro template to be applied. By renaming the macro, we can avoid being
matched by the wrong regex.
Signed-off-by: Dan McGee <dpmcgee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some minor enhancements to the git-repack manual page.
Signed-off-by: Sam Vilain <sam.vilain@catalyst.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time. There are a few files that need
to have trailing whitespaces (most notably, test vectors). The results
still passes the test, and build result in Documentation/ area is unchanged.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add --max-pack-size parsing and usage messages.
Upgrade git-repack.sh to handle multiple packfile names,
and build packfiles in GIT_OBJECT_DIRECTORY not GIT_DIR.
Update documentation.
Signed-off-by: Dana L. How <danahow@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* np/pack:
add the capability for index-pack to read from a stream
index-pack: compare only the first 20-bytes of the key.
git-repack: repo.usedeltabaseoffset
pack-objects: document --delta-base-offset option
allow delta data reuse even if base object is a preferred base
zap a debug remnant
let the GIT native protocol use offsets to delta base when possible
make pack data reuse compatible with both delta types
make git-pack-objects able to create deltas with offset to base
teach git-index-pack about deltas with offset to base
teach git-unpack-objects about deltas with offset to base
introduce delta objects with offset to base
When configuration variable `repack.UseDeltaBaseOffset` is set
for the repository, the command passes `--delta-base-offset`
option to `git-pack-objects`; this typically results in slightly
smaller packs, but the generated packs are incompatible with
versions of git older than (and including) v1.4.3.
We will make it default to true sometime in the future, but not
for a while.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Currently, you actually have to read the source to find out the
default values. While at it, fix two typos and suggest that these
options actually take a parameter in git-pack-objects.txt.
Signed-off-by: Dennis Stosberg <dennis@stosberg.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Move incorrect asciidoc level 2 titles back to level 1.
Show output of git-name-rev in man page example.
Reword sentences that begin with a period (.) in asciidoc
numbered lists to work around conversion to man page bug.
Mention that git-repack now calls git-prune-packed
when the -d option is passed to it.
[imap] section headers in the config file example need to be
contained in a literal block. imap.pass is the proper config
file variable to use, not imap.password.
Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>
Signed-off-by: Junio C Hamano <junkio@cox.net>
A new flag -q makes underlying pack-objects less chatty.
A new flag -f forces delta to be recomputed from scratch.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds documentation for the -l and -n options to git-repack.
Signed-off-by: Nikolai Weibull <nikolai@bitwi.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The fixes focuses on improving the HTML output. Most noteworthy:
- Fix the Makefile to also make various *.html files depend on
included files.
- Consistently use 'NOTE: ...' instead of '[ ... ]' for additional
info.
- Fix ending '::' for description lists in OPTION section etc.
- Fix paragraphs in description lists ending up as preformated text.
- Always use listingblocks (preformatted text wrapped in lines with -----)
for examples that span empty lines, so they are put in only one HTML
block.
- Use '1.' instead of '(1)' for numbered lists.
- Fix linking to other GIT docs.
- git-rev-list.txt: put option descriptions in an OPTION section.
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The replacement was performed automatically by these commands:
perl -pi -e 's/link:(git.+)\.html\[\1\]/gitlink:$1\[1\]/g' \
README Documentation/*.txt
perl -pi -e 's/link:git\.html\[git\]/gitlink:git\[7\]/g' \
README Documentation/*.txt
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Junio C Hamano <junkio@cox.net>