"git prune" used to largely ignore broken refs when deciding which
objects are still being used, which could spread an existing small
damage and make it a larger one.
* jk/prune-with-corrupt-refs:
refs.c: drop curate_packed_refs
repack: turn on "ref paranoia" when doing a destructive repack
prune: turn on ref_paranoia flag
refs: introduce a "ref paranoia" flag
t5312: test object deletion code paths in a corrupted repository
The split-index mode introduced at v2.3.0-rc0~41 was broken in the
codepath to protect us against a broken reimplementation of Git
that writes an invalid index with duplicated index entries, etc.
* tg/fix-check-order-with-split-index:
read-cache: fix reading of split index
"git fetch" that fetches a commit using the allow-tip-sha1-in-want
extension could have failed to fetch all the requested refs.
* jk/fetch-pack:
fetch-pack: remove dead assignment to ref->new_sha1
fetch_refs_via_pack: free extra copy of refs
filter_ref: make a copy of extra "sought" entries
filter_ref: avoid overwriting ref->old_sha1 with garbage
An failure early in the "git clone" that started creating the
working tree and repository could have resulted in some directories
and files left without getting cleaned up.
* jk/cleanup-failed-clone:
clone: drop period from end of die_errno message
clone: initialize atexit cleanup handler earlier
Recommend format-patch and send-email for those who want to submit
patches to this project.
* jc/submitting-patches-mention-send-email:
SubmittingPatches: encourage users to use format-patch and send-email
"git log --graph --no-walk A B..." is a otcnflicting request that
asks nonsense; no-walk tells us show discrete points in the
history, while graph asks to draw connections between these
discrete points. Forbid the combination.
* dj/log-graph-with-no-walk:
revision: forbid combining --graph and --no-walk
"git rev-list --bisect --first-parent" does not work (yet) and can
even cause SEGV; forbid it. "git log --bisect --first-parent"
would not be useful until "git bisect --first-parent" materializes,
so it is also forbidden for now.
* kd/rev-list-bisect-first-parent:
rev-list: refuse --first-parent combined with --bisect
Even though "git grep --quiet" is run merely to ask for the exit
status, we spawned the pager regardless. Stop doing that.
* ws/grep-quiet-no-pager:
grep: fix "--quiet" overwriting current output
The prompt script (in contrib/) did not show the untracked sign
when working in a subdirectory without any untracked files.
* ct/prompt-untracked-fix:
git prompt: use toplevel to find untracked files
* 'master' of git://ozlabs.org/~paulus/gitk:
gitk: Update .po files
gitk: l10n: Add Catalan translation
gitk: Fix typo in Russian translation
gitk: Remove tcl-format flag from a message that shouldn't have it
gitk: Pass --invert-grep option down to "git log"
gitk: Synchronize config file writes
gitk: Report errors in saving config file
gitk: Only write changed configuration variables
gitk: Enable mouse horizontal scrolling in diff pane
gitk: Default wrcomcmd to use --pretty=email
The code that reads from the ctags file in the completion script
(in contrib/) did not spell ${param/pattern/string} substitution
correctly, which happened to work with bash but not with zsh.
* js/completion-ctags-pattern-substitution-fix:
contrib/completion: escape the forward slash in __git_match_ctag
Restructure "git push" codepath to make it easier to add new
configuration bits and then add push.followTags configuration that
turns --follow-tags option on by default.
* jk/push-config:
push: allow --follow-tags to be set by config push.followTags
cmd_push: pass "flags" pointer to config callback
cmd_push: set "atomic" bit directly
git_push_config: drop cargo-culted wt_status pointer
Test fixes.
* jk/test-annoyances:
t5551: make EXPENSIVE test cheaper
t5541: move run_with_cmdline_limit to test-lib.sh
t: pass GIT_TRACE through Apache
t: redirect stderr GIT_TRACE to descriptor 4
t: translate SIGINT to an exit
The transfer.hiderefs support did not quite work for smart-http
transport.
* jk/smart-http-hide-refs:
upload-pack: do not check NULL return of lookup_unknown_object
upload-pack: fix transfer.hiderefs over smart-http
"git tag -h" used to show the "--column" and "--sort" options
that are about listing in a wrong section.
* jk/tag-h-column-is-a-listing-option:
tag: fix some mis-organized options in "-h" listing
"git log --decorate" did not reset colors correctly around the
branch names.
* jc/decorate-leaky-separator-color:
log --decorate: do not leak "commit" color into the next item
Documentation/config.txt: simplify boolean description in the syntax section
Documentation/config.txt: describe 'color' value type in the "Values" section
Documentation/config.txt: have a separate "Values" section
Documentation/config.txt: describe the structure first and then meaning
Documentation/config.txt: explain multi-valued variables once
Documentation/config.txt: avoid unnecessary negation
"git -C '' subcmd" refused to work in the current directory, unlike
"cd ''" which silently behaves as a no-op.
* kn/git-cd-to-empty:
git: treat "git -C '<path>'" as a no-op when <path> is empty
"git imap-send" learned to optionally talk with an IMAP server via
libcURL; because there is no other option when Git is built with
NO_OPENSSL option, use that codepath by default under such
configuration.
* km/imap-send-libcurl-options:
imap-send: use cURL automatically when NO_OPENSSL defined
Workarounds for certain build of GPG that triggered false breakage
in a test.
* mg/verify-commit:
t7510: do not fail when gpg warns about insecure memory
"git rebase -i" recently started to include the number of
commits in the insn sheet to be processed, but on a platform
that prepends leading whitespaces to "wc -l" output, the numbers
are shown with extra whitespaces that aren't necessary.
* es/rebase-i-count-todo:
rebase-interactive: re-word "item count" comment
rebase-interactive: suppress whitespace preceding item count
We did not parse username followed by literal IPv6 address in SSH
transport URLs, e.g. ssh://user@[2001:db8::1]:22/repo.git
correctly.
* tb/connect-ipv6-parse-fix:
t5500: show user name and host in diag-url
t5601: add more test cases for IPV6
connect.c: allow ssh://user@[2001:db8::1]/repo.git
The split index extension uses ewah bitmaps to mark index entries as
deleted, instead of removing them from the index directly. This can
result in an on-disk index, in which entries of stage #0 and higher
stages appear, which are removed later when the index bases are merged.
15999d0 read_index_from(): catch out of order entries when reading an
index file introduces a check which checks if the entries are in order
after each index entry is read in do_read_index. This check may however
fail when a split index is read.
Fix this by moving checking the index after we know there is no split
index or after the split index bases are successfully merged instead.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Output from "git log --decorate" mentions HEAD when it points at a
tip of an branch differently from a detached HEAD.
This is a potentially backward-incompatible change.
* mg/log-decorate-HEAD:
log: decorate HEAD with branch name
"git log --decorate" did not reset colors correctly around the
branch names.
* jc/decorate-leaky-separator-color:
log --decorate: do not leak "commit" color into the next item
Documentation/config.txt: simplify boolean description in the syntax section
Documentation/config.txt: describe 'color' value type in the "Values" section
Documentation/config.txt: have a separate "Values" section
Documentation/config.txt: describe the structure first and then meaning
Documentation/config.txt: explain multi-valued variables once
Documentation/config.txt: avoid unnecessary negation
Workarounds for certain build of GPG that triggered false breakage
in a test.
* mg/verify-commit:
t7510: do not fail when gpg warns about insecure memory
"git imap-send" learned to optionally talk with an IMAP server via
libcURL; because there is no other option when Git is built with
NO_OPENSSL option, use that codepath by default under such
configuration.
* km/imap-send-libcurl-options:
imap-send: use cURL automatically when NO_OPENSSL defined
We now detect number of CPUs on older BSD-derived systems.
* km/bsd-sysctl:
thread-utils.c: detect CPU count on older BSD-like systems
configure: support HAVE_BSD_SYSCTL option
Portability fixes and workarounds for shell scripts have been added
to help BSD-derived systems.
* km/bsd-shells:
t5528: do not fail with FreeBSD shell
help.c: use SHELL_PATH instead of hard-coded "/bin/sh"
git-compat-util.h: move SHELL_PATH default into header
git-instaweb: use @SHELL_PATH@ instead of /bin/sh
git-instaweb: allow running in a working tree subdirectory
Code in "git daemon" to parse out and hold hostnames used in
request interpolation has been simplified.
* rs/daemon-hostname-in-strbuf:
daemon: deglobalize hostname information
daemon: use strbuf for hostname info
"git branch" on a detached HEAD always said "(detached from xyz)",
even when "git status" would report "detached at xyz". The HEAD is
actually at xyz and haven't been moved since it was detached in
such a case, but the user cannot read what the current value of
HEAD is when "detached from" is used.
* mg/detached-head-report:
branch: name detached HEAD analogous to status
wt-status: refactor detached HEAD analysis
"git -C '' subcmd" refused to work in the current directory, unlike
"cd ''" which silently behaves as a no-op.
* kn/git-cd-to-empty:
git: treat "git -C '<path>'" as a no-op when <path> is empty
The versionsort.prerelease configuration variable can be used to
specify that v1.0-pre1 comes before v1.0.
* nd/versioncmp-prereleases:
config.txt: update versioncmp.prereleaseSuffix
versionsort: support reorder prerelease suffixes
When we delete a ref, we have to rewrite the entire
packed-refs file. We take this opportunity to "curate" the
packed-refs file and drop any entries that are crufty or
broken.
Dropping broken entries (e.g., with bogus names, or ones
that point to missing objects) is actively a bad idea, as it
means that we lose any notion that the data was there in the
first place. Aside from the general hackiness that we might
lose any information about ref "foo" while deleting an
unrelated ref "bar", this may seriously hamper any attempts
by the user at recovering from the corruption in "foo".
They will lose the sha1 and name of "foo"; the exact pointer
may still be useful even if they recover missing objects
from a different copy of the repository. But worse, once the
ref is gone, there is no trace of the corruption. A
follow-up "git prune" may delete objects, even though it
would otherwise bail when seeing corruption.
We could just drop the "broken" bits from
curate_packed_refs, and continue to drop the "crufty" bits:
refs whose loose counterpart exists in the filesystem. This
is not wrong to do, and it does have the advantage that we
may write out a slightly smaller packed-refs file. But it
has two disadvantages:
1. It is a potential source of races or mistakes with
respect to these refs that are otherwise unrelated to
the operation. To my knowledge, there aren't any active
problems in this area, but it seems like an unnecessary
risk.
2. We have to spend time looking up the matching loose
refs for every item in the packed-refs file. If you
have a large number of packed refs that do not change,
that outweighs the benefit from writing out a smaller
packed-refs file (it doesn't get smaller, and you do a
bunch of directory traversal to find that out).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are repacking with "-ad", we will drop any unreachable
objects. Likewise, using "-Ad --unpack-unreachable=<time>"
will drop any old, unreachable objects. In these cases, we
want to make sure the reachability we compute with "--all"
is complete. We can do this by passing GIT_REF_PARANOIA=1 in
the environment to pack-objects.
Note that "-Ad" is safe already, because it only loosens
unreachable objects. It is up to "git prune" to avoid
deleting them.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prune should know about broken objects at the tips of refs,
so that we can feed them to our traversal rather than
ignoring them. It's better for us to abort the operation on
the broken object than it is to start deleting objects with
an incomplete view of the reachability namespace.
Note that for missing objects, aborting is the best we can
do. For a badly-named ref, we technically could use its sha1
as a reachability tip. However, the iteration code just
feeds us a null sha1, so there would be a reasonable amount
of code involved to pass down our wishes. It's not really
worth trying to do better, because this is a case that
should happen extremely rarely, and the message we provide:
fatal: unable to parse object: refs/heads/bogus:name
is probably enough to point the user in the right direction.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most operations that iterate over refs are happy to ignore
broken cruft. However, some operations should be performed
with knowledge of these broken refs, because it is better
for the operation to choke on a missing object than it is to
silently pretend that the ref did not exist (e.g., if we are
computing the set of reachable tips in order to prune
objects).
These processes could just call for_each_rawref, except that
ref iteration is often hidden behind other interfaces. For
instance, for a destructive "repack -ad", we would have to
inform "pack-objects" that we are destructive, and then it
would in turn have to tell the revision code that our
"--all" should include broken refs.
It's much simpler to just set a global for "dangerous"
operations that includes broken refs in all iterations.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are doing a destructive operation like "git prune",
we want to be extra careful that the set of reachable tips
we compute is valid. If there is any corruption or oddity,
we are better off aborting the operation and letting the
user figure things out rather than plowing ahead and
possibly deleting some data that cannot be recovered.
The tests here include:
1. Pruning objects mentioned only be refs with invalid
names. This used to abort prior to d0f810f (refs.c:
allow listing and deleting badly named refs,
2014-09-03), but since then we silently ignore the tip.
Likewise, we test repacking that can drop objects
(either "-ad", which drops anything unreachable,
or "-Ad --unpack-unreachable=<time>", which tries to
optimize out a loose object write that would be
directly pruned).
2. Pruning objects when some refs point to missing
objects. We don't know whether any dangling objects
would have been reachable from the missing objects. We
are better to keep them around, as they are better than
nothing for helping the user recover history.
3. Packed refs that point to missing objects can sometimes
be dropped. By itself, this is more of an annoyance
(you do not have the object anyway; even if you can
recover it from elsewhere, all you are losing is a
placeholder for your state at the time of corruption).
But coupled with (2), if we drop the ref and then go
on to prune, we may lose unrecoverable objects.
Note that we use test_might_fail for some of the operations.
In some cases, it would be appropriate to abort the
operation, and in others, it might be acceptable to continue
but taking the information into account. The tests don't
care either way, and check only for data loss.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>