Since a0e4639 (filter-branch: fix ref rewriting with
--subdirectory-filter, 2008-08-12) git-filter-branch has done
nearest-ancestor rewriting when using a --subdirectory-filter.
However, that rewriting strategy is also a useful building block in
other tasks. For example, if you want to split out a subset of files
from your history, you would typically call
git filter-branch -- <refs> -- <files>
But this fails for all refs that do not point directly to a commit
that affects <files>, because their referenced commit will not be
rewritten and the ref remains untouched.
The code was already there for the --subdirectory-filter case, so just
introduce an option that enables it independently.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The improved error handling catches a bug in filter-branch when using
-d pointing to a path outside any git repository:
$ git filter-branch -d /tmp/foo master
fatal: Not a git repository (or any of the parent directories): .git
This error message comes from git for-each-ref in line 224. GIT_DIR is
set correctly by git-sh-setup (to the foo.git repository), but not
exported (yet).
Signed-off-by: Lars Noschinski <lars@public.noschinski.de>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
9273b56 (filter-branch: Fix fatal error on bare repositories, 2009-02-03)
fixed a missing check of return status from an underlying command in
git-filter-branch, but there still are places that do not check errors.
For example, the command does not pay attention to the exit status of the
command given by --commit-filter. It should abort in such a case.
This attempts to fix all the remaining places that fail to check errors.
In two places, I've had to break apart pipelines in order to check the
error code for the first stage of the pipeline, as discussed here:
http://kerneltrap.org/mailarchive/git/2009/1/28/4835614
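The pipeline problem can be illustrated with a minimal sketch. In POSIX shell the exit status of a pipeline is that of its last stage, so a failure in the first stage goes unnoticed. The `producer` function and the temporary file are hypothetical stand-ins, not the actual filter-branch code.

```shell
#!/bin/sh
# Hypothetical first stage: emits some output and then fails.
producer () {
	echo "partial output"
	return 1
}

# As a pipeline, only the status of the last stage (cat) is visible.
producer | cat >/dev/null
pipeline_status=$?

# Broken apart: run the first stage on its own, keep its output in a
# temporary file, and check its status before running the second stage.
tmp=$(mktemp)
if producer >"$tmp"
then
	split_status=0
	cat "$tmp" >/dev/null
else
	split_status=$?
fi
rm -f "$tmp"

echo "pipeline: $pipeline_status, split: $split_status"
```

The cost of splitting is an intermediate file, but the failure of the first stage becomes something the script can act on.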
Feedback on this patch was provided by Johannes Sixt, Johannes Schindelin
and Junio C Hamano. Thomas Rast helped with pipeline error handling.
Signed-off-by: Eric Kidd <git@randomhacks.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* js/filter-branch-submodule:
filter-branch: do not consider diverging submodules a 'dirty worktree'
filter-branch: Fix fatal error on bare repositories
When git filter-branch is run on a bare repository, it prints out a fatal
error message:
$ git filter-branch branch
Rewrite 476c4839280c219c2317376b661d9d95c1727fc3 (9/9)
WARNING: Ref 'refs/heads/branch' is unchanged
fatal: This operation must be run in a work tree
Note that this fatal error message doesn't prevent git filter-branch from
exiting successfully. (Why doesn't git filter-branch actually exit with an
error when a shell command fails? I'm not sure why it was designed this
way.)
This error message is caused by the following section of code at the end of
git-filter-branch.sh:
if [ "$(is_bare_repository)" = false ]; then
unset GIT_DIR GIT_WORK_TREE GIT_INDEX_FILE
test -z "$ORIG_GIT_DIR" || {
GIT_DIR="$ORIG_GIT_DIR" && export GIT_DIR
}
... elided ...
git read-tree -u -m HEAD
fi
The problem is the call to $(is_bare_repository), which is made before
GIT_DIR and GIT_WORK_TREE are restored. This call always returns "false",
even when we're running in a bare repository. But this means that we will
attempt to call 'git read-tree' even in a bare repository, which will fail
and print an error.
This patch modifies git-filter-branch.sh to restore the original
environment variables before trying to call is_bare_repository.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git_commit_non_empty_tree is added to the functions that can be run from
commit filters. Its effect is to skip commits that have a single parent and
leave the tree untouched; merge commits are always committed.
The option --prune-empty is added. It defaults the commit-filter to
'git_commit_non_empty_tree "$@"', and can be used with any other
combination of filters, except --commit-filter, which must use
'git_commit_non_empty_tree "$@"' where one puts 'git commit-tree "$@"'
usually to achieve the same result.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tr/filter-branch:
revision --simplify-merges: make it a no-op without pathspec
revision --simplify-merges: do not leave commits unprocessed
revision --simplify-merges: use decoration instead of commit->util field
Documentation: rev-list-options: move --simplify-merges documentation
filter-branch: use --simplify-merges
filter-branch: fix ref rewriting with --subdirectory-filter
filter-branch: Extend test to show rewriting bug
Topo-sort before --simplify-merges
revision traversal: show full history with merge simplification
revision.c: whitespace fix
The tag rewriting code used a 'sed' expression to substitute the new tag
name into the corresponding field of the annotated tag object. But this is
problematic if the tag name contains special characters. In particular,
if the tag name contained a slash, then the 'sed' expression had a syntax
error. We now protect against this by using 'printf' to assemble the
tag header.
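A small sketch of why the 'sed' approach breaks and why 'printf' does not. The tag name and the one-line header are made up for illustration; the real code deals with a complete tag object.

```shell
#!/bin/sh
# Hypothetical tag name containing a slash, which is also the delimiter
# of the s/// command below.
new_tag='release/1.0'
old_header='tag v1.0'

# Naive substitution: the slash inside $new_tag ends the replacement
# early, and sed rejects what is left over as invalid flags.
if printf '%s\n' "$old_header" | sed "s/^tag .*/tag $new_tag/" >/dev/null 2>&1
then
	sed_ok=yes
else
	sed_ok=no
fi

# Assembling the line with printf treats the name as plain data.
new_header=$(printf 'tag %s' "$new_tag")

echo "sed succeeded: $sed_ok"
echo "$new_header"
```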
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous ancestor discovery code failed on any refs that are
(pre-rewrite) ancestors of commits marked for rewriting. This means
that in a situation
A -- B(topic) -- C(master)
where B is dropped by --subdirectory-filter pruning, the 'topic' was
not moved up to A as intended, but left unrewritten because we asked
about 'git rev-list ^master topic', which does not return anything.
Instead, we use the straightforward
git rev-list -1 $ref -- $filter_subdir
to find the right ancestor. To justify this, note that the nearest
ancestor is unique: We use the output of
git rev-list --parents -- $filter_subdir
to rewrite commits in the first pass, before any ref rewriting. If B
is a non-merge commit, the only candidate is its parent. If it is a
merge, there are two cases:
- All sides of the merge bring the same subdirectory contents. Then
rev-list already pruned away the merge in favour of just one of its
parents, so there is only one candidate.
- Some merge sides, or the merge outcome, differ. Then the merge is
not pruned and can be rewritten directly.
So it is always safe to use rev-list -1.
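The difference between the two queries can be tried out in a throwaway repository. This is only a sketch: the file names, the identity and the helper `g` are made up, and the history is the A -- B(topic) -- C(master) situation from above, with B not touching the subdirectory.

```shell
#!/bin/sh
repo=$(mktemp -d)
g () { git -C "$repo" "$@"; }   # hypothetical helper: run git inside $repo

g init -q
g symbolic-ref HEAD refs/heads/master
g config user.name Example
g config user.email example@example.com

mkdir "$repo/sub"
echo one >"$repo/sub/file"
g add sub/file
g commit -q -m A
A=$(g rev-parse HEAD)

echo two >"$repo/other"         # B does not touch sub/
g add other
g commit -q -m B
g branch topic

echo three >"$repo/sub/file"
g commit -q -a -m C

# Old query: topic is an ancestor of master, so nothing is listed.
old=$(g rev-list ^master topic)

# New query: the newest commit reachable from topic that touches sub/.
nearest=$(g rev-list -1 topic -- sub)

echo "old query: '$old'"
echo "nearest ancestor is A: $nearest"
rm -rf "$repo"
```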
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This extends the --subdirectory-filter test in t7003 to demonstrate a
rewriting bug: when rewriting two refs A and B such that B is an
ancestor of A, it fails to rewrite B.
The underlying issue is that the rev-list invocation at
git-filter-branch.sh:332 more or less boils down to
git rev-list B --boundary ^A
which outputs nothing because B is an ancestor of A.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 46eb449c restricted git-filter-branch to non-bare repositories
unnecessarily; git-filter-branch can work on bare repositories just
fine.
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit cfabd6eee1. I had
implemented it without understanding what --full-history does. Consider
this history:
    C--M--N
   /  /  /
  A--B  /
   \   /
    D-/
where B and C modify a path, X, in the same way so that the result is
identical, and D does not modify it at all. With the path limiter X and
without --full-history this is simplified to
A--B
i.e. only one of the paths via B or C is chosen. I had assumed that
--full-history would keep both paths like this
    C--M
   /  /
  A--B
removing the path via D; but in fact it keeps the entire history.
Currently, git does not have the capability to simplify to this
intermediary case. However, the other extreme to keep the entire history
is not wanted either in usual cases. I think we can expect that histories
like the above are rare, and in the usual cases we want a simplified
history. So let's remove --full-history again.
(Concerning t7003, subsequent tests depend on what the test case sets up,
so we can't just back out the entire test case.)
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As a general principle, we should not use "git diff" to validate the
results of what the git command being tested has done. We would not
know if we are testing the command in question, or locating a bug in the
cute hack of "git diff --no-index".
Rather use test_cmp for that purpose.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* bd/tests:
Rename the test trash directory to contain spaces.
Fix tests breaking when checkout path contains shell metacharacters
Don't use the 'export NAME=value' in the test scripts.
lib-git-svn.sh: Fix quoting issues with paths containing shell metacharacters
test-lib.sh: Fix some missing path quoting
Use test_set_editor in t9001-send-email.sh
test-lib.sh: Add a test_set_editor function to safely set $VISUAL
git-send-email.perl: Handle shell metacharacters in $EDITOR properly
config.c: Escape backslashes in section names properly
git-rebase.sh: Fix --merge --abort failures when path contains whitespace
Conflicts:
t/t9115-git-svn-dcommit-funky-renames.sh
This fixes the remainder of the issues where the test script itself is at
fault for failing when the git checkout path contains whitespace or other
shell metacharacters.
The majority of git svn tests used the idiom
test_expect_success "title" "test script using $svnrepo"
These were changed to have the test script in single-quotes:
test_expect_success "title" 'test script using "$svnrepo"'
which unfortunately makes the patch appear larger than it really is.
One consequence of this change is that in the verbose test output the
value of $svnrepo (and in some cases other variables, too) is no
longer expanded, i.e. previously we saw
* expecting success:
test script using /path/to/git/t/trash/svnrepo
but now it is:
* expecting success:
test script using "$svnrepo"
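The effect of the two quoting styles can be reproduced without the test suite. In this sketch `run_script` is a stand-in for how a harness might evaluate the script text it is handed, and the path is a made-up example containing a space.

```shell
#!/bin/sh
# Hypothetical repository path containing a space.
svnrepo='/tmp/trash directory/svnrepo'

# Stand-in for a harness that evaluates the script text it receives.
run_script () {
	eval "$1"
}
count_args () {
	nargs=$#
}

# Double quotes: $svnrepo is expanded before the harness sees the
# script, so eval splits the path at the space.
run_script "count_args $svnrepo"
early=$nargs

# Single quotes: the harness receives the literal text "$svnrepo" and
# expands it, properly quoted, only when the script runs.
run_script 'count_args "$svnrepo"'
late=$nargs

echo "expanded early: $early argument(s), expanded late: $late argument(s)"
```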
Signed-off-by: Bryan Donlan <bdonlan@fushizen.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit d89c1df (filter-branch: don't use xargs -0, 2008-03-12) replaced a
'ls-files | xargs rm' pipeline by 'git clean'. 'git clean' however does
not recurse and remove directories by default.
Now, consider a tree-filter that renames a directory.
1. For the first commit everything works as expected
2. Then filter-branch checks out the files for the next commit. This
leaves the new directory behind because there is no real "branch
switching" involved that would notice that the directory can be
removed.
3. Then filter-branch invokes 'git clean' to remove exactly those
left-overs. But here it does not remove the directory.
4. The next tree-filter does not work as expected because there already
exists a directory with the new name.
Just add -d to 'git clean', so that empty directories are removed.
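The behaviour of 'git clean' with and without -d can be checked in a scratch repository. This is a sketch with made-up file names and a hypothetical helper `g`; it does not run filter-branch itself.

```shell
#!/bin/sh
repo=$(mktemp -d)
g () { git -C "$repo" "$@"; }   # hypothetical helper: run git inside $repo

g init -q
g config user.name Example
g config user.email example@example.com
echo x >"$repo/tracked"
g add tracked
g commit -q -m initial

# Simulate a directory left behind by an earlier tree-filter run.
mkdir "$repo/leftover"
echo y >"$repo/leftover/file"

# Without -d the untracked directory survives.
g clean -q -f
if test -d "$repo/leftover"; then without_d=kept; else without_d=removed; fi

# With -d it is removed.
g clean -q -d -f
if test -d "$repo/leftover"; then with_d=kept; else with_d=removed; fi

echo "without -d: $without_d, with -d: $with_d"
rm -rf "$repo"
```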
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test currently fails.
If b is a directory then 'mv a b' is not a plain "rename", but really a
"move", so we must also test that the directory does not exist with the
old name in the directory with the new name.
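The "move" semantics are easy to see outside of git. A minimal sketch using a temporary directory (the names a, b and file are placeholders):

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir "$work/a" "$work/b"
touch "$work/a/file"

# Because b already exists as a directory, a is moved into it
# instead of being renamed to b.
mv "$work/a" "$work/b"

if test -d "$work/b/a"
then
	moved_into=yes
else
	moved_into=no
fi
echo "a ended up inside b: $moved_into"
rm -rf "$work"
```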
There's also some cleanup in the corresponding "rename file" test to avoid
spurious shell syntax errors and "ambiguous ref" error from 'git show' (but
these should show up only if the test would fail anyway). Plus we also
test for the non-existence of the old file.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Add support for creating a new tag object and retaining the tag message,
author, and date when rewriting tags. The gpg signature, if one exists,
will be stripped.
This adds nearly proper tag name filtering to filter-branch. Proper tag
name filtering would include the ability to change the tagger, tag date,
tag message, and _not_ strip a gpg signature if the tag did not change.
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Specifying character ranges in tr differs between System V
and POSIX. In System V, brackets are required (e.g.,
'[A-Z]'), whereas in POSIX they are not.
We can mostly get around this by just using the bracket form
for both sets, as in:
tr '[A-Z]' '[a-z]'
in which case POSIX interprets this as "'[' becomes '['",
which is OK.
However, this doesn't work with multiple sequences, like:
# rot13
tr '[A-Z][a-z]' '[N-Z][A-M][n-z][a-m]'
where the POSIX version does not behave the same as the
System V version. In this case, we must simply enumerate the
sequence.
This patch fixes problematic uses of tr in git scripts and
test scripts in one of three ways:
- if a single sequence, make sure it uses brackets
- if multiple sequences, enumerate
- if extra brackets (e.g., tr '[A]' 'a'), eliminate
brackets
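As an illustration of the first two rules, here is a sketch with a single bracketed range and an enumerated rot13. The `rot13` function name is only for this example and is not taken from the patch.

```shell
#!/bin/sh
# Single sequence: the bracket form works under both conventions,
# because POSIX tr simply maps '[' to '[' and ']' to ']'.
lower=$(echo 'Hello [World]' | tr '[A-Z]' '[a-z]')

# Multiple sequences (rot13): enumerate both sets instead of using ranges.
rot13 () {
	tr 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz' \
	   'NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm'
}
encoded=$(echo 'Hello' | rot13)
decoded=$(echo "$encoded" | rot13)

echo "$lower"
echo "$encoded $decoded"
```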
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The subdirectory filter had a bug in noticing that the commit in question
did not have anything in the path-limited part of the tree. $commit:$path
does not name an empty tree when $path does not appear in $commit.
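The point about $commit:$path can be seen with 'git rev-parse'. The following is a sketch in a scratch repository with a made-up file name and a hypothetical helper `g`; it is not the code from the fix.

```shell
#!/bin/sh
repo=$(mktemp -d)
g () { git -C "$repo" "$@"; }   # hypothetical helper: run git inside $repo

g init -q
g config user.name Example
g config user.email example@example.com
echo x >"$repo/top"
g add top
g commit -q -m "no subdirectory here"

# HEAD:sub does not resolve to an empty tree; it does not resolve at
# all, so the failure has to be checked for explicitly.
if tree=$(g rev-parse --verify -q "HEAD:sub" 2>/dev/null)
then
	has_subdir=yes
else
	has_subdir=no
fi

echo "commit contains sub/: $has_subdir"
rm -rf "$repo"
```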
This should fix it. The additional test in t7003 is originally from Kevin
Ballard but with fixups.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The command used a very old fashioned construct to extract
filenames out of diff-index and ended up corrupting the output.
We can simply use --name-only and pipe into --stdin mode of
update-index. It's been like that for the past 2 years or so
since a94d994 (update-index: work with c-quoted name).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test passed for the wrong reason: If the script given to --msg-filter
fails, it is expected that git-filter-branch aborts. But the test forgot
to tell the branch name to rewrite, and so git-filter-branch failed due to
incorrect usage.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Earlier, "git filter-branch --<options> HEAD" would not update the
working tree after rewriting the branch. This commit fixes it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
With this function, a commit filter can leave out unwanted commits
(such as temporary commits). It does _not_ undo the changeset
corresponding to that commit, but it _skips_ the revision. IOW
no tree object is changed by this.
If you like to commit early and often, but want to filter out all
intermediate commits, marked by "@@@" in the commit message, you can
now do this with
git filter-branch --commit-filter '
if git cat-file commit $GIT_COMMIT | grep "@@@" > /dev/null;
then
skip_commit "$@";
else
git commit-tree "$@";
fi' newbranch
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the convenience functions to the top of git-filter-branch.sh, and
return from the script when the environment variable SOURCE_FUNCTIONS is
set.
By sourcing git-filter-branch with that variable set automatically, all
commit filters may access the convenience functions like "map".
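The layout described here (functions first, early return when sourced) can be sketched with a tiny stand-alone script. The `greet` function and the exact form of the check are assumptions for illustration, not the contents of git-filter-branch.sh.

```shell
#!/bin/sh
# Hypothetical script laid out as described: functions at the top, then
# an early return when SOURCE_FUNCTIONS is set, then the main body.
script=$(mktemp)
cat >"$script" <<'EOF'
greet () {
	echo "hello from a helper"
}
if test -n "$SOURCE_FUNCTIONS"
then
	return 0
fi
main_ran=yes
EOF

main_ran=no
SOURCE_FUNCTIONS=1
. "$script"
result=$(greet)
rm -f "$script"

echo "$result (main body ran: $main_ran)"
```

Sourcing gives the caller the functions without executing the rest of the script, which is what lets a filter reuse helpers such as "map".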
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We used to take the first non-option argument as the name for the new
branch. This syntax is not extensible to support rewriting more than just
HEAD.
Instead, we now have the following syntax:
git filter-branch [<filter options>...] [<rev-list options>]
All positive refs given in <rev-list options> are rewritten. Yes,
in-place. If a ref was changed, the original head is stored in
refs/original/$ref now, for your inspecting pleasure, in addition to the
reflogs (since it is easier to inspect "git show-ref | grep original" than
to inspect all the reflogs).
This commit also adds the --force option to remove .git-rewrite/ and all
refs from refs/original/ before filtering.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A common mistake is to provide a filter which fails unintentionally. For
example, this will stop in the middle:
git filter-branch --env-filter '
test "$GIT_COMMITTER_EMAIL" = xyz &&
export GIT_COMMITTER_EMAIL=abc' rewritten
When $GIT_COMMITTER_EMAIL is not "xyz", the test fails, and consequently
the whole filter has a non-zero exit status. However, as demonstrated
in this example, filter-branch would just stop, and the user would be
none the wiser.
Also, a failing msg-filter would not have been caught, as was the
case with one of the tests.
This patch fixes both issues, by paying attention to the exit status
of msg-filter, and by saying what failed before exiting.
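The exit status in question can be observed without running filter-branch. In this sketch the filter text is evaluated with plain `eval`, which is an assumption about how a filter gets run, and the e-mail address is a made-up value.

```shell
#!/bin/sh
GIT_COMMITTER_EMAIL=someone@example.com   # hypothetical value, not "xyz"

# The filter from the example: when the test fails, the && list as a
# whole returns non-zero, which counts as a failed filter.
fragile_filter='test "$GIT_COMMITTER_EMAIL" = xyz && GIT_COMMITTER_EMAIL=abc'
if eval "$fragile_filter"
then
	fragile_status=0
else
	fragile_status=$?
fi

# An if statement returns zero when no branch is taken.
robust_filter='if test "$GIT_COMMITTER_EMAIL" = xyz; then GIT_COMMITTER_EMAIL=abc; fi'
if eval "$robust_filter"
then
	robust_status=0
else
	robust_status=$?
fi

echo "fragile: $fragile_status, robust: $robust_status"
```

Writing the condition as an if statement is one way to keep such a filter from failing on commits it simply does not apply to.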
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the man page, there is an example which describes how to remove
single commits (although it keeps the changes which were not reverted
in the next non-removed commit). Better make sure that it works as
expected.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is based on Jeff King's example in
20070621130137.GB4487@coredump.intra.peff.net
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When two branches are merged that modify a subdirectory (possibly in
different intermediate steps) such that both end up identical, then
rev-list chooses only one branch. But when we filter history, we want to
keep both branches. Therefore, we must use --full-history.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With git-filter-branch --subdirectory-filter <subdirectory> you can
get at the history, as seen by a certain subdirectory. The history
of the rewritten branch will only contain commits that touched that
subdirectory, and the subdirectory will be rewritten to be the new
project root.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A subset of commits in a branch used to be specified by options (-k, -r)
as well as the branch tip itself (-s). It is more natural (for git users)
to specify revision ranges like 'master..next' instead. This makes it so.
If no range is specified it defaults to 'HEAD'.
As a consequence, the new name of the filtered branch must be the first
non-option argument. All remaining arguments are passed to 'git rev-list'
unmodified.
The tip of the branch that gets filtered is implied: It is the first
commit that git rev-list would print for the specified range.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The option '-k' says that the given commit and _all_ of its ancestors
are kept as-is.
However, if a to-be-rewritten commit branched from an ancestor of an
ancestor of a commit given with '-k', filter-branch would fail.
Example:
A - B
 \
  C
If filter-branch was called with '-k B -s C', it would actually keep
B (and A as its parent), but would rewrite C, and its parent.
Noticed by Johannes Sixt.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This script is derived from Pasky's cg-admin-rewritehist.
In fact, it _is_ the same script, minimally adapted to work without cogito.
It _should_ be able to perform the same tasks, even if only relying on
core-git programs.
All the work is Pasky's, just the adaptation is mine.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Hopefully-signed-off-by: Petr "cogito master" Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>