Commit Graph

102 Commits

Author SHA1 Message Date
James McCoy
a5a4b3ff4d filter-branch: remove multi-line headers in msg filter
df062010 (filter-branch: avoid passing commit message through sed)
introduced a regression when filtering commits with multi-line headers,
if the header contains a blank line.  An example of this is a gpg-signed
commit:

  $ git cat-file commit signed-commit
  tree 3d4038e029712da9fc59a72afbfcc90418451630
  parent 110eac945dc1713b27bdf49e74e5805db66971f0
  author A U Thor <author@example.com> 1112912413 -0700
  committer C O Mitter <committer@example.com> 1112912413 -0700
  gpgsig -----BEGIN PGP SIGNATURE-----
   Version: GnuPG v1

   iEYEABECAAYFAlYXADwACgkQE7b1Hs3eQw23CACgldB/InRyDgQwyiFyMMm3zFpj
   pUsAnA+f3aMUsd9mNroloSmlOgL6jIMO
   =0Hgm
   -----END PGP SIGNATURE-----

  Adding gpg

As a consequence, "filter-branch --msg-filter cat" (which should leave the
commit message unchanged) spills the signature (after the internal blank
line) into the original commit message.

The reason is that although the signature is indented, making the line a
whitespace only line, the "read" call is splitting the line based on
the shell's IFS, which defaults to <space><tab><newline>.  The leading
space is consumed and $header_line is empty, causing the "skip header
lines" loop to exit.

The rest of the commit object is then re-used as the rewritten commit
message, causing the new message to include the signature of the
original commit.

Set IFS to an empty string for the "read" call, thus disabling the word
splitting, which causes $header_line to be set to the non-empty value ' '.
This allows the loop to fully consume the header lines before
emitting the original, intact commit message.

[jc: this is literally based on MJG's suggestion]

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: James McCoy <vega.james@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-12 11:23:19 -07:00
Jeff King
df0620108b filter-branch: avoid passing commit message through sed
On some systems (like OS X), if sed encounters input without
a trailing newline, it will silently add it. As a result,
"git filter-branch" on such systems may silently rewrite
commit messages that omit a trailing newline. Even though
this is not something we generate ourselves with "git
commit", it's better for filter-branch to preserve the
original data as closely as possible.

We're using sed here only to strip the header fields from
the commit object. We can accomplish the same thing with a
shell loop. Since shell "read" calls are slow (usually one
syscall per byte), we use "cat" once we've skipped past the
header. Depending on the size of your commit messages, this
is probably faster (you pay the cost to fork, but then read
the data in saner-sized chunks). This idea is shamelessly
stolen from Junio.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-04-29 10:01:04 -07:00
Charles Bailey
79bc4ef368 filter-branch: eliminate duplicate mapped parents
When multiple parents of a merge commit get mapped to the same
commit, filter-branch used to pass all instances of the parent
commit to the parent and commit filters and to "git commit-tree" or
"git_commit_non_empty_tree".

This can often happen when extracting a small project from a large
repository; merges can join history with no commits on any branch
which affect the paths being retained.  Once the intermediate
commits have been filtered out, all the immediate parents of the
merge commit can end up being mapped to the same commit - either the
original merge-base or an ancestor of it.

"git commit-tree" would display an error but write the commit with
the normalized parents in any case.  "git_commit_non_empty_tree"
would fail to notice that the commit being made was in fact a
non-merge commit and would retain it even if a further pass with
"--prune-empty" would discard the commit as empty.

Ensure that duplicate parents are pruned before the parent filter to
make "--prune-empty" idempotent, removing all empty non-merge
commits in a singe pass.

Signed-off-by: Charles Bailey <cbailey32@bloomberg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-01 08:30:41 -07:00
Junio C Hamano
f52752d36a Merge branch 'lc/filter-branch-too-many-refs'
"git filter-branch" in a repository with many refs blew limit of
command line length.

* lc/filter-branch-too-many-refs:
  Allow git-filter-branch to process large repositories with lots of branches.
2013-10-17 15:55:12 -07:00
Lee Carver
3361a548db Allow git-filter-branch to process large repositories with lots of branches.
A recommended way to move trees between repositories is to use
git-filter-branch to revise the history for a single tree:

However, this can lead to "argument list too long" errors when the
original repository has many retained branches (>6k)

    /usr/local/git/libexec/git-core/git-filter-branch: line 270:
    /usr/local/git/libexec/git-core/git: Argument list too long
    Could not get the commits

Saving the output from rev-parse and feeding it into rev-list from
its standard input avoids this problem, since the rev-parse output
is not processed as a command line argument.

Signed-off-by: Lee Carver <Lee.Carver@servicenow.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-12 11:00:51 -07:00
Jeff King
83bd7437ca write_index: optionally allow broken null sha1s
Commit 4337b58 (do not write null sha1s to on-disk index,
2012-07-28) added a safety check preventing git from writing
null sha1s into the index. The intent was to catch errors in
other parts of the code that might let such an entry slip
into the index (or worse, a tree).

Some existing repositories may have invalid trees that
contain null sha1s already, though.  Until 4337b58, a common
way to clean this up would be to use git-filter-branch's
index-filter to repair such broken entries.  That now fails
when filter-branch tries to write out the index.

Introduce a GIT_ALLOW_NULL_SHA1 environment variable to
relax this check and make it easier to recover from such a
history.

It is tempting to not involve filter-branch in this commit
at all, and instead require the user to manually invoke

	GIT_ALLOW_NULL_SHA1=1 git filter-branch ...

to perform an index-filter on a history with trees with null
sha1s.  That would be slightly safer, but requires some
specialized knowledge from the user.  So let's set the
GIT_ALLOW_NULL_SHA1 variable automatically when checking out
the to-be-filtered trees.  Advice on using filter-branch to
remove such entries already exists on places like
stackoverflow, and this patch makes it Just Work again on
recent versions of git.

Further commands that touch the index will still notice and
fail, unless they actually remove the broken entries.  A
filter-branch whose filters do not touch the index at all
will not error out (since we complain of the null sha1 only
on writing, not when making a tree out of the index), but
this is acceptable, as we still print a loud warning, so the
problem is unlikely to go unnoticed.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-28 20:54:43 -07:00
Junio C Hamano
9a11f13d9e Merge branch 'jk/filter-branch-come-back-to-original'
When used with "-d temporary-directory" option, "git filter-branch"
failed to come back to the original working tree to perform the
final clean-up procedure.

* jk/filter-branch-come-back-to-original:
  filter-branch: return to original dir after filtering
2013-04-07 14:29:34 -07:00
Jeff King
97276019bb filter-branch: return to original dir after filtering
The first thing filter-branch does is to create a temporary
directory, either ".git-rewrite" in the current directory
(which may be the working tree or the repository if bare),
or in a directory specified by "-d". We then chdir to
$tempdir/t as our temporary working directory in which to run
tree filters.

After finishing the filter, we then attempt to go back to
the original directory with "cd ../..". This works in the
.git-rewrite case, but if "-d" is used, we end up in a
random directory. The only thing we do after this chdir is
to run git-read-tree, but that means that:

  1. The working directory is not updated to reflect the
     filtered history.

  2. We dump random files into "$tempdir/.." (e.g., if you
     use "-d /tmp/foo", we dump junk into /tmp).

Fix it by recording the full path to the original directory
and returning there explicitly.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-02 13:34:55 -07:00
Jeff King
3c730fab2c filter-branch: use git-sh-setup's ident parsing functions
This saves us some code, but it also reduces the number of
processes we start for each filtered commit. Since we can
parse both author and committer in the same sed invocation,
we save one process. And since the new interface avoids tr,
we save 4 processes.

It also avoids using "tr", which has had some odd
portability problems reported with from Solaris's xpg6
version.

We also tweak one of the tests in t7003 to double-check that
we are properly exporting the variables (because test-lib.sh
exports GIT_AUTHOR_NAME, it will be automatically exported
in subprograms. We override this to make sure that
filter-branch handles it properly itself).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-18 15:43:49 -07:00
Junio C Hamano
9a0231b395 Merge branch 'jc/maint-filter-branch-epoch-date'
In 1.7.9 era, we taught "git rebase" about the raw timestamp format
but we did not teach the same trick to "filter-branch", which rolled
a similar logic on its own.  Because of this, "filter-branch" failed
to rewrite commits with ancient timestamps.

* jc/maint-filter-branch-epoch-date:
  t7003: add test to filter a branch with a commit at epoch
  date.c: Fix off by one error in object-header date parsing
  filter-branch: do not forget the '@' prefix to force git-timestamp
2012-07-22 12:55:05 -07:00
Junio C Hamano
cb102b0832 filter-branch: do not forget the '@' prefix to force git-timestamp
For some reason, this script reinvents, instead of refactoring the
existing one in git-sh-setup, the logic to grab ident information
from an existing commit; it was missed when the corresponding logic
in git-sh-setup was updated with 2c733fb (parse_date(): '@' prefix
forces git-timestamp, 2012-02-02).

Teach the script that it is OK to have a way ancient timestamp in
the commits that are being filtered.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-09 20:42:54 -07:00
Junio C Hamano
9c14001650 Merge branch 'jk/filter-branch-require-clean-work-tree'
* jk/filter-branch-require-clean-work-tree:
  filter-branch: use require_clean_work_tree
2011-10-05 12:35:55 -07:00
Jeff King
5347a50fec filter-branch: use require_clean_work_tree
Filter-branch already requires that we have a clean work
tree before starting. However, it failed to refresh the
index before checking, which means it could be wrong in the
case of stat-dirtiness.

Instead of simply adding a call to refresh the index, let's
switch to using the require_clean_work_tree function
provided by git-sh-setup. It does exactly what we want, and
with fewer lines of code and more specific output messages.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-15 16:58:55 -07:00
Junio C Hamano
1461205880 Merge branch 'js/sh-style'
* js/sh-style:
  filter-branch.sh: de-dent usage string
  misc-sh: fix up whitespace in some other .sh files.
2011-08-17 17:35:50 -07:00
Michael Witten
0906f6e14e filter-branch: Export variable `workdir' for --commit-filter
According to `git help filter-branch':

       --commit-filter <command>
           ...
           You can use the _map_ convenience function in this filter,
           and other convenience functions, too...
           ...

However, it turns out that `map' hasn't been usable because it depends
on the variable `workdir', which is not propogated to the environment
of the shell that runs the commit-filter <command> because the
shell is created via a simple-command rather than a compound-command
subshell:

 @SHELL_PATH@ -c "$filter_commit" "git commit-tree" \
                 $(git write-tree) $parentstr < ../message > ../map/$commit ||
                         die "could not write rewritten commit"

One solution is simply to export `workdir'. However, it seems rather
heavy-handed to export `workdir' to the environments of all commands,
so instead this commit exports `workdir' for only the duration of the
shell command in question:

 workdir=$workdir @SHELL_PATH@ -c "$filter_commit" "git commit-tree" \
                 $(git write-tree) $parentstr < ../message > ../map/$commit ||
                         die "could not write rewritten commit"

Signed-off-by: Michael Witten <mfwitten@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-08 12:09:38 -07:00
Junio C Hamano
9dcca58db4 filter-branch.sh: de-dent usage string
"Usage: git filter-branch " that is prefixed to the first line is 25
columns long, so the "[--index-filter ..." on the second line would not
align with "[--env-filter ..." on the first line to begin with. If the
second and subsequent lines do not aim to align with anything on the
first line, it is just fine to indent them with a single HT.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-05 15:06:21 -07:00
Jon Seymour
285c6cbf3c misc-sh: fix up whitespace in some other .sh files.
I found that the patched 4 files were different when this
filter is applied.

	expand -i | unexpand --first-only

This patch contains the corrected files.

Signed-off-by: Jon Seymour <jon.seymour@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-05 15:04:48 -07:00
Csaba Henk
7ec344d802 filter-branch: retire --remap-to-ancestor
We can be clever and know by ourselves when we need the behavior
implied by "--remap-to-ancestor". No need to encumber users by having
them exposed to it as a tunable. (Option kept for backward compatibility,
but it's now a no-op.)

Signed-off-by: Csaba Henk <csaba@gluster.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-27 16:47:01 -07:00
Junio C Hamano
618d18b5aa Merge branch 'maint'
* maint:
  filter-branch: Fix error message for --prune-empty --commit-filter
2010-02-11 23:06:32 -08:00
Jacob Helwig
5da8171370 filter-branch: Fix error message for --prune-empty --commit-filter
Running filter-branch with --prune-empty and --commit-filter reports:

  "Cannot set --prune-empty and --filter-commit at the same time".

Change it to use the correct option name: --commit-filter

Signed-off-by: Jacob Helwig <jacob.helwig@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-11 22:12:36 -08:00
Junio C Hamano
4b7acc186f Merge branch 'ms/filter-branch-submodule'
* ms/filter-branch-submodule:
  filter-branch: Add tests for submodules in tree-filter
  filter-branch: Fix to allow replacing submodules with another content
2010-02-02 21:48:34 -08:00
Michal Sojka
03ca839537 filter-branch: Fix to allow replacing submodules with another content
When git filter-branch is used to replace a submodule with another
content, it always fails on the first commit.

Consider a repository with submod directory containing a submodule.  The
following command to remove the submodule and replace it with a file fails:

    git filter-branch --tree-filter 'rm -rf submod &&
                                     git rm -q submod &&
                                     mkdir submod &&
                                     touch submod/file'

with an error:

    error: submod: is a directory - add files inside instead

The reason is that git diff-index, which generates the first part of the
list of files updated by the tree filter, emits also the removed submodule
even if it was replaced by a real directory.

Signed-off-by: Michal Sojka <sojkam1@fel.cvut.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-28 13:49:53 -08:00
Stephen Boyd
9524cf2993 fix portability issues with $ in double quotes
Using a dollar sign in double quotes isn't portable. Escape them with
a backslash or replace the double quotes with single quotes.

Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-26 15:16:54 -08:00
Junio C Hamano
f012d27ff3 Merge branch 'js/filter-branch-prime'
* js/filter-branch-prime:
  filter-branch: remove an unnecessary use of 'git read-tree'
2010-01-07 15:40:30 -08:00
Johannes Sixt
7ee6376938 filter-branch: remove an unnecessary use of 'git read-tree'
The intent of this particular call to 'git read-tree' was to fill an
index. But in fact, it only allocated an empty index. Later in the
program, the index is filled anyway by calling read-tree with specific
commits, and considering that elsewhere the index is even removed (i.e.,
it is not relied upon that the index file exists), this first call of
read-tree is completely redundant.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-12-15 16:20:23 -08:00
Junio C Hamano
ad7ace714d Merge branch 'rs/work-around-grep-opt-insanity'
* rs/work-around-grep-opt-insanity:
  Protect scripted Porcelains from GREP_OPTIONS insanity
  mergetool--lib: simplify guess_merge_tool()

Conflicts:
	git-instaweb.sh
2009-11-25 11:45:07 -08:00
Junio C Hamano
e1622bfcba Protect scripted Porcelains from GREP_OPTIONS insanity
If the user has exported the GREP_OPTIONS environment variable, the output
from "grep" and "egrep" in scripted Porcelains may be different from what
they expect.  For example, we may want to count number of matching lines,
by "grep" piped to "wc -l", and GREP_OPTIONS=-C3 will break such use.

The approach taken by this change to address this issue is to protect only
our own use of grep/egrep.  Because we do not unset it at the beginning of
our scripts, hook scripts run from the scripted Porcelains are exposed to
the same insanity this environment variable causes when grep/egrep is used
to implement logic (e.g. "grep | wc -l"), and it is entirely up to the
hook scripts to protect themselves.

On the other hand, applypatch-msg hook may want to show offending words in
the proposed commit log message using grep to the end user, and the user
might want to set GREP_OPTIONS=--color to paint the match more visibly.
The approach to protect only our own use without unsetting the environment
variable globally will allow this use case.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-23 16:31:07 -08:00
Thomas Rast
f2f3a6b802 filter-branch: nearest-ancestor rewriting outside subdir filter
Since a0e4639 (filter-branch: fix ref rewriting with
--subdirectory-filter, 2008-08-12) git-filter-branch has done
nearest-ancestor rewriting when using a --subdirectory-filter.

However, that rewriting strategy is also a useful building block in
other tasks.  For example, if you want to split out a subset of files
from your history, you would typically call

  git filter-branch -- <refs> -- <files>

But this fails for all refs that do not point directly to a commit
that affects <files>, because their referenced commit will not be
rewritten and the ref remains untouched.

The code was already there for the --subdirectory-filter case, so just
introduce an option that enables it independently.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-13 11:26:43 -08:00
Thomas Rast
2c1d2d8188 filter-branch: stop special-casing $filter_subdir argument
Handling $filter_subdir in the usual way requires a separate case at
every use, because the variable is empty when unused.

Furthermore, --subdirectory-filter supplies its own '--', and if the user
provided one himself, such as in

  git filter-branch --subdirectory-filter subdir -- --all -- subdir/file

	an extra '--' was used as path filter in the call to git-rev-list that
determines the commits that shall be rewritten.

To keep the argument handling sane, we filter $@ to contain only the
non-revision arguments, and store all revisions in $ref_args.  The
$ref_args are easy to handle since only the SHA1s are needed; the
actual branch names have already been stored in $tempdir/heads at this
point.

An extra separating -- is only required if the user did not provide
any non-revision arguments, as the latter disambiguate the
$filter_subdir following after them (or fail earlier because they are
ambiguous themselves).

Thanks to Johannes Sixt for suggesting this solution.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-13 11:26:43 -08:00
Matthieu Moy
83e355a62c filter-branch: make the usage string fit on 80 chars terminals.
It used to be a single, huge line, badly wrapped by xterm.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-18 13:35:55 -07:00
Dan Loewenherz
7bd93c1c62 Convert to use quiet option when available
A minor fix that eliminates usage of "2>/dev/null" when --quiet or
-q has already been implemented.

Signed-off-by: Dan Loewenherz <daniel.loewenherz@yale.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-22 19:05:37 -07:00
Elijah Newren
d5b0c97d13 git-filter-branch: avoid collisions with variables in eval'ed commands
Avoid using simple variable names like 'i', since user commands are eval'ed
and may clash with and overwrite our values.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-25 16:35:52 -07:00
John Tapsell
734cd5726c Improve error message for git-filter-branch
Tell the user that a backup (original) already exists, and how to solve
this problem (with -f option)

Signed-off-by: John Tapsell <johnflux@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-04 00:56:52 -08:00
Lars Noschinski
88e38808cd filter-branch -d: Export GIT_DIR earlier
The improved error handling catches a bug in filter-branch when using
-d pointing to a path outside any git repository:

$ git filter-branch -d /tmp/foo master
fatal: Not a git repository (or any of the parent directories): .git

This error message comes from git for-each-ref in line 224. GIT_DIR is
set correctly by git-sh-setup (to the foo.git repository), but not
exported (yet).

Signed-off-by: Lars Noschinski <lars@public.noschinski.de>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-18 11:15:17 -08:00
Eric Kidd
0ea29cce4d filter-branch: Add more error-handling
9273b56 (filter-branch: Fix fatal error on bare repositories, 2009-02-03)
fixed a missing check of return status from an underlying command in
git-filter-branch, but there still are places that do not check errors.
For example, the command does not pay attention to the exit status of the
command given by --commit-filter.  It should abort in such a case.

This attempts to fix all the remaining places that fails to checks errors.

In two places, I've had to break apart pipelines in order to check the
error code for the first stage of the pipeline, as discussed here:

  http://kerneltrap.org/mailarchive/git/2009/1/28/4835614

Feedback on this patch was provided by Johannes Sixt, Johannes Schindelin
and Junio C Hamano.  Thomas Rast helped with pipeline error handling.

Signed-off-by: Eric Kidd <git@randomhacks.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-11 18:26:52 -08:00
Junio C Hamano
df4364a429 Merge branch 'js/filter-branch-submodule'
* js/filter-branch-submodule:
  filter-branch: do not consider diverging submodules a 'dirty worktree'
  filter-branch: Fix fatal error on bare repositories
2009-02-07 11:09:48 -08:00
Johannes Schindelin
26be15f09d filter-branch: do not consider diverging submodules a 'dirty worktree'
At the end of filter-branch in a non-bare repository, the work tree is
updated with "read-tree -m -u HEAD", to carry the change forward in case
the current branch was rewritten.  In order to avoid losing any local
change during this step, filter-branch refuses to work when there are
local changes in the work tree.

This "read-tree -m -u HEAD" operation does not affect what commit is
checked out in a submodule (iow, it does not touch .git/HEAD in a
submodule checkout), and checking if there is any local change to the
submodule is not useful.

Staged submodules _are_ considered to be 'dirty', however,  as the
"read-tree -m -u HEAD" could result in loss of staged information
otherwise.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-05 17:48:04 -08:00
Eric Kidd
9273b56278 filter-branch: Fix fatal error on bare repositories
When git filter-branch is run on a bare repository, it prints out a fatal
error message:

  $ git filter-branch branch
  Rewrite 476c4839280c219c2317376b661d9d95c1727fc3 (9/9)
  WARNING: Ref 'refs/heads/branch' is unchanged
  fatal: This operation must be run in a work tree

Note that this fatal error message doesn't prevent git filter-branch from
exiting successfully. (Why doesn't git filter-branch actually exit with an
error when a shell command fails? I'm not sure why it was designed this
way.)

This error message is caused by the following section of code at the end of
git-filter-branch.sh:

  if [ "$(is_bare_repository)" = false ]; then
          unset GIT_DIR GIT_WORK_TREE GIT_INDEX_FILE
          test -z "$ORIG_GIT_DIR" || {
                  GIT_DIR="$ORIG_GIT_DIR" && export GIT_DIR
          }
          ... elided ...
          git read-tree -u -m HEAD
  fi

The problem is the call to $(is_bare_repository), which is made before
GIT_DIR and GIT_WORK_TREE are restored.  This call always returns "false",
even when we're running in a bare repository.  But this means that we will
attempt to call 'git read-tree' even in a bare repository, which will fail
and print an error.

This patch modifies git-filter-branch.sh to restore the original
environment variables before trying to call is_bare_repository.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-03 21:54:02 -08:00
Pierre Habouzit
d3240d935c filter-branch: add git_commit_non_empty_tree and --prune-empty.
git_commit_non_empty_tree is added to the functions that can be run from
commit filters. Its effect is to commit only commits actually touching the
tree and that are not merge points either.

The option --prune-empty is added. It defaults the commit-filter to
'git_commit_non_empty_tree "$@"', and can be used with any other
combination of filters, except --commit-hook that must used
'git_commit_non_empty_tree "$@"' where one puts 'git commit-tree "$@"'
usually to achieve the same result.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-10 17:35:58 -08:00
Miklos Vajna
d69da76dd0 filter-branch: use git rev-parse -q
Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-03 14:27:17 -08:00
Junio C Hamano
01914577ed Merge branch 'tr/filter-branch'
* tr/filter-branch:
  revision --simplify-merges: make it a no-op without pathspec
  revision --simplify-merges: do not leave commits unprocessed
  revision --simplify-merges: use decoration instead of commit->util field
  Documentation: rev-list-options: move --simplify-merges documentation
  filter-branch: use --simplify-merges
  filter-branch: fix ref rewriting with --subdirectory-filter
  filter-branch: Extend test to show rewriting bug
  Topo-sort before --simplify-merges
  revision traversal: show full history with merge simplification
  revision.c: whitespace fix
2008-09-02 17:47:13 -07:00
Johannes Sixt
a9da1663df filter-branch: Grok special characters in tag names
The tag rewriting code used a 'sed' expression to substitute the new tag
name into the corresponding field of the annotated tag object. But this is
problematic if the tag name contains special characters. In particular,
if the tag name contained a slash, then the 'sed' expression had a syntax
error. We now protect against this by using 'printf' to assemble the
tag header.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-21 23:39:13 -07:00
Thomas Rast
f34a9416ab filter-branch: use --simplify-merges
Use rev-list --simplify-merges everywhere.  This changes the behaviour
of --subdirectory-filter in cases such as

  O -- A -\
   \       \
    \- B -- M

where A and B bring the same changes to the subdirectory: It now keeps
both sides of the merge.  Previously, the history would have been
simplified to 'O -- A'.  Merges of unrelated side histories that never
touch the subdirectory are still removed.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-12 17:27:46 -07:00
Thomas Rast
a0e46390d3 filter-branch: fix ref rewriting with --subdirectory-filter
The previous ancestor discovery code failed on any refs that are
(pre-rewrite) ancestors of commits marked for rewriting.  This means
that in a situation

   A -- B(topic) -- C(master)

where B is dropped by --subdirectory-filter pruning, the 'topic' was
not moved up to A as intended, but left unrewritten because we asked
about 'git rev-list ^master topic', which does not return anything.

Instead, we use the straightforward

   git rev-list -1 $ref -- $filter_subdir

to find the right ancestor.  To justify this, note that the nearest
ancestor is unique: We use the output of

  git rev-list --parents -- $filter_subdir

to rewrite commits in the first pass, before any ref rewriting.  If B
is a non-merge commit, the only candidate is its parent.  If it is a
merge, there are two cases:

- All sides of the merge bring the same subdirectory contents.  Then
  rev-list already pruned away the merge in favour for just one of its
  parents, so there is only one candidate.

- Some merge sides, or the merge outcome, differ.  Then the merge is
  not pruned and can be rewritten directly.

So it is always safe to use rev-list -1.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-12 17:27:17 -07:00
Thomas Rast
261044e85d filter-branch: be more helpful when an annotated tag changes
Previously, git-filter-branch failed if it attempted to update an
annotated tag.  Now we ignore this condition if --tag-name-filter is
given, so that we can later rewrite the tag.  If no such option was
provided, we warn the user that he might want to run with
"--tag-name-filter cat" to achieve the intended effect.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-08 16:19:51 -07:00
Petr Baudis
a4661b018d git-filter-branch.sh: Allow running in bare repositories
Commit 46eb449c restricted git-filter-branch to non-bare repositories
unnecessarily; git-filter-branch can work on bare repositories just
fine.

Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-23 16:57:04 -07:00
Junio C Hamano
b71ce7f3f1 Merge 1.5.5.3 in 2008-05-27 22:34:19 -07:00
Johannes Sixt
a17171b473 Revert "filter-branch: subdirectory filter needs --full-history"
This reverts commit cfabd6eee1. I had
implemented it without understanding what --full-history does. Consider
this history:

    C--M--N
   /  /  /
  A--B  /
   \   /
    D-/

where B and C modify a path, X, in the same way so that the result is
identical, and D does not modify it at all. With the path limiter X and
without --full-history this is simplified to

   A--B

i.e. only one of the paths via B or C is chosen. I had assumed that
--full-history would keep both paths like this

    C--M
   /  /
  A--B

removing the path via D; but in fact it keeps the entire history.

Currently, git does not have the capability to simplify to this
intermediary case. However, the other extreme to keep the entire history
is not wanted either in usual cases. I think we can expect that histories
like the above are rare, and in the usual cases we want a simplified
history. So let's remove --full-history again.

(Concerning t7003, subsequent tests depend on what the test case sets up,
so we can't just back out the entire test case.)

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-27 22:27:30 -07:00
Jeff King
0bdf93cbf0 filter-branch: fix variable export logic
filter-branch tries to restore "old" copies of some
environment variables by using the construct:

  unset var
  test -z "$old_var" || var="$old_var" && export var

This is just wrong.  AND-list and OR-list operators && and || have equal
precedence and they bind left to right.  The second term, var="$old"
assignment always succeeds, so we always end up exporting var.

On bash and dash, exporting an unset variable has no effect. However, on
some shells (such as FreeBSD's /bin/sh), the shell exports the empty
value.

This manifested itself in this case as git-filter-branch setting
GIT_INDEX_FILE to the empty string, which in turn caused its call to
git-read-tree to fail, leaving the working tree pointing at the original
HEAD instead of the rewritten one.

To fix this, we change the short-circuit logic to better match the intent:

  test -z "$old_var" || {
    var="$old_var" && export var
  }

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-13 21:45:28 -07:00
Junio C Hamano
e9dd751866 Merge branch 'bc/filter-branch'
* bc/filter-branch:
  filter-branch.sh: support nearly proper tag name filtering
2008-05-05 19:16:20 -07:00