2005-05-20 23:00:23 +02:00
#!/bin/bash
2005-06-21 16:18:00 +02:00
# Example script to deltify an entire GIT repository based on the commit list.
# The most recent version of a file is the reference and previous versions
# are stored as deltas against the best earlier version available. And so on for
[PATCH] mkdelta enhancements (take 2)
Although it was described as such, git-mkdelta didn't really attempt to
find the best delta against any previous object in the list, but was
only able to create a delta against the preceding object. This patch
reworks the code to fix that limitation and hopefully makes it a bit
clearer than before, including fixing the delta loop detection which was
broken.
This means that
git-mkdelta sha1 sha2 sha3 sha4 sha5 sha6
will now create a sha2 delta against sha1, a sha3 delta against either
sha2 or sha1 and keep the best one, a sha4 delta against either sha3,
sha2 or sha1, etc. The --max-behind argument limits that search for the
best delta to the specified number of previous objects in the list. If
no limit is specified it is unlimited (note: it might run out of
memory with long object lists).
Also added a -q (quiet) switch so it is possible to have 3 levels of
output: -q for nothing, -v for verbose, and if none of -q nor -v is
specified then only actual changes on the object database are shown.
Finally the git-deltafy-script has been updated accordingly, and some
bugs fixed (thanks to Stephen C. Tweedie for spotting them).
This version has been thoroughly tested and I think it is ready
for public consumption.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 03:52:19 +02:00
# successive versions going back in time. This way the increasing delta
# overhead is pushed towards older versions of any given file.
#
# The -d argument sets a limit on the delta chain depth.
# If 0 is passed then everything is undeltafied. Limiting the delta
# depth matters for subsequent access performance to old revisions.
# A value of 16 might be a good compromise between performance and good
# space saving. Current default is unbounded.
#
# The --max-behind=30 argument is passed to git-mkdelta to keep
# combinations and memory usage bounded a bit. If you have lots of memory
# and CPU power you may remove it (or set to 0) to let git-mkdelta find the
# best delta match regardless of the number of revisions for a given file.
# You can also make the value smaller to make it faster and less
# memory hungry. A value of 5 ought to still give pretty good results.
# When set to 0 or omitted then look behind is unbounded. Note that
# git-mkdelta might die with a segmentation fault in that case if it
# runs out of memory. Note that the GIT repository will still be consistent
# even if git-mkdelta dies unexpectedly.

set -e

max_depth=
[ "$1" == "-d" ] && max_depth="--max-depth=$2" && shift 2

overlap=30
max_behind="--max-behind=$overlap"

function process_list() {
	if [ "$list" ]; then
		echo "Processing $curr_file"
		echo "$list" | xargs git-mkdelta $max_depth $max_behind -v
	fi
}

rev_list=""
curr_file=""

git-rev-list HEAD |
while true; do
	# Let's batch revisions into groups of 1000 to give it a chance to
	# scale with repositories containing long revision lists. We also
	# overlap with the previous batch the size of mkdelta's look behind
	# value in order to account for the processing discontinuity.
	rev_list="$(echo -e -n "$rev_list" | tail --lines=$overlap)"
	# Command substitution strips the trailing newline, so restore the
	# separator before appending the next batch.
	[ -z "$rev_list" ] || rev_list="$rev_list\n"
	for i in $(seq 1000); do
		read rev || break
		rev_list="$rev_list$rev\n"
	done
	echo -e -n "$rev_list" |
	git-diff-tree -r -t --stdin |
	awk '/^:/ { if ($5 == "M") printf "%s %s\n%s %s\n", $4, $6, $3, $6 }' |
	LC_ALL=C sort -s -k 2 | uniq |
	{
		while read sha1 file; do
			if [ "$file" == "$curr_file" ]; then
				list="$list $sha1"
			else
				process_list
				curr_file="$file"
				list="$sha1"
			fi
		done
		# This pipeline stage runs in a subshell, so flush the last
		# file's list here, before its variables are lost.
		process_list
	}
	[ "$rev" ] || break
done
process_list
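
The awk, sort and uniq stages of the main loop can be tried without a repository. The sketch below feeds them a few hand-written lines in the raw `git-diff-tree -r` layout; `sample_diff` and `pairs_by_file` are hypothetical names introduced only for this illustration, and the object names are shortened placeholders rather than real SHA-1 values.

```shell
#!/bin/bash
# Hand-written stand-in for `git-diff-tree -r` raw output, newest commit
# first. The object names are placeholders, not real SHA-1 values.
sample_diff() {
	printf ':100644 100644 aaa1 aaa2 M\tMakefile\n'
	printf ':100644 100644 bbb1 bbb2 M\tcache.h\n'
	printf ':100644 100644 aaa0 aaa1 M\tMakefile\n'
	printf ':000000 100644 0000 ccc1 A\tnew.c\n'
}

# Same stages as the script: for each modified file emit the new and the old
# blob as "sha file" pairs, group the pairs by file name while keeping their
# order (stable sort), then drop adjacent duplicates.
pairs_by_file() {
	awk '/^:/ { if ($5 == "M") printf "%s %s\n%s %s\n", $4, $6, $3, $6 }' |
	LC_ALL=C sort -s -k 2 | uniq
}

sample_diff | pairs_by_file
```

In this sample the old blob of the newer Makefile change is also the new blob of the older change, so the stable sort places the two identical pairs next to each other and `uniq` keeps one of them. The added file is skipped because only `M` entries are selected, and a path containing spaces would be cut at the first space because awk only prints `$6`.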

curr_file="root directory"
list="$(
	git-rev-list HEAD |
	while read commit; do
		git-cat-file commit $commit |
		sed -n 's/tree //p;Q'
	done
)"
process_list
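
The batching scheme described in the main loop (fixed-size groups of revisions, each one starting with the tail of the previous group) can be illustrated in isolation. This is a minimal sketch under stated assumptions, not the script's own code: `batch_with_overlap` is a hypothetical helper, it uses a bash array instead of the `echo -e`/`tail` string handling, and small numbers stand in for the 1000-revision batches and the overlap of 30.

```shell
#!/bin/bash
# batch_with_overlap SIZE OVERLAP: hypothetical helper, not part of the
# original script. Reads items from stdin and prints one batch per line,
# repeating the last OVERLAP items of each batch at the start of the next.
batch_with_overlap() {
	local size=$1 overlap=$2 item i start
	local -a window=()
	while true; do
		# Carry over only the tail of the previous batch.
		start=$(( ${#window[@]} - overlap ))
		if [ "$start" -gt 0 ]; then
			window=("${window[@]:start}")
		fi
		item=
		for ((i = 0; i < size; i++)); do
			read -r item || break
			window+=("$item")
		done
		echo "${window[*]}"
		# An empty $item means read hit the end of the input.
		[ "$item" ] || break
	done
}

seq 1 7 | batch_with_overlap 3 1
```

With a batch size of 3 and an overlap of 1, every batch after the first begins with the last item of the one before it, which is what lets a look-behind search continue across the batch boundary. As in the script, an input whose length is an exact multiple of the batch size produces one final batch holding only the carried-over items.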