Starting from big objects and going backwards means that we end up
picking a delta that goes from a bigger object to a smaller one. That's
advantageous for two reasons: the bigger object is likely the newer one
(since things tend to grow, rather than shrink), and doing a delete
tends to be smaller than doing an add.
So the deltas don't tend to be top-of-tree, and the packed end result is
just slightly smaller.
This will scan 2 or more object repositories and look for common objects, check
if they are hardlinked, and replace one with a hardlink to the other if not.
This version warns when skipping files because of size differences, and
handles more than 2 repositories automatically.
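
A minimal sketch of the relinking idea, not the script from this commit. The function name relink() and the temporary-file suffix are hypothetical, and it assumes both object directories live on the same filesystem so that hardlinks are possible:

```python
import os

def relink(master_dir, other_dir):
    """Hardlink files in other_dir to same-named files in master_dir.

    Returns (linked, skipped) counts. Hypothetical helper for illustration.
    """
    linked = skipped = 0
    for root, _dirs, files in os.walk(master_dir):
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(other_dir, os.path.relpath(src, master_dir))
            if not os.path.isfile(dst):
                continue  # object is not common to both repositories
            if os.path.samefile(src, dst):
                continue  # already hardlinked, nothing to do
            if os.path.getsize(src) != os.path.getsize(dst):
                print(f"warning: size mismatch, skipping {dst}")
                skipped += 1
                continue
            tmp = dst + ".tmp-relink"
            os.link(src, tmp)      # create the new link first
            os.replace(tmp, dst)   # then swap it into place
            linked += 1
    return linked, skipped
```

To cover more than 2 repositories, one would call relink(master, other) once for each additional repository.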
Signed-off-by: Ryan Anderson <ryan@michonline.com>
Cheered-on-by: Jeff Garzik <jgarzik@pobox.com>
Acked-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If you have two lists of heads, and you want to see ones reachable from
list $a but not from list $b, just do
git-rev-list $(git-rev-parse $a --not $b)
which is useful for both bisecting (where "b" would be the list of known
good revisions, and "a" would be the latest found bad head) and for just
seeing what the difference between two sets of heads is if you want to
generate a pack-file for the difference.
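
The "reachable from $a but not from $b" semantics can be sketched on a toy commit graph. This is only an illustration of the set difference, not how git-rev-list is implemented; the graph and the function names are made up:

```python
def reachable(parents, heads):
    """Return every commit reachable from any of the given heads."""
    seen = set()
    stack = list(heads)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen

def rev_list(parents, include, exclude):
    """Commits reachable from `include` but not from `exclude`."""
    return reachable(parents, include) - reachable(parents, exclude)

# Toy history: A <- B <- C, B <- D, and E merges C and D.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["C", "D"]}
print(sorted(rev_list(parents, ["E"], ["C"])))  # ['D', 'E']
```

For bisection, `exclude` would hold the known-good revisions and `include` the latest bad head.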
This actually successfully packed and unpacked a git archive down to
1.3MB (17MB unpacked).
Right now unpacking is way too noisy, lots of debug messages left.
This finishes the initial round of git-pack-object /
git-unpack-object pair. They are now good enough to be used as
a transport medium:
 - Fix delta direction in pack-objects; the original was
   computing delta to create the base object from the object to
   be squashed, which was quite unfriendly for unpacker ;-).
 - Add a script to test the very basics.
 - Implement unpacker for both regular and deltified objects.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A zero disables delta generation (like before), but we make the window
be one bigger than specified, since we use one entry for the one to be
tested (it used to be that "--window=1" was meaningless, since we'd have
used up the single-entry window with the entry to be tested, and had no
chance of actually ever finding a delta).
The default window remains at 10, but now it really means "test the 10
closest objects", not "test the 9 closest objects".
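
A rough sketch of the window arithmetic, assuming a ring buffer of window+1 slots in which one slot always holds the object under test. The function name and the ring layout are illustrative, not the pack-objects code:

```python
def delta_candidates(objects, window=10):
    """Pair each object with the earlier objects it would be tested against."""
    if window == 0:
        return [(obj, []) for obj in objects]  # delta generation disabled
    slots = window + 1  # one slot is used by the object being tested
    ring = [None] * slots
    result = []
    for i, obj in enumerate(objects):
        idx = i % slots
        ring[idx] = obj
        bases = []
        # Walk backwards over the remaining slots: these are the bases to try.
        for back in range(1, slots):
            other = ring[(idx - back) % slots]
            if other is None:
                break
            bases.append(other)
        result.append((obj, bases))
    return result
```

With window=1 every object after the first gets exactly one candidate base, which is the case that used to be meaningless.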
Anything that generates a delta to see if two objects are close usually
isn't interested in the delta if it ends up being bigger than some specified
size, and this allows us to stop delta generation early when that
happens.
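
The size cap can be illustrated with a toy delta builder. This is not git's delta format: the copy/insert costs are invented, and difflib does its matching up front, so only the emission of the delta is cut short here:

```python
import difflib

def make_delta(base, target, max_size=None):
    """Build a toy copy/insert delta; return None once it exceeds max_size."""
    ops = []
    size = 0
    matcher = difflib.SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "delete":
            continue  # dropping base data costs nothing in the delta
        if tag == "equal":
            op = ("copy", i1, i2 - i1)
            cost = 2  # assumed fixed cost of a copy instruction
        else:
            op = ("insert", target[j1:j2])
            cost = 1 + (j2 - j1)  # assumed: opcode plus literal data
        size += cost
        if max_size is not None and size > max_size:
            return None  # caller only wanted a small delta; give up early
        ops.append(op)
    return ops
```

A caller that is comparing candidates would pass the size of the best delta found so far as max_size.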
When Junio fixed the lack of a successful error code from try_delta(),
that uncovered an off-by-one error in the caller.
Also, some testing made it clear that we now find a lot more deltas,
because we used to (incorrectly) break early on bogus "failure"
cases.
Return value of try_delta is checked for negativeness, but the
success path does not return anything, letting compiler warn and
presumably return garbage.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Describe what to implement in fetch() and fetch_ref() for
pull backend writers a bit better.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
An earlier change to optimize directory-file conflict check
broke what "read-tree --emu23" expects. This is fixed by this
commit.
(1) Introduces an explicit flag to tell add_cache_entry() not to
    check for conflicts and use it when reading an existing tree
    into an empty stage --- by definition this case can never
    introduce such conflicts.
(2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name()
    aware of the cache stages, and flag conflict only with paths
    in the same stage.
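
The stage-aware check in (2) could be pictured roughly as follows. This is a simplified model with the cache as a set of (path, stage) pairs; has_df_conflict() is a hypothetical name and not the has_file_name()/has_dir_name() code:

```python
def has_df_conflict(cache, path, stage):
    """True if `path` collides with a file or directory in the same stage."""
    for other, other_stage in cache:
        if other_stage != stage:
            continue  # entries in other stages never conflict
        if other.startswith(path + "/") or path.startswith(other + "/"):
            return True  # one path is a leading directory of the other
    return False

cache = {("DF", 1), ("DF/DF", 2)}
print(has_df_conflict(cache, "DF/DF", 1))  # True: file DF exists in stage 1
print(has_df_conflict(cache, "DF", 3))     # False: stage 3 is empty
```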
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When a merge adds a file DF and removes a directory there by
deleting a path DF/DF, git-merge-one-file-script can be called
for the removal of DF/DF when the path DF is already created by
"git-read-tree -m -u". When this happens, we get confused by a
failure return from 'rm -f -- "$4"' (where $4 is DF/DF); finding
file DF there the "rm -f" command complains that DF is not a
directory.
What we want to ensure is that there is no file DF/DF in this
case. Avoid getting ourselves confused by first checking if
there is a file, and only then try to remove it (and check for
failure from the "rm" command).
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adds more tests for --emu23. One is to show how it can
carry forward more local changes than the straightforward
two-way fast forward, and another is to show the recent
overeager optimization of directory/file conflict check broke
things, which will be fixed in the next commit.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Using git-cherry, forward port local commits missing from the
new upstream head. This also depends on "-m" flag support in
git-commit-script.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The git-cherry command helps the git-rebase script by finding
commits that have not been merged upstream. Commits already
included in upstream are prefixed with '-' (meaning "drop from
my local pull"), while commits missing from upstream are
prefixed with '+' (meaning "add to the updated upstream").
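
As an illustration of the output convention only: the sketch below assumes each commit can be reduced to some patch fingerprint (for example a patch ID) and that upstream's fingerprints are already collected. The function and the data are hypothetical:

```python
def cherry(local_commits, upstream_patch_ids):
    """Mark local commits '-' if upstream already has the patch, else '+'."""
    lines = []
    for name, patch_id in local_commits:
        sign = "-" if patch_id in upstream_patch_ids else "+"
        lines.append(f"{sign} {name}")
    return lines

local = [("fix-typo", "p1"), ("add-feature", "p2")]
print(cherry(local, {"p1"}))  # ['- fix-typo', '+ add-feature']
```

A rebase script would then replay only the '+' entries on top of the new upstream head.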
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With -m flag specified, git-commit-script takes the commit
message along with author information from an existing commit.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Usually all of the match_xxx routines in date.c fill tm
structure assuming that the parsed string talks about local
time, and parse_date routine compensates for it by adjusting the
value with tz offset parsed out separately. However, this logic
does not work well when we feed GIT raw commit timestamp to it,
because what match_digits gets is already in GMT.
A good testcase is:
$ make test-date
$ ./test-date 'Fri Jun 24 16:55:27 2005 -0700' '1119657327 -0700'
These two timestamps represent the same time, but without the fix
this commit introduces, the second one comes out 7 hours off.
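
The equivalence of the two timestamps, and the size of the error, can be checked with a short sketch. This models the problem rather than date.c itself; the `buggy` flag and the helper names are invented for illustration:

```python
from datetime import datetime

def tz_offset_seconds(tz):
    """Convert '+HHMM' / '-HHMM' into seconds east of GMT."""
    sign = -1 if tz[0] == "-" else 1
    return sign * (int(tz[1:3]) * 3600 + int(tz[3:5]) * 60)

def parse_human(text):
    """Parse a 'Fri Jun 24 16:55:27 2005 -0700' style date into epoch seconds."""
    return int(datetime.strptime(text, "%a %b %d %H:%M:%S %Y %z").timestamp())

def parse_raw(text, buggy=False):
    """Parse a raw '<epoch> <tz>' commit timestamp into epoch seconds."""
    epoch, tz = text.split()
    seconds = int(epoch)
    if buggy:
        # Mistake being modelled: the value is already in GMT, yet it is
        # adjusted by the zone offset as if it had been local time.
        seconds -= tz_offset_seconds(tz)
    return seconds

human = parse_human("Fri Jun 24 16:55:27 2005 -0700")
print(human == parse_raw("1119657327 -0700"))                        # True
print((parse_raw("1119657327 -0700", buggy=True) - human) // 3600)   # 7
```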
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
So far it just reads the header and generates the list of objects.
It also sorts them by the order they are written in the pack file,
since that ends up being the same order we got them originally, and
is thus "most recent first".
This is kind of like a tar-ball for a set of objects, ready to be
shipped off to another end. Alternatively, you could use it as a packed
representation of the object database directly, if you changed
"read_sha1_file()" to read these kinds of packs.
The latter is particularly useful to generate a "packed history", ie you
could pack up your old history efficiently, but still have it available
(at a performance hit, of course).
I haven't actually written an unpacker yet, so the end result has not
been verified in any way yet. I obviously always write bug-free code,
so it just has to work, no?
git-write-tree failed when referenced objects only exist in the
GIT_ALTERNATE_OBJECT_DIRECTORIES path.
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Acked-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When you do
git-rev-list --objects $(git-rev-parse HEAD^..HEAD)
it now lists not only the "commit difference" between the parent of HEAD
and HEAD itself (which is normally just HEAD itself, but in the case of a
merge will be all the newly merged commits), but also all the new tree
and blob objects that weren't in the original.
NOTE! It doesn't walk all the way to the root, so it doesn't do a full
object search in the full old history. Instead, it will only look as
far back in the history as it needs to resolve the commits. Thus, if
the commit reverts a blob (or tree) back to a state much further back in
history, we may end up listing some blobs (or trees) as "new" even
though they exist further back.
Regardless, the list of objects will be a superset (usually exact) of
the objects needed to go from the beginning commit to ending commit.
As a particularly obvious special case,
git-rev-list --objects HEAD
will end up listing every single object that is reachable from the HEAD
commit.
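
The superset behaviour described in the NOTE can be modelled on a toy history. This is a simplified picture, assuming that only the objects of the boundary commits (excluded parents of listed commits) are treated as already known; the data layout and names are made up:

```python
def ancestors(commits, heads):
    """All commits reachable from the given heads."""
    seen, stack = set(), list(heads)
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(commits[c]["parents"])
    return seen

def rev_list_objects(commits, include, exclude):
    """Return (commits, objects) that look new when going exclude..include."""
    wanted = ancestors(commits, include) - ancestors(commits, exclude)
    # Only the direct excluded parents are inspected, not the full old history.
    boundary = {p for c in wanted for p in commits[c]["parents"]} - wanted
    old = set().union(*(commits[b]["objects"] for b in boundary))
    new = set().union(*(commits[c]["objects"] for c in wanted)) - old
    return wanted, new

# c3 reverts the blob back to the version that c1 had.
commits = {
    "c1": {"parents": [], "objects": {"blob_v1"}},
    "c2": {"parents": ["c1"], "objects": {"blob_v2"}},
    "c3": {"parents": ["c2"], "objects": {"blob_v1"}},
}
print(rev_list_objects(commits, ["c3"], ["c2"]))  # ({'c3'}, {'blob_v1'})
```

Here blob_v1 is reported as new even though it exists further back in c1, which is the over-listing the NOTE warns about.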
Side note: the objects are sorted by "recency", with commits first.
Right-click on a context row now brings up a menu allowing the user to
generate a diff between that row and the selected row. Left-click on
a graph line shows the parent and children connected by the line in
the details pane. Left-click on a circle in the graph selects that
commit. Left-click elsewhere in the graph does nothing.
When displaying a diff, the bottom-right file list box behaves
slightly differently now; instead of eliding all other files' diffs,
it now just scrolls the details pane so that the selected file's diff
starts at the top of the pane.
Since the diffs can be rather large, arrange for an update to be done
every 100ms while reading diffs.
Also removed the CVS revision keywords and bumped the version number
to 1.2.
Output default revisions as their hex SHA1 names to be consistent.
Add "--verify" flag that verifies that we output a single ref and not
more (and disables ref arguments).
A "patch ID" is nothing but a SHA1 of the diff associated with a patch,
with whitespace and line numbers ignored. As such, it's "reasonably
stable", but at the same time also reasonably unique, ie two patches
that have the same "patch ID" are almost guaranteed to be the same
thing.
IOW, you can use this thing to look for likely duplicate commits.
This way we don't get it in the commit message, even if the patch had
been generated by cogito (or CVS, ugh) and people didn't add the proper
"---" marker.
..and git-apply does a lot better job at it anyway.
Also, we break the comment/diff on a line that starts with "diff -", not
just on the "---" line. Especially for git diffs, we actually want that
line in the diff.
(We should probably also break on "Index: ..." followed by "=====")
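
The splitting rule might look roughly like this. It is a sketch only: split_mail() is a hypothetical name, and keeping the break line in the patch part in both cases is an assumption made here for simplicity:

```python
def split_mail(body):
    """Split a mail body into (commit message lines, patch lines)."""
    lines = body.splitlines()
    for i, line in enumerate(lines):
        # Break on the "---" marker or on the first "diff -" line.
        if line.startswith("---") or line.startswith("diff -"):
            return lines[:i], lines[i:]  # the break line stays with the patch
    return lines, []

body = "Fix bug\n\nExplain why.\ndiff --git a/f b/f\n--- a/f\n+++ b/f\n"
message, patch = split_mail(body)
print(message)   # ['Fix bug', '', 'Explain why.']
print(patch[0])  # diff --git a/f b/f
```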
This patch addresses the problem reported by Paul Mackerras, in which
--merge-order did not report the last root of a graph with a merge of
two independent roots.
Signed-off-by: Jon Seymour <jon.seymour@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
git-rev-list --merge-order is omitting one of the roots when
displaying a merge containing two distinct roots.
A subsequent patch will fix the problem.
Signed-off-by: Jon Seymour <jon.seymour@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We used to ignore unreachable tags, which just causes problems: it makes
"git prune" leave them around, but since we'll have pruned everything
that tag points to, the tag object really should be removed too.
So remove the code that made us think tags were always reachable.
The sensible cleanup of the in-memory storage order of commit parents broke the --merge-order
code which was dependent on the previous behaviour of parse_commit().
This patch restores the correct --merge-order behaviour by taking account of the
new behaviour of parse_commit().
Signed-off-by: Jon Seymour <jon.seymour@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
gitk is really quite incredibly cool, and is great for visualizing what
is going on in a git repository. It's especially useful when you are
looking at what has changed since a particular version, since it
gracefully handles partial trees (and this also avoids the expense of
looking at _all_ changes in a big project).
For example, to see what changed in a merge after a "git pull", do
gitk ORIG_HEAD..
to see only the new things. Or you can simply do "gitk v2.6.12.." to
see what has changed since the v2.6.12 tag etc.
This merge itself is pretty interesting too, since it shows off a
feature of git itself that is incredibly cool: you can merge a
_separate_ git project into another git project. Not only does this
keep all the history of the original project, it also makes it possible
to continue to merge with the original project and the union of the two
projects.
I don't think anybody else can do that.
This adds tests (which also serve as a demonstration) for the --stat
and --summary flags to the git-apply command.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Typical expected usage is "git-apply --stat --summary" to show
diffstat plus dense description of information available in git
extended headers, such as creations, renames, and mode changes.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When a patch is a git extended rename/copy patch, "git-apply
--stat" showed the old filename. Change it to show the new
filename, because most of the time we are interested in looking
at the resulting tree.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>