The output from "symmetric diff", i.e. A...B, does not
distinguish between commits that are reachable from A and the
ones that are reachable from B. In this picture, such a
symmetric diff includes commits marked with a and b.
         x---b---b  branch B
        / \ /
       /   .
      /   / \
     o---x---a---a  branch A
However, you cannot tell which ones are 'a' and which ones are
'b' from the output. Sometimes this is frustrating. This adds
an output option, --left-right, to rev-list.
rev-list --left-right A...B
would show ones reachable from A prefixed with '<' and the ones
reachable from B prefixed with '>'.
When combined with --boundary, boundary commits (the ones marked
with 'x' in the above picture) are shown with prefix '-', so you
would see a list that looks like this:
git rev-list --left-right --boundary --pretty=oneline A...B
>bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb 3rd on b
>bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb 2nd on b
<aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 3rd on a
<aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 2nd on a
-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 1st on b
-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 1st on a
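As a rough illustration of the '<' and '>' marking only (a Python sketch over a toy parent map, not the rev-list code; the commit names and the helpers `reachable` and `left_right` are made up for this example, and --boundary is not modelled):

```python
def reachable(parents, tip):
    """Return the set of commits reachable from tip, tip included."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen


def left_right(parents, a, b):
    """List the symmetric difference a...b with a side marker on each commit."""
    from_a = reachable(parents, a)
    from_b = reachable(parents, b)
    marked = []
    # Commits reachable from exactly one side; sorted only to make the
    # toy output deterministic (real output order is a traversal order).
    for commit in sorted(from_a ^ from_b):
        marked.append(("<" if commit in from_a else ">") + commit)
    return marked


# The criss-cross picture above: both 'x' commits are reachable from both tips.
toy_parents = {
    "o": [],
    "xa": ["o"],
    "xb": ["o"],
    "a1": ["xa", "xb"],
    "a2": ["a1"],
    "b1": ["xb", "xa"],
    "b2": ["b1"],
}
print(left_right(toy_parents, "a2", "b2"))
```

Only the 'a' and 'b' commits appear, each tagged with the side it came from.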
Signed-off-by: Junio C Hamano <junkio@cox.net>
I was running git show on various commits found by fsck-objects
when I found this bug. Since find_unique_abbrev() cannot find
an abbreviation for an object not in the database, it will
return NULL, which is bad to run strlen() on. So instead, we'll
just display the unabbreviated sha1 that we referenced in the
commit.
I'm not sure that this is the best 'fix' for it because the
commit I was trying to show was broken, but I don't think a
program should segfault even if the user tries to do something
stupid.
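The shape of the fix, sketched in Python rather than C (`toy_find_unique_abbrev` is a hypothetical stand-in for the real function and `known` is a made-up object database; only the "missing object gives no abbreviation, so fall back to the full name" behaviour is taken from the description above):

```python
def toy_find_unique_abbrev(known, sha1, min_len=7):
    """Return the shortest unique prefix of sha1, or None if sha1 is unknown."""
    if sha1 not in known:
        return None  # mirrors the NULL return described above
    for n in range(min_len, len(sha1) + 1):
        prefix = sha1[:n]
        if sum(1 for k in known if k.startswith(prefix)) == 1:
            return prefix
    return sha1


def display_name(known, sha1):
    """Never hand a missing abbreviation to the caller; show the full name."""
    abbrev = toy_find_unique_abbrev(known, sha1)
    return abbrev if abbrev is not None else sha1


known = {"abcdef0123", "abcdef9999"}
print(display_name(known, "abcdef0123"))  # object present: abbreviated
print(display_name(known, "ffff000000"))  # object missing: shown in full
```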
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
I noticed that I was looking at the kernel gitweb output at some point
rather than just do "git log", simply because I liked seeing the
simplified date-format, i.e. the "5 days ago" rather than a full date.
This adds infrastructure to do that for "git log" too. It does NOT add the
actual flag to enable it, though, so right now this patch is a no-op, but
it should now be easy to add a command line flag (and possibly a config
file option) to just turn on the "relative" date format.
The exact cut-off points when it switches from days to weeks etc are
totally arbitrary, but are picked somewhat to avoid the "1 weeks ago"
thing (by making it show "10 days ago" rather than "1 week", or "70
minutes ago" rather than "1 hour ago").
[jc: with minor fix and tweak around "month" and "week" area.]
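To make the cut-off idea concrete, here is a small Python sketch. The specific thresholds (90 seconds, 90 minutes, 36 hours, 14 days, 10 weeks) and the name `relative_date` are assumptions chosen for illustration, not necessarily the values in the patch; the point is only that each unit is kept until the next one would read as a count of at least two.

```python
def relative_date(diff_seconds):
    """Format an age in seconds as a coarse 'N units ago' string."""
    if diff_seconds < 90:
        return f"{diff_seconds} seconds ago"
    minutes = (diff_seconds + 30) // 60  # round to the nearest minute
    if minutes < 90:
        return f"{minutes} minutes ago"
    hours = (minutes + 30) // 60
    if hours < 36:
        return f"{hours} hours ago"
    days = (hours + 12) // 24
    if days < 14:
        return f"{days} days ago"
    weeks = (days + 3) // 7
    if weeks < 10:
        return f"{weeks} weeks ago"
    months = (days + 15) // 30  # crude 30-day months, good enough for a sketch
    return f"{months} months ago"


print(relative_date(70 * 60))
print(relative_date(10 * 86400))
```

With these assumed thresholds, 70 minutes stays in minutes and 10 days stays in days, matching the examples given above.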
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Introduces global inline:
hashcmp(const unsigned char *sha1, const unsigned char *sha2)
Uses memcmp for comparison and returns the result based on the length of
the hash name (a future runtime decision).
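For readers who want the semantics spelled out, a Python analogue (the real helper is a C inline around memcmp; `HASH_LEN` is a name invented here, set to the 20 bytes of a binary SHA-1):

```python
HASH_LEN = 20  # bytes in a binary SHA-1; the one place to change for another hash


def hashcmp(sha1, sha2):
    """Three-way compare of two binary hashes, memcmp-style sign result."""
    a = sha1[:HASH_LEN]
    b = sha2[:HASH_LEN]
    # bytes objects compare lexicographically by unsigned byte value,
    # which is the same ordering memcmp uses.
    return (a > b) - (a < b)


low = bytes(20)
high = bytes([0xFF] * 20)
print(hashcmp(low, high), hashcmp(high, low), hashcmp(low, low))
```

Keeping the length in a single constant is what lets callers stay unchanged if the hash size ever becomes a runtime decision.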
Acked-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
[jc: I needed to hand merge the changes to the updated codebase,
so the result needs to be checked.]
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
format-patch previously didn't generate a newline after a subject. This
caused the diffstat to not be displayed in messages with only one line
for the commit message.
This patch fixes this by adding a newline after the headers if a body
hasn't been added.
Signed-off-by: Robert Shearman <rob@codeweavers.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This updates the type-enumeration constants introduced to reduce
the memory footprint of "struct object" to match the type bits
already used in the packfile format, by removing the former
(i.e. TYPE_* constant macros) and using the latter (i.e. enum
object_type) throughout the code for consistency.
Eventually we can stop passing around the "type strings"
entirely, and this will help - no confusion about two different
integer enumerations.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This removes the "contaminate the well even more" approach
taken in the current merge-base postprocessing code. Instead,
when there is more than one merge-base result, we compute the
merge-base between them and see if one is a fast-forward of the
other, in which case the ancestor is removed from the result.
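The intended result can be sketched as follows. Note the swap: this Python toy uses a plain reachability test to decide whether one candidate is an ancestor of another, instead of computing merge-bases between the candidates as described above; the graph and the names `reachable` and `reduce_bases` are invented for the example.

```python
def reachable(parents, tip):
    """Return the set of commits reachable from tip, tip included."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen


def reduce_bases(parents, candidates):
    """Drop every candidate that is an ancestor of another candidate."""
    kept = []
    for cand in candidates:
        is_ancestor = any(
            other != cand and cand in reachable(parents, other)
            for other in candidates
        )
        if not is_ancestor:
            kept.append(cand)
    return kept


toy = {"o": [], "x": ["o"], "y": ["x"], "z": ["o"]}
print(reduce_bases(toy, ["x", "y"]))  # y fast-forwards from x, so x goes
print(reduce_bases(toy, ["y", "z"]))  # unrelated candidates both stay
```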
Signed-off-by: Junio C Hamano <junkio@cox.net>
Fix clear_commit_marks() enough to be usable in
get_merge_bases(), and retire now unused clear_object_marks().
Signed-off-by: Junio C Hamano <junkio@cox.net>
Earlier change broke "git describe A B" among other things.
Revert it for now, and clean the commits smudged by
get_merge_bases() using the clear_object_marks() function. For a
complex commit ancestry graph, this is way cheaper as well.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Actually in this case we would have traversed a lot of commits, so cleaning
things up is even more important.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Change get_merge_bases() to be able to clean up after itself if
needed by adding a cleanup parameter.
We don't need to save the flags and restore them afterwards anymore;
that was a leftover from before the flags were moved out of the
range used in revision.c. clear_commit_marks() sets them to zero,
which is enough.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Don't care if objects have been parsed or not and don't stop when we
reach a commit that is already clean -- its parents could be dirty.
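Why stopping at a clean commit is wrong can be seen in a toy model (Python, not the C function; the `flags` dictionary, the `MARK` bit and the visited set are assumptions of this sketch, the last one added only so the walk visits shared history once):

```python
MARK = 0x1  # hypothetical flag bit left behind by an earlier traversal


def clear_commit_marks(parents, flags, tip, mark):
    """Clear mark on tip and on every ancestor, clean or not."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        flags[commit] = flags.get(commit, 0) & ~mark
        # Keep walking even if this commit was already clean:
        # its parents may still carry the mark.
        stack.extend(parents.get(commit, ()))


parents = {"c": ["b"], "b": ["a"], "a": []}
flags = {"c": 0, "b": 0, "a": MARK | 0x2}
clear_commit_marks(parents, flags, "c", MARK)
print(flags)
```

Here "c" and "b" start out clean, yet "a" is dirty; a walk that stopped at the first clean commit would leave "a" marked.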
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Add get_merge_bases_clean(), a wrapper for get_merge_bases() that cleans
up after doing its work and make get_merge_bases() NOT clean up.
Single-shot programs like git-merge-base can use the dirty and fast
version.
Also move the object flags used in get_merge_bases() out of the range
defined in revision.h. This fixes the "66ae0c77...ced9456a
89719209...262a6ef7" test of the ... operator which is introduced with
the next patch.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This creates a simple specialized object allocator for basic
objects.
This avoids wasting space with malloc overhead (metadata and
extra alignment), since the specialized allocator knows the
alignment, and that objects, once allocated, are never freed.
It also allows us to track some basic statistics about object
allocations. For example, for the mozilla import, it shows
object usage as follows:
blobs: 627629 (14710 kB)
trees: 1119035 (34969 kB)
commits: 196423 (8440 kB)
tags: 1336 (46 kB)
and the simpler allocator shaves about 2.5% off the memory
footprint of a "git-rev-list --all --objects", and is a bit
faster too.
[ Side note: this concludes the series of "save memory in object storage".
The thing is, there simply isn't much more to be saved on the objects.
Doing "git-rev-list --all --objects" on the mozilla archive has a final
total RSS of 131498 pages for me: that's about 513MB. Of that, the
object overhead is now just 56MB, the rest is going somewhere else (put
another way: the fact that this patch shaves off 2.5% of the total
memory overhead, considering that objects are now not much more than 10%
of the total shows how big the wasted space really was: this makes
object allocations much more memory- and time-efficient).
I haven't looked at where the rest is, but I suspect the bulk of it is
just the pack-file loading. It may be that we should pack the tree
objects separately from the blob objects: for git-rev-list --objects, we
don't actually ever need to even look at the blobs, but since trees and
blobs are interspersed in the pack-file, we end up not being dense in
the tree accesses, so we end up looking at more pages than we strictly
need to.
So with a 535MB pack-file, it's entirely possible - even likely - that
most of the remaining RSS is just the mmap of the pack-file itself. We
don't need to map in _all_ of it, but we do end up mapping a fair
amount. ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Every single user actually wanted this only for commit objects, and we
have no reason to waste space on it for other object types. So just move
the structure member from the low-level "struct object" into the "struct
commit".
This leaves the commit object the same size, and removes one unnecessary
pointer from all other object allocations.
This shrinks memory usage (still at a fairly hefty half-gig, admittedly)
of "git-rev-list --all --objects" on the mozilla repo by another 5% in my
tests.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This shrinks "struct object" by a small amount, by getting rid of the
"struct type *" pointer and replacing it with a 3-bit bitfield instead.
In addition, we merge the bitfields and the "flags" field, which
incidentally should also remove a useless 4-byte padding from the object
when in 64-bit mode.
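The packing idea, shown with plain integers in Python (the real change is a C bitfield; the layout below, with the type in the low 3 bits and the flags above it, and the names `pack` and `unpack` are assumptions made for illustration):

```python
TYPE_BITS = 3
TYPE_MASK = (1 << TYPE_BITS) - 1  # 0b111: room for up to eight type codes


def pack(obj_type, flags):
    """Store a small type code and the flag bits in one integer word."""
    if not 0 <= obj_type <= TYPE_MASK:
        raise ValueError("type code does not fit in 3 bits")
    return (flags << TYPE_BITS) | obj_type


def unpack(word):
    """Recover (type code, flags) from a packed word."""
    return word & TYPE_MASK, word >> TYPE_BITS


word = pack(3, 0b1010)
print(unpack(word))
```

One word now carries what used to take a pointer plus a separate flags field.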
Now, our "struct object" is still too damn large, but it's now less
obviously bloated, and of the remaining fields, only the "util" (which is
not used by most things) is clearly something that should be eventually
discarded.
This shrinks the "git-rev-list --all" memory use by about 2.5% on the
kernel archive (and, perhaps more importantly, on the larger mozilla
archive). That may not sound like much, but I suspect it's more on a
64-bit platform.
There are other remaining inefficiencies (the parent lists, for example,
probably have horrible malloc overhead), but this was pretty obvious.
Most of the patch is just changing the comparison of the "type" pointer
from one of the constant string pointers to the appropriate new TYPE_xxx
small integer constant.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When --attach is not used, usually we do not say Content-Type:
and fluff, but if the commit message is not 7-bit ASCII, mark
it as "text/plain; charset=UTF-8". This unclutters output
somewhat.
Signed-off-by: Junio C Hamano <junkio@cox.net>
By convention, the commit message and the author/committer names
in the commit objects are UTF-8 encoded. When formatting for
e-mails, Q-encode them according to RFC 2047.
While we are at it, generate the content-type and
content-transfer-encoding headers as well.
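For reference, the "Q" encoding of RFC 2047 looks roughly like this in Python. This is a simplified sketch, not the code in the patch: the helper name `q_encode` is invented, every byte that is not an ASCII letter or digit is escaped (stricter than the RFC requires), and the 75-character limit on encoded words is not handled.

```python
def q_encode(text, charset="UTF-8"):
    """Return text as an RFC 2047 Q-encoded word, or unchanged if pure ASCII."""
    raw = text.encode("utf-8")
    if all(byte < 0x80 for byte in raw):
        return text  # 7-bit clean: no encoding needed
    pieces = []
    for byte in raw:
        char = chr(byte)
        if byte < 0x80 and char.isalnum():
            pieces.append(char)
        elif char == " ":
            pieces.append("_")  # Q encoding writes a space as underscore
        else:
            pieces.append("=%02X" % byte)
    return "=?%s?q?%s?=" % (charset, "".join(pieces))


print(q_encode("J\u00fcrgen M\u00fcller"))
print(q_encode("Plain Name"))
```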
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch touches a couple of files, because it adds options to print a
custom text just after the subject of a commit, and just after the
diffstat.
[jc: made "many dashes" used as the boundary leader into a single
variable, to reduce the chance that a later tweak miscounts the
number of dashes and breaks it.]
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* master: (119 commits)
diff family: add --check option
Document that "git add" only adds non-ignored files.
Add a conversion tool to migrate remote information into the config
fetch, pull: ask config for remote information
Fix build procedure for builtin-init-db
read-tree -m -u: do not overwrite or remove untracked working tree files.
apply --cached: do not check newly added file in the working tree
Implement a --dry-run option to git-quiltimport
Implement git-quiltimport
Revert "builtin-grep: workaround for non GNU grep."
builtin-grep: workaround for non GNU grep.
builtin-grep: workaround for non GNU grep.
git-am: use apply --cached
apply --cached: apply a patch without using working tree.
apply --numstat: show new name, not old name.
Documentation/Makefile: create tarballs for the man pages and html files
Allow pickaxe and diff-filter options to be used by git log.
Libify the index refresh logic
Builtin git-init-db
Remove unnecessary local in get_ref_sha1.
...
The still work-in-progress git fmt-patch (should it be known as
format-patch-ng?) is matched with the fix made by Huw Davies
in commit 262a6ef76a to use the RFC2822 date format.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Updating "subject" variable without changing the hardcoded
number of bytes to memcpy from it would not help much.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This only does --stdout right now. To write into separate files
with pretty-printed filenames like the real thing does, it needs
a bit more work.
Signed-off-by: Junio C Hamano <junkio@cox.net>
In addition to the existing comment support, that just allows the user
to use a convention that works pretty much everywhere else.
Signed-off-by: Yann Dirson <ydirson@altern.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Partly because we've messed up and now have some commits with trailing
whitespace, but partly because this also just simplifies the code, let's
remove trailing whitespace from the end when pretty-printing commits.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds --date-order to rev-list; it is similar to topo order
in the sense that no parent comes before all of its children,
but otherwise things are still ordered in the commit timestamp
order.
The same flag is also added to show-branch.
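The ordering rule described above can be sketched as a topological sort with a date-keyed priority queue. This is a Python illustration under assumptions, not the rev-list implementation: `parents` must list every commit as a key, `dates` maps commits to timestamps, and the function name `date_order` is invented here.

```python
import heapq


def date_order(parents, dates):
    """Emit newest-first, but never a parent before all of its children."""
    # Count, for each commit, how many children have not been emitted yet.
    pending_children = {commit: 0 for commit in parents}
    for commit, commit_parents in parents.items():
        for parent in commit_parents:
            pending_children[parent] += 1

    # Start from commits with no children; heapq is a min-heap, so negate dates.
    heap = [(-dates[c], c) for c, n in pending_children.items() if n == 0]
    heapq.heapify(heap)

    ordered = []
    while heap:
        _, commit = heapq.heappop(heap)
        ordered.append(commit)
        for parent in parents[commit]:
            pending_children[parent] -= 1
            if pending_children[parent] == 0:
                heapq.heappush(heap, (-dates[parent], parent))
    return ordered


history = {
    "m": ["a2", "b2"],
    "a2": ["a1"], "a1": ["root"],
    "b2": ["b1"], "b1": ["root"],
    "root": [],
}
stamps = {"m": 60, "a2": 50, "b2": 40, "a1": 30, "b1": 20, "root": 10}
print(date_order(history, stamps))
```

The two side branches come out interleaved by timestamp, which is where this differs from a strict topo order that would keep each line of development together.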
Signed-off-by: Junio C Hamano <junkio@cox.net>
An earlier commit, f2d4227530, broke Merge: lines for the
unabbreviated case. Do not emit extra dots if we do not
abbreviate.
Signed-off-by: Junio C Hamano <junkio@cox.net>
When displaying Merge: lines, we used to take the real commit
parents from the commit objects. Use the parsed parents from
the commit object instead, so that we honor fake parent
information from info/grafts.
Signed-off-by: Junio C Hamano <junkio@cox.net>