Commit Graph

111 Commits

Author SHA1 Message Date
Junio C Hamano
4a164d48df Merge branch 'jc/merge-base' (early part)
This contains an evil merge to fast-import, in order to
resolve in_merge_bases() update.
2007-02-13 16:54:35 -08:00
Eric Wong
b6936205e7 Disallow invalid --pretty= abbreviations
--pretty=o is a valid abbreviation, --pretty=omfg is not

Noticed by: Nicolas Vilz
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-02 21:18:59 -08:00
Junio C Hamano
f7e68b2967 Use log output encoding in --pretty=email headers.
Private functions add_rfc2047() and pretty_print_commit() assumed
they are only emitting UTF-8.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 10:10:20 -08:00
Junio C Hamano
03840fc32d Allow in_merge_bases() to take more than one reference commits.
The internal function in_merge_bases(A, B) is used to make sure
that commit A is an ancestor of commit B.  This changes the
signature of it to take an array of B's and updates its current
callers.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-09 17:57:03 -08:00
Junio C Hamano
1295228d1f merge-base: do not leak commit list
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-08 23:10:49 -08:00
Andy Whitcroft
93822c2239 short i/o: fix calls to write to use xwrite or write_in_full
We have a number of badly checked write() calls.  Often we are
expecting write() to write exactly the size we requested or fail,
this fails to handle interrupts or short writes.  Switch to using
the new write_in_full().  Otherwise we at a minimum need to check
for EINTR and EAGAIN, where this is appropriate use xwrite().

Note, the changes to config handling are much larger and handled
in the next patch in the sequence.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-08 15:44:47 -08:00
Lars Hjemli
f3a47405bb Skip excessive blank lines before commit body
This modifies pretty_print_commit() to make the output of git-rev-list and
friends a bit more predictable.

A commit body starting with blank lines might be unheard-of, but still possible
to create using git-commit-tree (so is bound to appear somewhere, sometime).

Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-03 08:29:01 -08:00
Junio C Hamano
e90068a904 i18n: do not leak 'encoding' header even when we cheat the conversion.
We special case the case where encoding recorded in the commit
and the output encoding are the same and do not call iconv().
But we should drop 'encoding' header for this case as well for
consistency.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-31 18:55:28 -08:00
Junio C Hamano
53af9816bc i18n: drop "encoding" header in the output after re-coding.
After re-coding the commit message into the encoding the user
specified (either with core.logoutputencidng or --encoding
option), this drops the "encoding" header altogether.  The
output is after re-coding as the user asked (either with the
config or --encoding=<encoding> option), and the extra header
becomes redundant information.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-30 16:35:57 -08:00
Junio C Hamano
4b46e22d48 commit re-encoding: fix confusion between no and default conversion.
Telling the git-log family not to do any character conversion is
done with --encoding=none, which sets log_output_encoding to an
empty string.  However, logmsg_reencode() confused this with
log_output_encoding and commit_encoding set to NULL.  The latter
means we should use the default encoding (i.e. utf-8).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-30 02:18:24 -08:00
Junio C Hamano
eff73751bb Merge branch 'jc/utf8'
* jc/utf8:
  t3900: test conversion to non UTF-8 as well
  Rename t3900 test vector file
  UTF-8: introduce i18n.logoutputencoding.
  Teach log family --encoding
  i18n.logToUTF8: convert commit log message to UTF-8
  Move encoding conversion routine out of mailinfo to utf8.c

Conflicts:

	commit.c
2006-12-28 19:03:02 -08:00
Junio C Hamano
d2c11a38c4 UTF-8: introduce i18n.logoutputencoding.
It is plausible for somebody to want to view the commit log in a
different encoding from i18n.commitencoding -- the project's
policy may be UTF-8 and the user may be using a commit message
hook to run iconv to conform to that policy (and either not have
i18n.commitencoding to default to UTF-8 or have it explicitly
set to UTF-8).  Even then, Latin-1 may be more convenient for
the usual pager and the terminal the user uses.

The new variable i18n.logoutputencoding is used in preference to
i18n.commitencoding to decide what encoding to recode the log
output in when git-log and friends formats the commit log message.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-27 16:41:33 -08:00
Junio C Hamano
37818d7db0 Merge branch 'master' into js/shallow
This is to adjust to:

  count-objects -v: show number of packs as well.

which will break a test in this series.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-27 02:43:46 -08:00
Junio C Hamano
52883fbd76 Teach log family --encoding
Updated commit objects record the encoding used in their
encoding header.  This updates the log family to reencode it
into the encoding specified in i18n.commitencoding (or the
default, which is "utf-8") upon output.

To force a specific encoding that is different, log family takes
command line flag --encoding=<encoding>; giving --encoding=none
entirely disables the reencoding and lets you view log messges
in their original encoding.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-26 00:52:13 -08:00
Junio C Hamano
2ecd2bbcbe Move in_merge_bases() to commit.c
This reasonably useful function was hidden inside builtin-branch.c
2006-12-20 17:22:10 -08:00
Junio C Hamano
577ed5c20b rev-list --left-right
The output from "symmetric diff", i.e. A...B, does not
distinguish between commits that are reachable from A and the
ones that are reachable from B.  In this picture, such a
symmetric diff includes commits marked with a and b.

         x---b---b  branch B
        / \ /
       /   .
      /   / \
     o---x---a---a  branch A

However, you cannot tell which ones are 'a' and which ones are
'b' from the output.  Sometimes this is frustrating.  This adds
an output option, --left-right, to rev-list.

        rev-list --left-right A...B

would show ones reachable from A prefixed with '<' and the ones
reachable from B prefixed with '>'.

When combined with --boundary, boundary commits (the ones marked
with 'x' in the above picture) are shown with prefix '-', so you
would see list that looks like this:

    git rev-list --left-right --boundary --pretty=oneline A...B

    >bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb 3rd on b
    >bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb 2nd on b
    <aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 3rd on a
    <aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 2nd on a
    -xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 1st on b
    -xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 1st on a

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-17 10:35:28 -08:00
Johannes Schindelin
f53514bc2d allow deepening of a shallow repository
Now, by saying "git fetch -depth <n> <repo>" you can deepen
a shallow repository.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-24 15:42:49 -08:00
Johannes Schindelin
ed09aef06f support fetching into a shallow repository
A shallow commit is a commit which has parents, which in turn are
"grafted away", i.e. the commit appears as if it were a root.

Since these shallow commits should not be edited by the user, but
only by core git, they are recorded in the file $GIT_DIR/shallow.

A repository containing shallow commits is called shallow.

The advantage of a shallow repository is that even if the upstream
contains lots of history, your local (shallow) repository needs not
occupy much disk space.

The disadvantage is that you might miss a merge base when pulling
some remote branch.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-24 15:42:49 -08:00
Eric Wong
14763e7bda commit: fix a segfault when displaying a commit with unreachable parents
I was running git show on various commits found by fsck-objects
when I found this bug.  Since find_unique_abbrev() cannot find
an abbreviation for an object not in the database, it will
return NULL, which is bad to run strlen() on.  So instead, we'll
just display the unabbreviated sha1 that we referenced in the
commit.

I'm not sure that this is the best 'fix' for it because the
commit I was trying to show was broken, but I don't think a
program should segfault even if the user tries to do something
stupid.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-11 18:10:38 -07:00
Jonas Fonseca
3dfb9278df Add --relative-date option to the revision interface
Exposes the infrastructure from 9a8e35e987.

Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-28 16:20:33 -07:00
Linus Torvalds
9a8e35e987 Relative timestamps in git log
I noticed that I was looking at the kernel gitweb output at some point
rather than just do "git log", simply because I liked seeing the
simplified date-format, ie the "5 days ago" rather than a full date.

This adds infrastructure to do that for "git log" too. It does NOT add the
actual flag to enable it, though, so right now this patch is a no-op, but
it should now be easy to add a command line flag (and possibly a config
file option) to just turn on the "relative" date format.

The exact cut-off points when it switches from days to weeks etc are
totally arbitrary, but are picked somewhat to avoid the "1 weeks ago"
thing (by making it show "10 days ago" rather than "1 week", or "70
minutes ago" rather than "1 hour ago").

[jc: with minor fix and tweak around "month" and "week" area.]

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-26 19:12:03 -07:00
Tilman Sauerbeck
55c3eb434a Indentation fix.
Signed-off-by: Tilman Sauerbeck <tilman@code-monkey.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-17 14:24:54 -07:00
David Rientjes
a89fccd281 Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length.
Introduces global inline:

	hashcmp(const unsigned char *sha1, const unsigned char *sha2)

Uses memcmp for comparison and returns the result based on the length of
the hash name (a future runtime decision).

Acked-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-17 14:23:53 -07:00
David Rientjes
96f1e58f52 remove unnecessary initializations
[jc: I needed to hand merge the changes to the updated codebase,
 so the result needs to be checked.]

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-15 21:22:20 -07:00
Junio C Hamano
5cd060b56c Merge branch 'lt/unitype'
* lt/unitype:
  builtin-prune.c: forgot TYPE => OBJ changes.
  Remove TYPE_* constant macros and use object_type enums consistently.
2006-07-14 15:39:19 -07:00
Robert Shearman
19b3bd3e2d format-patch: Generate a newline between the subject header and the message body
format-patch previously didn't generate a newline after a subject. This
caused the diffstat to not be displayed in messages with only one line
for the commit message.
This patch fixes this by adding a newline after the headers if a body
hasn't been added.
Signed-off-by: Robert Shearman <rob@codeweavers.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-13 21:40:43 -07:00
Linus Torvalds
1974632c66 Remove TYPE_* constant macros and use object_type enums consistently.
This updates the type-enumeration constants introduced to reduce
the memory footprint of "struct object" to match the type bits
already used in the packfile format, by removing the former
(i.e. TYPE_* constant macros) and using the latter (i.e. enum
object_type) throughout the code for consistency.

Eventually we can stop passing around the "type strings"
entirely, and this will help - no confusion about two different
integer enumeration.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-12 23:18:03 -07:00
Junio C Hamano
f324943816 merge-base: update the clean-up postprocessing
This removes the "contaminate the well even more" approach
taken in the current merge-base postprosessing code.  Instead,
when there are more than one merge-base results, we compute the
merge-base between them and see if one is a fast-forward of the
other, in which case the ancestor is removed from the result.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-09 03:38:12 -07:00
Junio C Hamano
58ecf5c1cd Re-fix clear_commit_marks().
Fix clear_commit_marks() enough to be usable in
get_merge_bases(), and retire now unused clear_object_marks().

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-04 17:45:22 -07:00
Junio C Hamano
160b798303 revert clear-commit-marks for now.
Earlier change broke "git describe A B" among other things.
Revert it for now, and clean the commits smudged by
get_merge_bases using clear_object_marks() function.  For
complex commit ancestry graph, this is way cheaper as well.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-03 03:05:20 -07:00
Junio C Hamano
2ef108013e get_merge_bases: clean up even when there is no common commit.
Actually in this case we would have traversed a lot of commits, so cleaning
things up is even more important.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-03 03:02:27 -07:00
Junio C Hamano
542ccefe89 commit.c: do not redefine UNINTERESTING bit.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-02 11:34:17 -07:00
Rene Scharfe
c0fa8255c6 Fold get_merge_bases_clean() into get_merge_bases()
Change get_merge_bases() to be able to clean up after itself if
needed by adding a cleanup parameter.

We don't need to save the flags and restore them afterwards anymore;
that was a leftover from before the flags were moved out of the
range used in revision.c.  clear_commit_marks() sets them to zero,
which is enough.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-02 10:58:25 -07:00
Rene Scharfe
31aea7ef77 Make clear_commit_marks() clean harder
Don't care if objects have been parsed or not and don't stop when we
reach a commit that is already clean -- its parents could be dirty.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-01 18:14:03 -07:00
Rene Scharfe
31609c1725 Add get_merge_bases_clean()
Add get_merge_bases_clean(), a wrapper for get_merge_bases() that cleans
up after doing its work and make get_merge_bases() NOT clean up.
Single-shot programs like git-merge-base can use the dirty and fast
version.

Also move the object flags used in get_merge_bases() out of the range
defined in revision.h.  This fixes the "66ae0c77...ced9456a
89719209...262a6ef7" test of the ... operator which is introduced with
the next patch.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-01 18:13:25 -07:00
Johannes Schindelin
7c6f8aaf6d move get_merge_bases() to core lib.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-29 13:50:46 -07:00
Junio C Hamano
3b44f15a35 connect.c: check the commit buffer boundary while parsing.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-28 03:51:00 -07:00
Timo Hirvonen
554fe20d80 Make some strings const
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-28 03:24:37 -07:00
Junio C Hamano
0da4677149 fix rfc2047 formatter.
Running git-format-patch on patches from Lukas destroyed
the From: line.  This fixes it.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-19 18:43:30 -07:00
Linus Torvalds
855419f764 Add specialized object allocator
This creates a simple specialized object allocator for basic
objects.

This avoids wasting space with malloc overhead (metadata and
extra alignment), since the specialized allocator knows the
alignment, and that objects, once allocated, are never freed.

It also allows us to track some basic statistics about object
allocations. For example, for the mozilla import, it shows
object usage as follows:

     blobs:   627629 (14710 kB)
     trees:  1119035 (34969 kB)
   commits:   196423  (8440 kB)
      tags:     1336    (46 kB)

and the simpler allocator shaves off about 2.5% off the memory
footprint off a "git-rev-list --all --objects", and is a bit
faster too.

[ Side note: this concludes the series of "save memory in object storage".
  The thing is, there simply isn't much more to be saved on the objects.

  Doing "git-rev-list --all --objects" on the mozilla archive has a final
  total RSS of 131498 pages for me: that's about 513MB. Of that, the
  object overhead is now just 56MB, the rest is going somewhere else (put
  another way: the fact that this patch shaves off 2.5% of the total
  memory overhead, considering that objects are now not much more than 10%
  of the total shows how big the wasted space really was: this makes
  object allocations much more memory- and time-efficient).

  I haven't looked at where the rest is, but I suspect the bulk of it is
  just the pack-file loading. It may be that we should pack the tree
  objects separately from the blob objects: for git-rev-list --objects, we
  don't actually ever need to even look at the blobs, but since trees and
  blobs are interspersed in the pack-file, we end up not being dense in
  the tree accesses, so we end up looking at more pages than we strictly
  need to.

  So with a 535MB pack-file, it's entirely possible - even likely - that
  most of the remaining RSS is just the mmap of the pack-file itself. We
  don't need to map in _all_ of it, but we do end up mapping a fair
  amount. ]

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-19 18:42:21 -07:00
Linus Torvalds
d3ff6f5501 Move "void *util" from "struct object" into "struct commit"
Every single user actually wanted this only for commit objects, and we
have no reason to waste space on it for other object types. So just move
the structure member from the low-level "struct object" into the "struct
commit".

This leaves the commit object the same size, and removes one unnecessary
pointer from all other object allocations.

This shrinks memory usage (still at a fairly hefty half-gig, admittedly)
of "git-rev-list --all --objects" on the mozilla repo by another 5% in my
tests.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-17 18:49:45 -07:00
Linus Torvalds
885a86abe2 Shrink "struct object" a bit
This shrinks "struct object" by a small amount, by getting rid of the
"struct type *" pointer and replacing it with a 3-bit bitfield instead.

In addition, we merge the bitfields and the "flags" field, which
incidentally should also remove a useless 4-byte padding from the object
when in 64-bit mode.

Now, our "struct object" is still too damn large, but it's now less
obviously bloated, and of the remaining fields, only the "util" (which is
not used by most things) is clearly something that should be eventually
discarded.

This shrinks the "git-rev-list --all" memory use by about 2.5% on the
kernel archive (and, perhaps more importantly, on the larger mozilla
archive). That may not sound like much, but I suspect it's more on a
64-bit platform.

There are other remaining inefficiencies (the parent lists, for example,
probably have horrible malloc overhead), but this was pretty obvious.

Most of the patch is just changing the comparison of the "type" pointer
from one of the constant string pointers to the appropriate new TYPE_xxx
small integer constant.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-17 18:49:18 -07:00
Junio C Hamano
c831da6647 builtin format-patch: squelch content-type for 7-bit ASCII
When --attach is not used, usually we do not say Content-Type:
and fluff, but if the commit message is not 7-bit ASCII, mark
it as "text/plain; charset=UTF-8".  This unclutters output
somewhat.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-21 23:55:00 -07:00
Junio C Hamano
cdd406e389 CMIT_FMT_EMAIL: Q-encode Subject: and display-name part of From: fields.
By convention, the commit message and the author/committer names
in the commit objects are UTF-8 encoded.  When formatting for
e-mails, Q-encode them according to RFC 2047.

While we are at it, generate the content-type and
content-transfer-encoding headers as well.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-21 21:39:20 -07:00
Johannes Schindelin
698ce6f87e fmt-patch: Support --attach
This patch touches a couple of files, because it adds options to print a
custom text just after the subject of a commit, and just after the
diffstat.

[jc: made "many dashes" used as the boundary leader into a single
 variable, to reduce the possibility of later tweaks to miscount the
 number of dashes to break it.]

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-21 02:03:09 -07:00
Junio C Hamano
328b710d80 Merge branch 'master' into js/fmt-patch
* master: (119 commits)
  diff family: add --check option
  Document that "git add" only adds non-ignored files.
  Add a conversion tool to migrate remote information into the config
  fetch, pull: ask config for remote information
  Fix build procedure for builtin-init-db
  read-tree -m -u: do not overwrite or remove untracked working tree files.
  apply --cached: do not check newly added file in the working tree
  Implement a --dry-run option to git-quiltimport
  Implement git-quiltimport
  Revert "builtin-grep: workaround for non GNU grep."
  builtin-grep: workaround for non GNU grep.
  builtin-grep: workaround for non GNU grep.
  git-am: use apply --cached
  apply --cached: apply a patch without using working tree.
  apply --numstat: show new name, not old name.
  Documentation/Makefile: create tarballs for the man pages and html files
  Allow pickaxe and diff-filter options to be used by git log.
  Libify the index refresh logic
  Builtin git-init-db
  Remove unnecessary local in get_ref_sha1.
  ...
2006-05-21 01:34:54 -07:00
Eric Wong
6cdfd17974 commit: allow --pretty= args to be abbreviated
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-17 02:57:57 -07:00
Johannes Schindelin
596524b33d Teach fmt-patch about --numbered
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-05 14:11:57 -07:00
Junio C Hamano
2a38704323 Use RFC2822 dates from "git fmt-patch".
Still Work-in-progress git fmt-patch (should it be known as
format-patch-ng?) is matched with the fix made by Huw Davies
in 262a6ef76a commit to use
RFC2822 date format.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-01 01:44:33 -07:00
Junio C Hamano
53f420ef00 git-fmt-patch: thinkofix to show [PATCH] properly.
Updating "subject" variable without changing the hardcoded
number of bytes to memcpy from it would not help much.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-22 03:06:13 -07:00