Commit Graph

34 Commits

Author SHA1 Message Date
Jonathan Tan
6b4b013f18 mailinfo: handle in-body header continuations
Mailinfo currently handles multi-line headers, but it does not handle
multi-line in-body headers. Teach it to handle such headers, for
example, for this input:

  From: author <author@example.com>
  Date: Fri, 9 Jun 2006 00:44:16 -0700
  Subject: a very long
   broken line

  Subject: another very long
   broken line

interpret the in-body subject to be "another very long broken line"
instead of "another very long".

An existing test (t/t5100/msg0015) has an indented line immediately
after an in-body header - it has been modified to reflect the new
functionality.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-21 10:23:11 -07:00
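
The continuation rule described in 6b4b013f18 can be sketched roughly as
below. This is an illustration only, not the code from the commit: the
function names, the fixed-size buffer and the return convention are all
assumptions made for the example. A line that starts with a space or tab
is treated as a continuation and joined to the value collected so far
with a single space.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A continuation line starts with a space or a tab. */
static int is_continuation(const char *line)
{
	return line[0] == ' ' || line[0] == '\t';
}

/*
 * Hypothetical helper: append a continuation line to the header value
 * already stored in "value" (capacity "cap" bytes).
 * Returns 1 if the line was appended, 0 if it is not a continuation,
 * and -1 if the result would not fit.
 */
int append_continuation(char *value, size_t cap, const char *line)
{
	size_t len, add;

	assert(value != NULL && line != NULL);
	if (!is_continuation(line))
		return 0;

	/* Drop the leading indentation of the continuation line. */
	while (*line == ' ' || *line == '\t')
		line++;

	/* Ignore a trailing line terminator, if any. */
	add = strlen(line);
	while (add > 0 && (line[add - 1] == '\n' || line[add - 1] == '\r'))
		add--;

	len = strlen(value);
	if (len + 1 + add + 1 > cap)
		return -1;

	value[len] = ' ';
	memcpy(value + len + 1, line, add);
	value[len + 1 + add] = '\0';
	return 1;
}
```

With "another very long" already collected as the in-body subject,
feeding the line " broken line" to this helper yields "another very
long broken line", which matches the behaviour the commit message asks
for.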
Jonathan Tan
9c5681da88 mailinfo: make is_scissors_line take plain char *
The is_scissors_line takes a struct strbuf * when a char * would
suffice. Make it take char *.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-19 14:40:36 -07:00
Jonathan Tan
334192b411 mailinfo: separate in-body header processing
The check_header function contains logic specific to in-body headers,
although it is invoked during both the processing of actual headers and
in-body headers. Separate out the in-body header part into its own
function.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-19 14:40:32 -07:00
Junio C Hamano
4a78871152 Merge branch 'rs/mailinfo-lib'
Small code clean-up.

* rs/mailinfo-lib:
  mailinfo: recycle strbuf in check_header()
2016-08-17 14:07:47 -07:00
René Scharfe
ecf30b237c mailinfo: recycle strbuf in check_header()
handle_message_id() duplicates the contents of the strbuf that is passed
to it.  Its only caller proceeds to release the strbuf immediately after
that.  Reuse it instead and make that change of object ownership more
obvious by inlining this short function.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-13 19:45:24 -07:00
Junio C Hamano
8f309aeb82 strbuf: introduce strbuf_getline_{lf,nul}()
The strbuf_getline() interface allows a byte other than LF or NUL as
the line terminator, but this is only because I wrote these
codepaths anticipating that there might be a value other than NUL
and LF that could be useful when I introduced line_termination a long
time ago.  No useful caller that uses another value has emerged.

By now, it is clear that the interface is overly broad without a
good reason.  Many codepaths have hardcoded preference to read
either LF terminated or NUL terminated records from their input, and
then call strbuf_getline() with LF or NUL as the third parameter.

This step introduces two thin wrappers around strbuf_getline(),
namely, strbuf_getline_lf() and strbuf_getline_nul(), and
mechanically rewrites these call sites to call either one of
them.  The changes contained in this patch are:

 * introduction of these two functions in strbuf.[ch]

 * mechanical conversion of all callers to strbuf_getline() with
   either '\n' or '\0' as the third parameter to instead call the
   respective thin wrapper.

After this step, output from "git grep 'strbuf_getline('" would
become a lot smaller.  An interim goal of this series is to make
this an empty set, so that we can have strbuf_getline_crlf() take
over the shorter name strbuf_getline().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-15 10:12:51 -08:00
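
The "thin wrapper" idea in 8f309aeb82 can be sketched in plain C as
follows. This is not git's strbuf API: demo_getline() and its two
wrappers are hypothetical names, and the example reads into a fixed
buffer (silently truncating overlong records) instead of a growing
strbuf. It only shows how two narrow entry points can sit on top of one
general function that takes the terminator as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * General reader: read one record terminated by "term" into buf.
 * Returns 0 on success and EOF when no more records are available.
 * Records longer than cap - 1 bytes are truncated in this sketch.
 */
static int demo_getline(char *buf, size_t cap, FILE *fp, int term)
{
	size_t n = 0;
	int c;

	assert(buf != NULL && fp != NULL && cap > 0);
	while ((c = fgetc(fp)) != EOF) {
		if (c == term) {
			buf[n] = '\0';
			return 0;
		}
		if (n + 1 < cap)
			buf[n++] = (char)c;
	}
	buf[n] = '\0';
	/* A final record without a terminator still counts. */
	return n ? 0 : EOF;
}

/* Thin wrapper for LF-terminated records. */
int demo_getline_lf(char *buf, size_t cap, FILE *fp)
{
	return demo_getline(buf, cap, fp, '\n');
}

/* Thin wrapper for NUL-terminated records. */
int demo_getline_nul(char *buf, size_t cap, FILE *fp)
{
	return demo_getline(buf, cap, fp, '\0');
}
```

Callers then state their intent in the function name rather than in a
third argument, which is what makes the remaining users of the general
form easy to find with a grep.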
Nguyễn Thái Ngọc Duy
85d9d9ddf3 mailinfo: fix passing wrong address to git_mailinfo_config
git_mailinfo_config() expects "struct mailinfo *". But in
setup_mailinfo(), "mi" is already "struct mailinfo *". &mi would make
it "struct mailinfo **" and git_mailinfo_config() would damage some
other memory when it assigns some value to mi->use_scissors.

This is caught by t4150.20. git_mailinfo_config() breaks
mi->name.alloc and makes strbuf_release() in clear_mailinfo() attempt
to free strbuf_slopbuf.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-01 10:29:40 -08:00
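
The class of bug fixed in 85d9d9ddf3 is easy to reproduce in miniature.
The sketch below assumes the config callback receives its context
through a void * parameter, which would explain why the compiler did not
reject the extra "&". The struct and function names are hypothetical
stand-ins; only the use_scissors field is taken from the commit message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct mailinfo; only one field is modelled. */
struct mailinfo_demo {
	int use_scissors;
};

/* Config callback: "data" is expected to point at a struct mailinfo_demo. */
static int demo_config(const char *key, const char *value, void *data)
{
	struct mailinfo_demo *mi = data;

	if (!strcmp(key, "mailinfo.scissors"))
		mi->use_scissors = !strcmp(value, "true");
	return 0;
}

void setup_demo(struct mailinfo_demo *mi)
{
	assert(mi != NULL);
	mi->use_scissors = 0;
	/*
	 * "mi" is already a pointer, so it is passed as-is.  Passing "&mi"
	 * would still compile because the parameter is void *, but the
	 * callback would then write into the pointer variable itself
	 * instead of the structure it points to.
	 */
	demo_config("mailinfo.scissors", "true", mi);
}
```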
Junio C Hamano
6ac617a321 mailinfo: remove calls to exit() and die() deep in the callchain
The top-level mailinfo() would instead punt when the code in the
deeper part of the callchain detects an unrecoverable error in the
input.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:59:34 -07:00
Junio C Hamano
669b963af2 mailinfo: handle charset conversion errors in the caller
Instead of dying in convert_to_utf8(), just report an error and let
the callers handle it.  Between the two callers:

 - decode_header() silently punts when it cannot parse a broken
   RFC2047 encoded text (e.g. when it sees anything other than B or
   Q after it sees "=?<charset>") by jumping to release_return,
   returning the string it successfully parsed out so far, to the
   caller.  A piece of string that convert_to_utf8() cannot handle
   can be treated the same way.

 - handle_commit_msg() doesn't cope with a malformed line well, so
   die there for now.  We'll lift this even higher in later changes
   in this series.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:59:34 -07:00
Junio C Hamano
c6905e45f0 mailinfo: libify
Move the bulk of the code from builtin/mailinfo.c to mailinfo.c
so that new callers can start calling mailinfo() directly.

Note that a few calls to exit() and die() need to be cleaned up
for the API to be truly useful, which will come in later steps.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:59:34 -07:00
Lukas Sandström
34488e3c37 Make git-mailinfo a builtin
[jc: with a bit of constness tightening]

Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-18 22:10:28 -07:00
Junio C Hamano
ae448e3854 mailinfo: ignore blanks after in-body headers.
[jc: this is based on Eric's patch but also fixes up the parsed
 subject headers].

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-17 17:05:36 -07:00
Eric W. Biederman
2662dbfa58 Don't parse any headers in the real body of an email message.
It was pointed out that the current behaviour might mispart a patch comment
so remove this behaviour for now.

[jc: this fixes "From: line in the middle" check in t5100 test.]

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-17 16:27:12 -07:00
Junio C Hamano
d177e58425 Merge branch 'jc/mailinfo'
* jc/mailinfo:
  mailinfo: skip bogus UNIX From line inside body
2006-05-28 13:39:05 -07:00
Junio C Hamano
ef29c11702 mailinfo: More carefully parse header lines in read_one_header_line()
We exited prematurely from header parsing loop when the header
field did not have a space after the colon but we insisted on
it, and we got the check wrong because we forgot that we strip
the trailing whitespace before we do the check.

The space after the colon is not even required by RFC2822, so
stop requiring it.  While we are at it, the header line is
specified to be more strict than "anything with a colon in it"
(there must be one or more characters before the colon, and they
must not be controls, SP or non US-ASCII), so implement that
check as well, lest we mistakenly think something like:

	Bogus not a header line: this is not.

as a header line.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-26 00:49:36 -07:00
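
The stricter header check described in ef29c11702 can be sketched as
below. The function name is hypothetical and this is not the code from
the commit; it only encodes the rule stated in the message: at least one
character before the colon, none of them a control character, SP or
non-US-ASCII, and no requirement for a space after the colon.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if "line" begins with a plausible header field name
 * followed by a colon, 0 otherwise.
 */
int looks_like_header(const char *line)
{
	const char *p;

	assert(line != NULL);
	for (p = line; *p; p++) {
		unsigned char c = (unsigned char)*p;

		if (c == ':')
			return p > line;	/* need at least one name character */
		if (c <= 32 || c >= 127)
			return 0;		/* control, SP, DEL or non-ASCII */
	}
	return 0;				/* no colon at all */
}
```

Under this rule "Subject:hello" is accepted even without the space,
while "Bogus not a header line: this is not." is rejected because of
the space before the colon.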
Eric W. Biederman
2dec02b1ec Allow in body headers beyond the in body header prefix.
- handle_from is fixed to not mangle its input line.

- Then handle_inbody_header is allowed to look in
  the body of a commit message for additional headers
  that we haven't already seen.

This allows patches with all of the right information in
unfortunate places to be imported.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-23 14:11:03 -07:00
Eric W. Biederman
f30b20282b More accurately detect header lines in read_one_header_line
Only count lines of the form '^.*: ' and '^From ' as email
header lines.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-23 14:08:32 -07:00
Eric W. Biederman
1f36bee67e In handle_body only read a line if we don't already have one.
This prepares for detecting non-email patches that don't have
mail headers.  In which case we have already read the first
line so handle_body should not ignore it.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-23 14:07:42 -07:00
Eric W. Biederman
8b4525fb3c Refactor commit message handling.
- Move handle_info into main so it is called once
  after everything has been parsed.  This allows the removal
  of a static variable and removes two duplicate calls.

- Move parsing of inbody headers into handle_commit.
  This means we parse the in-body headers after we have decoded
  the character set, and it removes code duplication between
  handle_multipart_one_part and handle_body.

- Change the flag indicating that we have seen an in body
  prefix header into another bit in seen.
  This is a little more general and allows the possibility of parsing
  in body headers after the body message has begun.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-23 14:04:47 -07:00
Eric W. Biederman
3350453014 Move B and Q decoding into check header.
B and Q decoding is not appropriate for in body headers, so move
it up to where we explicitly know we have a real email header.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-23 14:01:59 -07:00
Eric W. Biederman
f8128cfb8d Make read_one_header_line return a flag not a length.
Currently we only use the return value from read_one_header_line
to tell if the line we have read is a header or not.  So make
it a flag.  This paves the way for better email detection.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-23 14:00:15 -07:00
Junio C Hamano
81c5cf7865 mailinfo: skip bogus UNIX From line inside body
Sometimes people just include the whole format-patch output in
the commit e-mail.  Detect it and skip the bogus ">From " line.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-21 17:37:46 -07:00
Junio C Hamano
757319309a mailinfo: decode underscore used in "Q" encoding properly.
Quoted-Printable (RFC 2045) and the "Q" encoding (RFC 2047) are
subtly different; the latter is used on the mail header and an
underscore needs to be decoded to 0x20.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-21 00:09:28 -07:00
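
The difference called out in 757319309a can be shown with a small
decoder. This is a simplified sketch with hypothetical names, not the
mailinfo code: it handles "=XX" escapes, copies malformed escapes
literally, and does not deal with quoted-printable soft line breaks. The
is_header flag selects the RFC 2047 "Q" behaviour, where an underscore
stands for 0x20.

```c
#include <assert.h>
#include <stddef.h>

/* Value of a hexadecimal digit, or -1 if "c" is not one. */
static int hexval(int c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/*
 * Decode "in" into "out" (which must hold at least strlen(in) + 1 bytes).
 * With is_header set, '_' is decoded to a space as in the "Q" encoding;
 * otherwise it is copied unchanged as in quoted-printable.
 * Returns the number of bytes written, not counting the final NUL.
 */
size_t decode_qp(const char *in, char *out, int is_header)
{
	size_t n = 0;

	assert(in != NULL && out != NULL);
	while (*in) {
		unsigned char c = (unsigned char)*in++;

		if (c == '=') {
			int hi = hexval((unsigned char)in[0]);
			int lo = hi < 0 ? -1 : hexval((unsigned char)in[1]);

			if (hi >= 0 && lo >= 0) {
				out[n++] = (char)((hi << 4) | lo);
				in += 2;
				continue;
			}
			/* Malformed escape: fall through and copy '=' as-is. */
		} else if (c == '_' && is_header) {
			c = 0x20;
		}
		out[n++] = (char)c;
	}
	out[n] = '\0';
	return n;
}
```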
Fernando J. Pereda
b6e56eca8a Allow building Git in systems without iconv
Systems using some uClibc versions do not properly support
iconv stuff. This patch allows Git to be built on those
systems by passing NO_ICONV=YesPlease to make. The only
drawback is mailinfo won't do charset conversion in those
systems.

Signed-off-by: Fernando J. Pereda <ferdy@gentoo.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-16 01:42:58 -08:00
Junio C Hamano
8bc5c04a71 [PATCH] mailinfo: reset CTE after each multipart
If the first part uses quoted-printable to protect iso8859-1
name in the commit log, and the second part was plain ascii text
patchfile without even Content-Transfer-Encoding subheader, we
incorrectly tried to decode the patch as quoted printable.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-06 21:37:53 -08:00
Junio C Hamano
ac44f3e7c0 mailinfo: iconv does not like "latin-1" -- should spell it "latin1"
This was a stupid typo that did not follow

	http://www.iana.org/assignments/character-sets

Long noticed but neglected by JC, but finally reported by
Marco.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-23 23:56:52 -08:00
Junio C Hamano
e0e3ba208d mailinfo and git-am: allow "John Doe <johndoe>"
An isolated developer could have a local-only e-mail, which will
be stripped out by mailinfo because it lacks '@'.  Define a
fallback parser to accommodate that.

At the same time, reject authorless patch in git-am.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-14 16:31:06 -08:00
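
A fallback of the kind e0e3ba208d describes might look roughly like the
sketch below. The function names and buffer interface are assumptions
for illustration, not the parser from the commit: it takes whatever sits
between '<' and '>' as the address without requiring an '@', and the
text before '<' as the name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy [start, end) into dst with surrounding blanks trimmed; -1 if too long. */
static int copy_trimmed(char *dst, size_t cap, const char *start, const char *end)
{
	size_t len;

	while (start < end && (*start == ' ' || *start == '\t'))
		start++;
	while (end > start && (end[-1] == ' ' || end[-1] == '\t'))
		end--;
	len = (size_t)(end - start);
	if (len + 1 > cap)
		return -1;
	memcpy(dst, start, len);
	dst[len] = '\0';
	return 0;
}

/*
 * Split "Name <address>" into its two parts.  The address does not
 * need to contain '@'.  Returns 0 on success, -1 if the brackets are
 * missing, the address is empty, or a part does not fit its buffer.
 */
int parse_ident(const char *line, char *name, size_t ncap,
		char *email, size_t ecap)
{
	const char *lt, *gt;

	assert(line != NULL && name != NULL && email != NULL);
	lt = strchr(line, '<');
	gt = lt ? strchr(lt, '>') : NULL;
	if (!lt || !gt)
		return -1;
	if (copy_trimmed(name, ncap, line, lt) < 0 ||
	    copy_trimmed(email, ecap, lt + 1, gt) < 0)
		return -1;
	return email[0] ? 0 : -1;
}
```

For the input "John Doe <johndoe>" this yields the name "John Doe" and
the address "johndoe".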
Jason Riedy
a6da9395a5 [PATCH] Initial AIX portability fixes.
Added an AIX clause in the Makefile; that clause likely
will be wrong for any AIX pre-5.2, but I can only test
on 5.3.  mailinfo.c was missing the compat header file,
and convert-objects.c needs to define a specific
_XOPEN_SOURCE as well as _XOPEN_SOURCE_EXTENDED.

Signed-off-by: E. Jason Riedy <ejr@cs.berkeley.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-06 16:15:55 -08:00
Junio C Hamano
4050c0df8e Clean up compatibility definitions.
This attempts to clean up the way various compatibility
functions are defined and used.

 - A new header file, git-compat-util.h, is introduced.  This
   looks at various NO_XXX and does necessary function name
   replacements, equivalent of -Dstrcasestr=gitstrcasestr in the
   Makefile.

 - Those function name replacements are removed from the Makefile.

 - Common features such as usage(), die(), xmalloc() are moved
   from cache.h to git-compat-util.h; cache.h includes
   git-compat-util.h itself.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-05 15:50:29 -08:00
Junio C Hamano
9f63892b38 mailinfo: Do not use -u=<encoding>; say --encoding=<encoding>
Specifying the value for a single letter, single dash option
parameter with equal sign looked funny, and more importantly
calling the flag to override encoding from utf-8 to something
else "-u" (obviously abbreviated from "utf-8") did not make any
sense.  So spell it out.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-11-28 01:29:52 -08:00
Junio C Hamano
f1f909e318 mailinfo: Use i18n.commitencoding
This uses i18n.commitencoding configuration item to pick up the
default commit encoding for the repository when converting from
e-mail encoding to commit encoding (the default is utf8).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-11-27 16:29:38 -08:00
Junio C Hamano
650e4be59b mailinfo: allow -u to fall back on latin1 to utf8 conversion.
When the message body does not identify what encoding it is in,
-u assumes it is in latin-1 and converts it to utf8, which is
the recommended encoding for git commit log messages.

With -u=<encoding>, the conversion is made into the specified
one, instead of utf8, to allow project-local policies.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-11-27 16:26:50 -08:00
Junio C Hamano
e1e9c25466 Give proper prototype to gitstrcasestr.
Borrow from NO_MMAP patch by Johannes, squelch compiler warnings by
declaring gitstrcasestr() when we use it.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-08 14:54:41 -07:00
Junio C Hamano
597c9cc540 Flatten tools/ directory to make build procedure simpler.
Also make platform specific part more isolated.  Currently we only
have Darwin defined, but I've taken a look at SunOS specific patch
(which I dropped on the floor for now) as well.  Doing things this way
would make adding it easier.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-07 12:22:56 -07:00