We keep track of whether the user ident was given to us
explicitly, or if we guessed at it from system parameters
like username and hostname. However, we kept only a single
variable. This covers the common cases (because the author
and committer will usually come from the same explicit
source), but can miss two cases:
1. GIT_COMMITTER_* is set explicitly, but we fallback for
GIT_AUTHOR. We claim the ident is explicit, even though
the author is not.
2. GIT_AUTHOR_* is set and we ask for author ident, but
not committer ident. We will claim the ident is
implicit, even though it is explicit.
This patch uses two variables instead of one, updates both
when we set the "fallback" values, and updates them
individually when we read from the environment.
Rather than keep user_ident_sufficiently_given as a
compatibility wrapper, we update the only two callers to
check the committer_ident, which matches their intent and
what was happening already.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In v1.5.6-rc0~56^2 (2008-05-04) "user_ident_explicitly_given"
was introduced as a global for communication between config,
ident, and builtin-commit. In v1.7.0-rc0~72^2 (2010-01-07)
readers switched to using the common wrapper
user_ident_sufficiently_given(). After v1.7.11-rc1~15^2~18
(2012-05-21), the var is only written in ident.c.
Now we can make it static, which will enable further
refactoring without worrying about upsetting other code.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits made by ancient version of Git allowed committer without
human readable name, like this (00213b17c in the kernel history):
tree 6947dba41f8b0e7fe7bccd41a4840d6de6a27079
parent 352dd1df32e672be4cff71132eb9c06a257872fe
author Petr Baudis <pasky@ucw.cz> 1135223044 +0100
committer <sam@mars.ravnborg.org> 1136151043 +0100
kconfig: Remove support for lxdialog --checklist
...
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
When fed such a commit, --format='%ci' fails to parse it, and gives
back an empty string. Update the split_ident_line() to be a bit
more lenient when parsing, but make sure the caller that wants to
pick up sane value from its return value does its own validation.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fixes quite a lot of brokenness when ident information needs to be taken
from the system and cleans up the code.
By Jeff King
* jk/ident-gecos-strbuf: (22 commits)
format-patch: do not use bogus email addresses in message ids
ident: reject bogus email addresses with IDENT_STRICT
ident: rename IDENT_ERROR_ON_NO_NAME to IDENT_STRICT
format-patch: use GIT_COMMITTER_EMAIL in message ids
ident: let callers omit name with fmt_indent
ident: refactor NO_DATE flag in fmt_ident
ident: reword empty ident error message
format-patch: refactor get_patch_filename
ident: trim whitespace from default name/email
ident: use a dynamic strbuf in fmt_ident
ident: use full dns names to generate email addresses
ident: report passwd errors with a more friendly message
drop length limitations on gecos-derived names and emails
ident: don't write fallback username into git_default_name
fmt_ident: drop IDENT_WARN_ON_NO_NAME code
format-patch: use default email for generating message ids
ident: trim trailing newline from /etc/mailname
move git_default_* variables to ident.c
move identity config parsing to ident.c
fmt-merge-msg: don't use static buffer in record_person
...
If we come up with a hostname like "foo.(none)" because the
user's machine is not fully qualified, we should reject this
in strict mode (e.g., when we are making a commit object),
just as we reject an empty gecos username.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Callers who ask for ERROR_ON_NO_NAME are not so much
concerned that the name will be blank (because, after all,
we will fall back to using the username), but rather it is a
check to make sure that low-quality identities do not end up
in things like commit messages or emails (whereas it is OK
for them to end up in things like reflogs).
When future commits add more quality checks on the identity,
each of these callers would want to use those checks, too.
Rather than modify each of them later to add a new flag,
let's refactor the flag.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most callers want to see all of "$name <$email> $date", but
a few want only limited parts, omitting the date, or even
the name. We already have IDENT_NO_DATE to handle the date
part, but there's not a good option for getting just the
email. Callers have to done one of:
1. Call ident_default_email; this does not respect
environment variables, nor does it promise to trim
whitespace or other crud from the result.
2. Call git_{committer,author}_info; this returns the name
and email, leaving the caller to parse out the wanted
bits.
This patch adds IDENT_NO_NAME; it stops short of adding
IDENT_NO_EMAIL, as no callers want it (nor are likely to),
and it complicates the error handling of the function.
When no name is requested, the angle brackets (<>) around
the email address are also omitted.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As a short-hand, we extract this flag into the local
variable "name_addr_only". It's more accurate to simply
negate this and refer to it as "want_date", which will be
less confusing when we add more NO_* flags.
While we're touching this part of the code, let's move the
call to ident_default_date() only when we are actually going
to use it, not when we have NO_DATE set, or when we get a
date from the environment.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's on point in printing the name, since it is by
definition the empty string if we have reached this code
path. Instead, let's be more clear that we are complaining
about the empty name, but still show the email address that
it is attached to (since that may provide some context to
the user).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 4b340cf split the logic to parse an ident line out of
pretty.c's format_person_part. But in doing so, it
accidentally introduced an off-by-one error that caused it
to think that single-character names were invalid.
This manifested itself as the "%an" format failing to show
anything at all for a single-character name.
Reported-by: Brian Turner <bturner@atlassian.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Usually these values get fed to fmt_ident, which will trim
any cruft anyway, but there are a few code paths which use
them directly. Let's clean them up for the benefit of those
callers. Furthermore, fmt_ident will look at the pre-trimmed
value and decide whether to invoke ERROR_ON_NO_NAME; this
check can be fooled by a name consisting only of spaces.
Note that we only bother to clean up when we are pulling the
information from gecos or from system files. Any other value
comes from a config file, where we will have cleaned up
accidental whitespace already.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we accept arbitrary-sized names and email
addresses, the only remaining limit is in the actual
formatting of the names into a buffer. The current limit is
1000 characters, which is not likely to be reached, but
using a strbuf is one less error condition we have to worry
about.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we construct an email address from the username and
hostname, we generate the host part of the email with this
procedure:
1. add the result of gethostname
2. if it has a dot, ok, it's fully qualified
3. if not, then look up the unqualified hostname via
gethostbyname; take the domain name of the result and
append it to the hostname
Step 3 can actually produce a bogus result, as the name
returned by gethostbyname may not be related to the hostname
we fed it (e.g., consider a machine "foo" with names
"foo.one.example.com" and "bar.two.example.com"; we may have
the latter returned and generate the bogus name
"foo.two.example.com").
This patch simply uses the full hostname returned by
gethostbyname. In the common case that the first part is the
same as the unqualified hostname, the behavior is identical.
And in the case that it is not the same, we are much more
likely to be generating a valid name.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When getpwuid fails, we give a cute but cryptic message.
While it makes sense if you know that getpwuid or identity
functions are being called, this code is triggered behind
the scenes by quite a few git commands these days (e.g.,
receive-pack on a remote server might use it for a reflog;
the current message is hard to distinguish from an
authentication error). Let's switch to something that gives
a little more context.
While we're at it, we can factor out all of the
cut-and-pastes of the "you don't exist" message into a
wrapper function. Rather than provide xgetpwuid, let's make
it even more specific to just getting the passwd entry for
the current uid. That's the only way we use getpwuid anyway,
and it lets us make an even more specific error message.
The current message also fails to mention errno. While the
usual cause for getpwuid failing is that the user does not
exist, mentioning errno makes it easier to diagnose these
problems. Note that POSIX specifies that errno remain
untouched if the passwd entry does not exist (but will be
set on actual errors), whereas some systems will return
ENOENT or similar for a missing entry. We handle both cases
in our wrapper.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we pull the user's name from the GECOS field of the
passwd file (or generate an email address based on their
username and hostname), we put the result into a
static buffer. While it's extremely unlikely that anybody
ever hit these limits (after all, in such a case their
parents must have hated them), we still had to deal with the
error cases in our code.
Converting these static buffers to strbufs lets us simplify
the code and drop some error messages from the documentation
that have confused some users.
The conversion is mostly mechanical: replace string copies
with strbuf equivalents, and access the strbuf.buf directly.
There are a few exceptions:
- copy_gecos and copy_email are the big winners in code
reduction (since they no longer have to manage the
string length manually)
- git_ident_config wants to replace old versions of
the default name (e.g., if we read the config multiple
times), so it must reset+add to the strbuf instead of
just adding
Note that there is still one length limitation: the
gethostname interface requires us to provide a static
buffer, so we arbitrarily choose 1024 bytes for the
hostname.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fmt_ident function gets a flag that tells us whether to
die if the name field is blank. If it is blank and we don't
die, then we fall back to the username from the passwd file.
The current code writes the value into git_default_name.
However, that's not necessarily correct, as the empty value
might have come from git_default_name, or it might have been
passed in. This leads to two potential problems:
1. If we are overriding an empty name in the passed-in
value, then we may be overwriting a perfectly good name
(from gitconfig or gecos) in the git_default_name
buffer. Later calls to fmt_ident will end up using the
fallback name, even though a better name was available.
2. If we override an empty gecos name, we end up with the
fallback name in git_default_name. A later call that
uses IDENT_ERROR_ON_NO_NAME will see the fallback name
and think that it is a good name, instead of producing
an error. In other words, a blank gecos name would
cause an error with this code:
git_committer_info(IDENT_ERROR_ON_NO_NAME);
but not this:
git_committer_info(0);
git_committer_info(IDENT_ERROR_ON_NO_NAME);
because in the latter case, the first call has polluted
the name buffer.
Instead, let's make the fallback a per-invocation variable.
We can just use the pw->pw_name string directly, since it
only needs to persist through the rest of the function (and
we don't do any other getpwent calls).
Note that while this solves (1) for future invocations of
fmt_indent, the current invocation might use the fallback
when it could in theory load a better value from
git_default_name. However, by not passing
IDENT_ERROR_ON_NO_NAME, the caller is indicating that it
does not care too much about the name, anyway, so we don't
bother; this is primarily about protecting future callers
who do care.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use fgets to read the /etc/mailname file, which means we
will typically end up with an extra newline in our
git_default_email. Most of the time this doesn't matter, as
fmt_ident will skip it as cruft, but there is one code path
that accesses it directly (in http-push.c:lock_remote).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no reason anybody outside of ident.c should access
these directly (they should use the new accessors which make
sure the variables are initialized), so we can make them
file-scope statics.
While we're at it, move user_ident_explicitly_given into
ident.c; while still globally visible, it makes more sense
to reside with the ident code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no reason for this to be in config, except that once
upon a time all of the config parsing was there. It makes
more sense to keep the ident code together.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function sets up the default name, email, and date, and
is not publicly available. Let's split it into three public
functions so that callers can get just the parts they need.
While we're at it, let's change the interface to simple
accessors. The original function was called only by fmt_ident,
and contained logic for "if we already have some other
value, don't load the default" which properly belongs in
fmt_ident.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit formatting logic format_person_part() in pretty.c
implements the logic to split an author/committer ident line into
its parts, intermixed with logic to compute its output using these
piece it computes.
Separate the former out to a helper function split_ident_line() so
that other codepath can use the same logic, and rewrite the function
using the helper function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid a getpwuid() call (which contacts the network if the password
database is not local), read of /etc/mailname, gethostname() call, and
reverse DNS lookup if the user has already chosen a name and email
through configuration, the environment, or the command line.
This should slightly speed up commands like "git commit". More
importantly, it improves error reporting when computation of the
default ident string does not go smoothly. For example, after
detecting a problem (e.g., "warning: cannot open /etc/mailname:
Permission denied") in retrieving the default committer identity:
touch /etc/mailname; # as root
chmod -r /etc/mailname; # as root
git commit -m 'test commit'
you can squelch the warning while waiting for your sysadmin to fix the
permissions problem.
echo '[user] email = me@example.com' >>~/.gitconfig
Inspired-by: Johannes Sixt <j6t@kdgb.org>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before falling back to gethostname(), check /etc/mailname if
GIT_AUTHOR_EMAIL is not set in the environment or through config
files. Only fall back if /etc/mailname cannot be opened or read.
The /etc/mailname convention comes from Debian policy section 11.6
("mail transport, delivery and user agents"), though maybe it could be
useful sometimes on other machines, too. The lack of this support was
noticed by various people in different ways:
- Ian observed that git was choosing the address
'ian@anarres.relativity.greenend.org.uk' rather than
'ian@davenant.greenend.org.uk' as it should have done.
- Jonathan noticed that operations like "git commit" were needlessly
slow when using a resolver that was slow to handle reverse DNS
lookups.
Alas, after this patch, if /etc/mailname is set up and the [user] name
and email configuration aren't, the committer email will not provide a
charming reminder of which machine commits were made on any more. But
I think it's worth it.
Mechanics: the functionality of reading mailname goes in its own
function, so people who care about other distros can easily add an
implementation to a similar location without making copy_email() too
long and losing clarity. While at it, we split out the fallback
default logic that does gethostname(), too (rearranging it a little
and adding a check for errors from gethostname while at it).
Based on a patch by Gerrit Pape <pape@smarden.org>.
Requested-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow NO_GECOS_IN_PWENT to be defined in the Makefile for platforms that
lack the pw_gecos field in their "struct passwd", in which case the
uppercased user name is used instead via the standard '&' replacement
mechanism.
Signed-off-by: Rafael Gieschke <rafael@gieschke.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
nlen has to be added to len when inserting (capitalized) pw_name as
substitution for "&" in pw_gecos. Otherwise, pw_gecos will be truncated
and data might be written beyond name+sz.
Signed-off-by: Rafael Gieschke <rafael@gieschke.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user gives "git commit --date=foobar", we silently
ignore the --date flag. We should note the error.
This patch puts the fix at the lowest level of fmt_ident,
which means it also handles GIT_AUTHOR_DATE=foobar, as well.
There are two down-sides to this approach:
1. Technically this breaks somebody doing something like
"git commit --date=now", which happened to work because
bogus data is the same as "now". Though we do
explicitly handle the empty string, so anybody passing
an empty variable through the environment will still
work.
If the error is too much, perhaps it can be downgraded
to a warning?
2. The error checking happens _after_ the commit message
is written, which can be annoying to the user. We can
put explicit checks closer to the beginning of
git-commit, but that feels a little hack-ish; suddenly
git-commit has to care about how fmt_ident works. Maybe
we could simply call fmt_ident earlier?
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Compiling today's pu gave
...
CC ident.o
CC levenshtein.o
ident.c: In function 'fmt_ident':
ident.c:206: warning: format not a string literal and no format arguments
CC list-objects.o
...
This warning seems to have appeared first in 18e95f279e (ident.c:
remove unused variables) which removed additional fprintf arguments.
Suppress this warning by using fputs instead of fprintf.
Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The environment variable EMAIL has been honored since 28a94f8 (Fall back
to $EMAIL for missing GIT_AUTHOR_EMAIL and GIT_COMMITTER_EMAIL,
2007-04-28) as the end-user's wish to use the address as the identity.
When we use it, we should say we are explicitly given email by the user.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
bb1ae3f (commit: Show committer if automatic, 2008-05-04) added a logic to
check both name and email were given explicitly by the end user, but it
assumed that fmt_ident() is never called before git_default_user_config()
is called, which was fragile. The former calls setup_ident() and fills
the "default" name and email, so the check in the config parser would have
mistakenly said both are given even if only user.name was provided.
Make the logic more robust by keeping track of name and email separately.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Santi Béjar <santi@agolina.net>
d5cc2de (ident.c: Trim hint printed when gecos is empty., 2006-11-28)
reworded the message used as printf() format and dropped "%s" from it;
these two variables that hold the names of GIT_{AUTHOR,COMMITTER}_NAME
environment variables haven't been used since then.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For scripts using "git var -l" to read all logical variables at
once, not all per-variable warnings will be relevant. So suppress
them.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We remove crud characters at the beginning and end of real-names so that
when we see email addresses like
From: "David S. Miller" <davem@davemloft.net>
we drop the quotes around the name when we parse that and split it up into
name and email.
However, the list of crud characters was basically just a random list of
common things that are found around names, and it didn't contain the
backslash character that some insane scripts seem to use when quoting
things. So now the kernel has a number of authors listed like
Author: \"Rafael J. Wysocki\ <rjw@sisk.pl>
because the author name had started out as
From: \"Rafael J. Wysocki\" <rjw@sisk.pl>
and the only "crud" character we noticed and removed was the final
double-quote at the end.
We should probably do better quote removal from names anyway, but this is
the minimal obvious patch.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To warn the user in case he/she might be using an unintended
committer identity.
Signed-off-by: Santi Béjar <sbejar@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "config --global" suggested in the message is a valid one-shot fix,
and hopefully one-shot across machines that NFS mounts the home directories.
This knowledge can hopefully be reused when you are forced to use git on
Windows, but the fix based on GECOS would not be applicable, so
it is not such a useful hint to mention the exact reason why the
name cannot be determined.
Signed-off-by: Santi Béjar <sbejar@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An earlier fix to the said commit was incomplete; it mixed up the
meaning of the flag parameter passed to the internal fmt_ident()
function, so this corrects it.
git_author_info() and git_committer_info() can be told to issue a
warning when no usable user information is found, and optionally can be
told to error out. Operations that actually use the information to
record a new commit or a tag will still error out, but the caller to
leave reflog record will just silently use bogus user information.
Not warning on misconfigured user information while writing a reflog
entry is somewhat debatable, but it is probably nicer to the users to
silently let it pass, because the only information you are losing is who
checked out the branch.
* git_author_info() and git_committer_info() used to take 1 (positive
int) to error out with a warning on misconfiguration; this is now
signalled with a symbolic constant IDENT_ERROR_ON_NO_NAME.
* These functions used to take -1 (negative int) to warn but continue;
this is now signalled with a symbolic constant IDENT_WARN_ON_NO_NAME.
* fmt_ident() function implements the above error reporting behaviour
common to git_author_info() and git_committer_info(). A symbolic
constant IDENT_NO_DATE can be or'ed in to the flag parameter to make
it return only the "Name <email@address.xz>".
* fmt_name() is a thin wrapper around fmt_ident() that always passes
IDENT_ERROR_ON_NO_NAME and IDENT_NO_DATE.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* kh/commit: (33 commits)
git-commit --allow-empty
git-commit: Allow to amend a merge commit that does not change the tree
quote_path: fix collapsing of relative paths
Make git status usage say git status instead of git commit
Fix --signoff in builtin-commit differently.
git-commit: clean up die messages
Do not generate full commit log message if it is not going to be used
Remove git-status from list of scripts as it is builtin
Fix off-by-one error when truncating the diff out of the commit message.
builtin-commit.c: export GIT_INDEX_FILE for launch_editor as well.
Add a few more tests for git-commit
builtin-commit: Include the diff in the commit message when verbose.
builtin-commit: fix partial-commit support
Fix add_files_to_cache() to take pathspec, not user specified list of files
Export three helper functions from ls-files
builtin-commit: run commit-msg hook with correct message file
builtin-commit: do not color status output shown in the message template
file_exists(): dangling symlinks do exist
Replace "runstatus" with "status" in the tests
t7501-commit: Add test for git commit <file> with dirty index.
...
Introduce fmt_name() specifically meant for formatting the name and
email pair, to add signed-off-by value. This reverts parts of
13208572fb (builtin-commit: fix --signoff)
so that an empty datestamp string given to fmt_ident() by mistake will
error out as before.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Signed-off-by: line contained a spurious timestamp. The reason was
a call to git_committer_info(1), which automatically added the
timestamp.
Instead, fmt_ident() was taught to interpret an empty string for the
date (as opposed to NULL, which still triggers the default behavior)
as "do not bother with the timestamp", and builtin-commit.c uses it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first thing we teach in the tutorial is to set the default
identity in $HOME/.gitconfig using "git config --global". The
suggestion in the error message should match the order, while
hinting that per repository identity can later be configured
differently.
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The environment variable $EMAIL gives a better default of user's
preferred e-mail address than the hardcoded "username@hostname",
as it is understood by many existing programs.
We still honor GIT_*_EMAIL environment variables and user.email
configuration variable give them higher precedence, so that the
user can override $EMAIL or "username@hostname", as they are
likely to be more specific to the context of working on a
particular project.
Signed-off-by: Matt Kraai <kraai@ftbfs.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>