When we are parsing approxidate strings and we find three
numbers separated by one of ":/-.", we guess that it may be a
date. We feed the numbers to match_multi_number, which
checks whether it makes sense as a date in various orderings
(e.g., dd/mm/yy or mm/dd/yy, etc).
One of the checks we do is to see whether it is a date more
than 10 days in the future. This was added in 38035cf (date
parsing: be friendlier to our European friends.,
2006-04-05), and lets us guess that if it is currently April
2014, then "10/03/2014" is probably March 10th, not October
3rd.
This has a downside, though; if you want to be overly
generous with your "--until" date specification, we may
wrongly parse "2014-12-01" as "2014-01-12" (because the
latter is an in-the-past date). If the year is a future year
(i.e., both are future dates), it gets even weirder. Due to
the vagaries of approxidate, months _after_ the current date
(no matter the year) get flipped, but ones before do not.
This patch drops the "in the future" check for dates of this
form, letting us treat them always as yyyy-mm-dd, even if
they are in the future. This does not affect the normal
dd/mm/yyyy versus mm/dd/yyyy lookup, because this code path
only kicks in when the first number is greater than 70
(i.e., it must be a year, and cannot be either a date or a
month).
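As a rough illustration of the year-first path described above (not the
actual match_multi_number() code), the sketch below accepts three numbers
as yyyy-mm-dd whenever the first is greater than 70, with no "in the
future" check. The struct and the helper name year_first() are
hypothetical, and the range checks are simplified.

```c
#include <stdbool.h>

struct ymd {
	int year, month, day;
};

/*
 * Hypothetical helper: returns true and fills *out when (a, b, c)
 * reads as yyyy-mm-dd. No comparison against the current time is made.
 */
static bool year_first(long a, long b, long c, struct ymd *out)
{
	if (a <= 70)
		return false; /* may be a day or month; left to the dd/mm vs mm/dd logic */
	if (b < 1 || b > 12 || c < 1 || c > 31)
		return false; /* simplified range check, ignores month lengths */
	out->year = (int)a;
	out->month = (int)b;
	out->day = (int)c;
	return true;
}
```

With this, 2014-12-01 stays December 1st regardless of today's date,
while 10/03/2014 is not claimed by this path at all.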
The one possible casualty is that "yyyy-dd-mm" is less
likely to be chosen over "yyyy-mm-dd". That's probably OK,
though, because:
1. The difference happens only when the date is in the
future. Already we prefer yyyy-mm-dd for dates in the
past.
2. It's unclear whether anybody even uses yyyy-dd-mm
regularly. It does not appear in lists of common date
formats in Wikipedia[1,2].
3. Even if (2) is wrong, it is better to prefer ISO-like
dates, as that is consistent with what we use elsewhere
in git.
[1] http://en.wikipedia.org/wiki/Date_and_time_representation_by_country
[2] http://en.wikipedia.org/wiki/Calendar_date
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The approxidate functions accept an extra "now" parameter to
avoid calling time() themselves. We use this in our test
suite to make sure we have a consistent time for computing
relative dates. However, deep in the bowels of approxidate,
we also call time() to check whether possible dates are far
in the future. Let's make sure that the "now" override makes
it to that spot, too, so we can consistently test that
feature.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* jk/commit-author-parsing:
determine_author_info(): copy getenv output
determine_author_info(): reuse parsing functions
date: use strbufs in date-formatting functions
record_author_date(): use find_commit_header()
record_author_date(): fix memory leak on malformed commit
commit: provide a function to find a header in a buffer
Git's "ISO" date format does not really conform to the ISO 8601
standard due to small differences, and it cannot be parsed by ISO
8601-only parsers, e.g. those of XML toolchains.
The output from "--date=iso" deviates from ISO 8601 in these ways:
- a space instead of the `T` date/time delimiter
- a space between time and time zone
- no colon between hours and minutes of the time zone
Add a strict ISO 8601 date format for displaying committer and
author dates. Use the '%aI' and '%cI' format specifiers and add
'--date=iso-strict' or '--date=iso8601-strict' date format names.
See http://thread.gmane.org/gmane.comp.version-control.git/255879 and
http://thread.gmane.org/gmane.comp.version-control.git/52414/focus=52585
for discussion.
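The three deviations listed above can be seen in a small formatter. This
is only a sketch, not the code added by the patch: format_iso_strict() is
a hypothetical name, and the timezone is assumed to be passed in the
decimal "hhmm" form (e.g. -800 for -0800).

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/*
 * Format t as strict ISO 8601: 'T' delimiter, no space before the
 * zone, and a colon inside the zone. Returns snprintf()'s result,
 * or -1 if gmtime() rejects the time.
 */
static int format_iso_strict(time_t t, int tz, char *buf, size_t len)
{
	int abs_tz = abs(tz);
	long minutes = (abs_tz / 100) * 60L + abs_tz % 100;
	long offset = (tz < 0 ? -1L : 1L) * minutes * 60;
	time_t local = t + (time_t)offset;
	struct tm *tm = gmtime(&local);

	if (!tm)
		return -1;
	return snprintf(buf, len, "%04d-%02d-%02dT%02d:%02d:%02d%c%02d:%02d",
			tm->tm_year + 1900, tm->tm_mon + 1, tm->tm_mday,
			tm->tm_hour, tm->tm_min, tm->tm_sec,
			tz < 0 ? '-' : '+', abs_tz / 100, abs_tz % 100);
}
```

For timestamp 1234567890 with zone +0100 this produces
"2009-02-14T00:31:30+01:00", where "--date=iso" would show
"2009-02-14 00:31:30 +0100".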
Signed-off-by: Beat Bolli <bbolli@ewanet.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many of the date functions write into fixed-size buffers.
This is a minor pain, as we have to take special
precautions, and frequently end up copying the result into a
strbuf or heap-allocated buffer anyway (for which we
sometimes use strcpy!).
Let's instead teach parse_date, datestamp, etc to write to a
strbuf. The obvious downside is that we might need to
perform a heap allocation where we otherwise would not need
to. However, it turns out that the only two new allocations
required are:
1. In test-date.c, where we don't care about efficiency.
2. In determine_author_info, which is not performance
critical (and where the use of a strbuf will help later
refactoring).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Comments for l10n translators cannot be extracted by xgettext if they
are not right above the l10n tag. Moving the comment right before
the l10n tag fixes this issue.
Reported-by: Brian Gesiak <modocache@gmail.com>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tighten codepaths that parse timestamps in commit objects.
* jk/commit-dates-parsing-fix:
show_ident_date: fix tz range check
log: do not segfault on gmtime errors
log: handle integer overflow in timestamps
date: check date overflow against time_t
fsck: report integer overflow in author timestamps
t4212: test bogus timestamps with git-log
Many code paths assume that show_date and show_ident_date
cannot return NULL. For the most part, we handle missing or
corrupt timestamps by showing the epoch time t=0.
However, we might still return NULL if gmtime rejects the
time_t we feed it, resulting in a segfault. Let's catch this
case and just format t=0.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we check whether a timestamp has overflowed, we check
only against ULONG_MAX, meaning that strtoul has overflowed.
However, we also feed these timestamps to system functions
like gmtime, which expect a time_t. On many systems, time_t
is actually smaller than "unsigned long" (e.g., because it
is signed), and we would overflow when using these
functions. We don't know the actual size or signedness of
time_t, but we can easily check for truncation with a simple
assignment.
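A minimal sketch of the "check for truncation with a simple assignment"
idea follows. The function name is an assumption for this example; the
logic relies only on standard C conversions, so it should hold whether
time_t is signed or unsigned, narrow or wide.

```c
#include <limits.h>
#include <time.h>

/* Returns 1 if t cannot be represented faithfully as a time_t. */
static int stamp_overflows_time_t(unsigned long t)
{
	time_t sys;

	/* strtoul() reports overflow by returning ULONG_MAX */
	if (t == ULONG_MAX)
		return 1;

	/*
	 * Assign and compare: a narrower time_t changes the value, and a
	 * signed time_t of the same width flips large values negative.
	 */
	sys = (time_t)t;
	return t != (unsigned long)sys || (t < 1) != (sys < 1);
}
```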
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The extra semi-colon is harmless, since we really do want
the while loop to do nothing. But it does trigger a warning
from clang.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We used the approxidate() parser for "--expire=<timestamp>" options
of various commands, but it is better to treat --expire=all and
--expire=now a bit more specially than using the current timestamp.
Update "git gc" and "git reflog" with a new parsing function for
expiry dates.
* jc/prune-all:
prune: introduce OPT_EXPIRY_DATE() and use it
api-parse-options.txt: document "no-" for non-boolean options
git-gc.txt, git-reflog.txt: document new expiry options
date.c: add parse_expiry_date()
"git reflog --expire=all" tries to expire reflog entries up to the
current second, because the approxidate() parser gives the current
timestamp for anything it does not understand (and it does not know
what time "all" means). When the user tells us to expire "all" (or
set the expiration time to "now"), the user wants to remove all the
reflog entries (no reflog entry should record future time).
Just set it to ULONG_MAX and let everything that is older than that
timestamp expire.
While at it, allow "now" to be treated the same way for callers that
parse expiry timestamps with this function. Also use an error-reporting
version of approxidate() to report misspelled dates. When
the user says e.g. "--expire=mnoday" to delete entries two days or
older on Wednesday, we wouldn't want the "unknown, default to now"
logic to kick in.
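The behaviour described above can be sketched as follows. This is not the
parse_expiry_date() implementation: toy_careful_parse() is a made-up
stand-in for an error-reporting date parser that only understands plain
decimal seconds, and parse_expiry() is a hypothetical name.

```c
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in (assumption): accepts only decimal seconds, flags all else. */
static unsigned long toy_careful_parse(const char *date, int *errors)
{
	char *end;
	unsigned long t;

	if (!isdigit((unsigned char)*date)) {
		*errors = 1;
		return 0;
	}
	t = strtoul(date, &end, 10);
	if (*end)
		*errors = 1;
	return t;
}

/* Returns 0 on success, nonzero if the date string was not understood. */
static int parse_expiry(const char *date, unsigned long *timestamp)
{
	int errors = 0;

	if (!strcmp(date, "all") || !strcmp(date, "now"))
		*timestamp = ULONG_MAX; /* everything is older than this */
	else
		*timestamp = toy_careful_parse(date, &errors);
	return errors;
}
```

A misspelling such as "mnoday" is then reported as an error instead of
silently turning into the current time.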
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix the time offset calculation expression for the case where time_t
is unsigned. The code now works for both signed and
unsigned time_t.
Signed-off-by: Mike Gorchak <mike.gorchak.qnx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
tm_to_time_t() returns (time_t)-1 when it sees an error. On
platforms with unsigned time_t, this value will be larger than any
valid timestamp and will break the "Is this older than 10 days in
the future?" check.
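One way to guard against this, shown here only as a sketch with a
hypothetical function name, is to test for the error value explicitly
before doing the "ten days in the future" comparison.

```c
#include <stdbool.h>
#include <time.h>

/*
 * Returns true when candidate is a valid time that is at most ten days
 * after now. The explicit (time_t)-1 test matters when time_t is
 * unsigned, where the error value is the largest possible time.
 */
static bool date_is_plausible(time_t candidate, time_t now)
{
	if (candidate == (time_t)-1)
		return false;
	return candidate <= now + 10 * 24 * 3600;
}
```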
Signed-off-by: Mike Gorchak <mike.gorchak.qnx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the 1.7.9 era, we taught "git rebase" about the raw timestamp format
but we did not teach the same trick to "filter-branch", which rolled
similar logic of its own. Because of this, "filter-branch" failed
to rewrite commits with ancient timestamps.
* jc/maint-filter-branch-epoch-date:
t7003: add test to filter a branch with a commit at epoch
date.c: Fix off by one error in object-header date parsing
filter-branch: do not forget the '@' prefix to force git-timestamp
It is perfectly OK for a valid decimal integer to begin with '9' but
116eb3a (parse_date(): allow ancient git-timestamp, 2012-02-02) did
not express the range correctly.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The only place where the issue this series addresses was observed is
where we read "cat-file commit" output and put it in GIT_AUTHOR_DATE
in order to replay a commit with an ancient timestamp.
With the previous patch alone, "git commit --date='20100917 +0900'"
can be misinterpreted to mean an ancient timestamp, not September in
year 2010. Guard this codepath by requiring an extra '@' in front of
the raw git timestamp on the parsing side. This of course needs to
be compensated by updating get_author_ident_from_commit and the code
for "git commit --amend" to prepend '@' to the string read from the
existing commit in the GIT_AUTHOR_DATE environment variable.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The date-time parser parses out a human-readable datestring piece by
piece, so that it could even parse a string in a rather strange
notation like 'noon november 11, 2005', but restricts itself from
parsing strings in "<seconds since epoch> <timezone>" format only
for reasonably new timestamps (like 1974 or newer) with 10 or more
digits. This is to prevent a string like "20100917" from getting
interpreted as seconds since epoch (we want to treat it as September
17, 2010 instead) while doing so.
The same codepath is used to read back the timestamp that we have
already recorded in the headers of commit and tag objects; because
of this, such a commit with timestamp "0 +0000" cannot be rebased or
amended very easily.
Teach parse_date() codepath to special case a string of the form
"<digits> +<4-digits>" to work this issue around, but require that
there is no other cruft around the string when parsing a timestamp
of this format for safety.
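A sketch of such a strict matcher is shown below. It is not the code from
this patch: the function name is hypothetical, and accepting a '-' sign
as well as '+' is an assumption of the example. Anything before or after
the "<digits> <sign><4-digits>" pattern makes it fail.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Returns 0 and fills the outputs only for "<digits> [+-]hhmm" exactly. */
static int parse_raw_stamp(const char *s, unsigned long *stamp, int *tz_minutes)
{
	char *end;
	unsigned long t;
	int sign, i, tz = 0;

	if (!isdigit((unsigned char)*s))
		return -1;
	errno = 0;
	t = strtoul(s, &end, 10);
	if (errno == ERANGE || *end != ' ')
		return -1;
	end++;
	if (*end != '+' && *end != '-')
		return -1;
	sign = (*end++ == '-') ? -1 : 1;
	for (i = 0; i < 4; i++) {
		if (!isdigit((unsigned char)end[i]))
			return -1;
		tz = tz * 10 + (end[i] - '0');
	}
	if (end[4])
		return -1; /* trailing cruft is not allowed */
	*stamp = t;
	*tz_minutes = sign * ((tz / 100) * 60 + tz % 100);
	return 0;
}
```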
Note that this has slight backward-incompatibility implications.
If somebody writes "git commit --date='20100917 +0900'" and wants it
to mean a timestamp in September 2010 in Japan, this change will
break such a use case.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Timezone designators in the following formats are all valid according to
ISO8601:2004, section 4.3.2:
    [+-]hh, [+-]hhmm, [+-]hh:mm
but we have ignored the ones with colon so far.
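To make the three accepted shapes concrete, here is a small standalone
sketch (not the parser change itself; the function name is hypothetical)
that turns any of them into an offset in minutes.

```c
#include <ctype.h>

/* Accepts [+-]hh, [+-]hhmm and [+-]hh:mm; returns 0 on success. */
static int parse_tz_designator(const char *s, int *offset_minutes)
{
	int sign, hours, mins = 0;

	if (*s != '+' && *s != '-')
		return -1;
	sign = (*s++ == '-') ? -1 : 1;

	if (!isdigit((unsigned char)s[0]) || !isdigit((unsigned char)s[1]))
		return -1;
	hours = (s[0] - '0') * 10 + (s[1] - '0');
	s += 2;

	if (*s) {
		if (*s == ':')
			s++; /* the colon is optional */
		if (!isdigit((unsigned char)s[0]) ||
		    !isdigit((unsigned char)s[1]) || s[2])
			return -1;
		mins = (s[0] - '0') * 10 + (s[1] - '0');
	}
	if (mins > 59)
		return -1;
	*offset_minutes = sign * (hours * 60 + mins);
	return 0;
}
```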
Signed-off-by: Haitao Li <lihaitao@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When relative dates are more than about a year ago, we start
writing them as "Y years, M months". At the point where we
calculate Y and M, we have the time delta specified as a
number of days. We calculate these integers as:
    Y = days / 365
    M = (days % 365 + 15) / 30
This rounds days in the latter half of a month up to the
nearest month, so that day 16 is "1 month" (or day 381 is "1
year, 1 month").
We don't round the year at all, though, meaning we can end
up with "1 year, 12 months", which is silly; it should just
be "2 years".
Implement this differently with months of size
    onemonth = 365/12
so that
    totalmonths = (long)( (days + onemonth/2)/onemonth )
    years = totalmonths / 12
    months = totalmonths % 12
In order to do this without floats, we write the first formula as
    totalmonths = (days*12*2 + 365) / (365*2)
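The two calculations can be compared side by side. The struct and
function names below are made up for this sketch; the arithmetic is
taken directly from the formulas above.

```c
struct age {
	long years;
	long months;
};

/* Old approach: the year is truncated, only the month is rounded. */
static struct age age_old(long days)
{
	struct age a = { days / 365, (days % 365 + 15) / 30 };
	return a;
}

/* New approach: round to whole months first, then split into years. */
static struct age age_new(long days)
{
	long totalmonths = (days * 12 * 2 + 365) / (365 * 2);
	struct age a = { totalmonths / 12, totalmonths % 12 };
	return a;
}
```

For days = 725 the old code yields "1 year, 12 months" while the new
code yields 2 years and 0 months; for days = 381 both give 1 year and
1 month.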
Tests and inspiration by Jeff King.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
approxidate() is not appropriate for reading machine-written dates
because it guesses instead of erroring out on malformed dates.
parse_date() is less convenient since it returns its output as a
string. So export the underlying function that writes a timestamp.
While at it, change the return value to match the usual convention:
return 0 for success and -1 for failure.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When no timezone is specified, we deduce the offset by
subtracting the result of mktime from our calculated
timestamp.
However, our timestamp is stored as an unsigned integer,
meaning we perform the subtraction as unsigned. For a
negative offset, this means we wrap to a very high number,
and our numeric timezone is in the millions of hours. You
can see this bug by doing:
    $ TZ=EST \
      GIT_AUTHOR_DATE='2010-06-01 10:00' \
      git commit -a -m foo
    $ git cat-file -p HEAD | grep author
    author Jeff King <peff@peff.net> 1275404416 +119304128
Instead, we should perform this subtraction as a time_t, the
same type that mktime returns.
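In isolation, the corrected subtraction looks roughly like the sketch
below. The function and parameter names are assumptions; "as_utc" stands
for our own timestamp computed as if the fields were UTC, and "as_local"
for what mktime returned. The sketch assumes a signed time_t.

```c
#include <time.h>

/*
 * Offset in minutes between the two interpretations of the same
 * broken-down time. Casting to time_t before subtracting keeps a
 * negative difference negative instead of wrapping around.
 */
static int tz_offset_minutes(unsigned long as_utc, time_t as_local)
{
	return (int)(((time_t)as_utc - as_local) / 60);
}
```

For a zone five hours west of UTC, mktime returns a value 18000 seconds
larger than the UTC interpretation, and the result is -300 minutes.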
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
approxidate_relative and approxidate_careful both use parse_date to
dump the timestamp to a character buffer and parse it back into an
unsigned long using strtoul(). Avoid doing this by creating a new
parse_date_toffset function.
Noticed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The name "Z" for the UTC timezone is required to properly parse ISO 8601
timestamps. Add it to the list of recognized timezones.
Because timezone names can be shorter than 3 letters, loosen the
restriction in match_alpha() that used to require at least 3 letters to
match to allow a short timezone name as long as it matches exactly. Prior
to the introduction of the "Z" zone, this already affected the timezone
"NT" (Nome).
Signed-off-by: Marcus Comstedt <marcus@mc.pp.se>
Reviewed-by: Jay Soffian <jaysoffian@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/maint-reflog-bad-timestamp:
t0101: use a fixed timestamp when searching in the reflog
Update @{bogus.timestamp} fix not to die()
approxidate_careful() reports errorneous date string
For a long time, the time based reflog syntax (e.g. master@{yesterday})
didn't complain when the "human readable" timestamp was misspelled, as
the underlying mechanism tried to be as lenient as possible. The funny
thing was that parsing of "@{now}" even relied on the fact that anything
not recognized by the machinery returned the current timestamp.
Introduce approxidate_careful() that takes an optional pointer to an
integer, that gets assigned 1 when the input does not make sense as a
timestamp.
As I am too lazy to fix all the callers that use approxidate(), most of
the callers do not take advantage of the error checking, but convert the
code to parse reflog to use it as a demonstration.
Tests are mostly from Jeff King.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes '--relative-date' so that it does not give '0
year, 12 months', for the interval 360 <= diff < 365.
Signed-off-by: Johan Sageryd <j416@1616.se>
Signed-off-by: Jeff King <peff@peff.net>
These were broken by b5373e9. The problem is that the code
marks the month and year with "-1" for "we don't know it
yet", but the month and year code paths were not adjusted to
fill in the current time before doing their calculations
(whereas other units follow a different code path and are
fine).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The main purpose is to allow predictable testing of the code.
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous patch to improve approxidate got us to the point that a lot
of the remaining annoyances were due to the 'strict' date handling running
first, and deciding that it got a good enough date that the approximate
date routines were never even invoked.
For example, using a date string like
    6AM, June 7, 2009
the strict date logic would be perfectly happy with the "June 7, 2009"
part, and ignore the 6AM part that it didn't understand - resulting in the
information getting dropped on the floor:
    6AM, June 7, 2009 -> Sat Jun 6 00:00:00 2009
and the date being calculated as if it was midnight, and the '6AM' having
confused the date routines into thinking about '6 June' rather than 'June
7' at 6AM (ie notice how the _day_ was wrong due to this, not just the
time).
So this makes the strict date routines a bit stricter, and requires that
not just the date, but also the time, has actually been parsed. With that
fix, and trivial extension of the approxidate routines, git now properly
parses the date as
    6AM, June 7, 2009 -> Sun Jun 7 06:00:00 2009
without dropping the fuzzy time ("6AM" or "noon" or any of the other
non-strict time formats) on the floor.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is not a new failure mode - approxidate has always been kind of
random in the input it accepts, but some of the randomness is more
irritating than others.
For example:
    Jun 6, 5AM -> Mon Jun 22 05:00:00 2009
    5AM Jun 6 -> Sat Jun 6 05:00:00 2009
Whaa? The reason for the above is that approxidate squirrels away the '6'
from "Jun 6" to see if it's going to be a relative number, and then
forgets about it when it sees a new number (the '5' in '5AM'). So the odd
"June 22" date is because today is July 22nd, and if it doesn't have
another day of the month, it will just pick today's mday - having ignored
the '6' entirely due to getting all excited about seeing a new number (5).
There are other oddnesses. This does not fix them all, but I think it
makes for fewer _really_ perplexing cases. At least now we have
    Jun 6, 5AM -> Sat Jun 6 05:00:00 2009
    5AM, Jun 6 -> Sat Jun 6 05:00:00 2009
which makes me happier. I can still point to cases that don't work as
well, but those are separate issues.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to POSIX, tv_sec is supposed to be a time_t, but OpenBSD
(and FreeBSD, too) defines it to be a long, which triggers a type
mismatch when a pointer to it is given to localtime_r().
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, for dates older than 12 months we fell back to just giving the
absolute time. This can be a bit jarring when reading a list of times.
Instead, let's switch to "Y years, M months" for five years, and then just
"Y years" after that.
No particular reason on the 5 year cutoff except that it seemed reasonable
to me.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Talking about --date, one thing I wanted for the 1234567890 date was to
get things in the raw format. Sure, you get them with --pretty=raw, but it
felt a bit sad that you couldn't just ask for the date in raw format.
So here's a throw-away patch (meaning: I won't be re-sending it, because I
really don't think it's a big deal) to add "--date=raw". It just prints
out the internal raw git format - seconds since epoch plus timezone (put
another way: 'date +"%s %z"' format)
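For illustration only, the raw format can be produced with a single
snprintf() call. The function name is hypothetical, and the timezone is
assumed to be held in the decimal "hhmm" form (e.g. -800 for -0800).

```c
#include <stdio.h>

/* Writes "<seconds since epoch> <+/-hhmm>" into buf. */
static int format_raw_date(unsigned long seconds, int tz, char *buf, size_t len)
{
	/* %+05d: always print the sign, zero-pad to four digits */
	return snprintf(buf, len, "%lu %+05d", seconds, tz);
}
```

Given 1234567890 and -800 this writes "1234567890 -0800".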
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The date/time parsing code was confused if the input time HH:MM:SS is
followed by fractional seconds. Since we do not record anything finer
grained than seconds, we could just drop fractional part, but there is a
twist.
We have taught people that not just spaces but dot can be used as word
separators when spelling things like:
    $ git log --since 2.days
    $ git show @{12:34:56.7.days.ago}
and we shouldn't mistake "7" in the latter example as a fraction and
discard it.
The rules are:
- valid days of month/mday are always single or double digits.
- valid years are either two or four digits
No, we don't support the year 600 _anyway_, since our encoding is based
on the UNIX epoch, and the day we worry about the year 10,000 is far
away and we can raise the limit to five digits when we get closer.
- Other numbers (eg "600 days ago") can have any number of digits, but
they cannot start with a zero. Again, the only exception is for
two-digit numbers, since that is fairly common for dates ("Dec 01" is
not unheard of)
So that means that any milli- or micro-second would be thrown out just
because the number of digits shows that it cannot be an interesting date.
A milli- or micro-second can obviously be a perfectly fine number
according to the rules above, as long as it doesn't start with a '0'. So
if we have
    12:34:56.123
then that '123' gets parsed as a number, and we remember it. But because
it's bigger than 31, we'll never use it as such _unless_ there is
something after it to trigger that use.
So you can say "12:34:56.123.days.ago", and because of the "days", that
123 will actually be meaningful now.
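One reading of the digit rules above is a simple filter on each run of
digits: one- and two-digit numbers are always kept, and longer numbers
are kept only if they do not start with a zero. The sketch below encodes
just that reading; the function name is hypothetical and the separate
year rule is not modeled.

```c
#include <stdbool.h>
#include <string.h>

/*
 * digits is assumed to be a non-empty run of decimal digits.
 * Returns false for runs that can only be a discarded fraction.
 */
static bool number_is_usable(const char *digits)
{
	size_t len = strlen(digits);

	if (len <= 2)
		return len > 0; /* "7" and "01" are both fine */
	return digits[0] != '0'; /* "123" is fine, "007" is not */
}
```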
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit af66366a9f introduced the keyword
"never" to be used with approxidate() but defined it with a fixed date
without taking care of timezone. As a result approxidate() will return
a timestamp in the future with a negative timezone.
With this patch, approxidate("never") always returns 0 whatever your
timezone is.
Signed-off-by: Olivier Marin <dkr@freesurf.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Factor out the code to parse --date=<format> parameter to revision
walkers into a separate function, parse_date_format(). This function
is passed a string and converts it to an enum date_format:
- "relative" => DATE_RELATIVE
- "iso8601" or "iso" => DATE_ISO8601
- "rfc2822" => DATE_RFC2822
- "short" => DATE_SHORT
- "local" => DATE_LOCAL
- "default" => DATE_NORMAL
In the event that none of these strings is found, the function die()s.
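The mapping above amounts to a chain of string comparisons. The sketch
below is not the function from this patch: it returns -1 where the real
one die()s, so that it stays testable, and the enum is re-declared
locally with an arbitrary member order.

```c
#include <string.h>

enum date_format {
	DATE_NORMAL,
	DATE_RELATIVE,
	DATE_SHORT,
	DATE_LOCAL,
	DATE_ISO8601,
	DATE_RFC2822
};

/* Returns 0 and sets *out on a known name, -1 otherwise. */
static int parse_date_format_sketch(const char *format, enum date_format *out)
{
	if (!strcmp(format, "relative"))
		*out = DATE_RELATIVE;
	else if (!strcmp(format, "iso8601") || !strcmp(format, "iso"))
		*out = DATE_ISO8601;
	else if (!strcmp(format, "rfc2822"))
		*out = DATE_RFC2822;
	else if (!strcmp(format, "short"))
		*out = DATE_SHORT;
	else if (!strcmp(format, "local"))
		*out = DATE_LOCAL;
	else if (!strcmp(format, "default"))
		*out = DATE_NORMAL;
	else
		return -1;
	return 0;
}
```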
Signed-off-by: Andy Parkins <andyparkins@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you want to keep the reflogs around for a really long time, you should be
able to say so:
    $ git config gc.reflogExpire never
Now it works, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These days, show_date() takes a date_mode parameter to specify
the output format, and a separate specialized function for dates
in E-mails does not make much sense anymore.
This retires the show_rfc2822_date() function and makes it just
another date output format.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Support output of full ISO 8601 style dates in e.g. git log
and other places that use interpolation for formatting.
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept into our source files over time. There are a few files that need
to have trailing whitespace (most notably, test vectors). The result
still passes the tests, and the build result in the Documentation/ area is unchanged.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tests with git-filter-branch on a repository that was converted from
CVS and that has commits reaching back to 1999 revealed that it is
necessary to parse dates before 2000/01/01 when they are specified
as seconds since 1970/01/01. There is still a limit, 100000000,
which is 1973/03/03 09:46:40 UTC, so that dates can still be
written as 8 digits.
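The effect of the limit can be shown with a one-line predicate. The
macro and function names are made up for this sketch; only the value
100000000 comes from the text above.

```c
#include <stdbool.h>

/* 100000000 seconds after the epoch is 1973/03/03 09:46:40 UTC. */
#define EPOCH_SECONDS_MIN 100000000UL

/* A bare number is read as seconds since the epoch only at or above the limit. */
static bool looks_like_epoch_seconds(unsigned long n)
{
	return n >= EPOCH_SECONDS_MIN;
}
```

An 8-digit date such as 20100917 stays below the limit, while
915148800 (1999/01/01 00:00:00 UTC) is accepted as seconds.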
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>