The ustar format represents timestamps as seconds since the
epoch, but only has room to store 11 octal digits. To
express anything larger, we need to use an extended header.
This is exactly the same case we fixed for the size field in
the previous commit, and the solution here follows the same
pattern.
This is even mentioned as an issue in f2f0267 (archive-tar:
use xsnprintf for trivial formatting, 2015-09-24), but since
it only affected things far in the future, it wasn't deemed
worth dealing with. But note that my calculations claiming
thousands of years were off there; because our xsnprintf
produces a NUL byte, we only have until the year 2242 to fix
this.
Given that this is just around the corner (geologically
speaking, anyway), and because it's easy to fix, let's just
make it work. Unlike the previous fix for "size", where we
had to write an individual extended header for each file, we
can write one global header (since we have only one mtime
for the whole archive).
There's a slight bit of trickiness there. We may already be
writing a global header with a "comment" field for the
commit sha1. So we need to write our new field into the same
header. To do this, we push the decision of whether to write
such a header down into write_global_extended_header(),
which will now assemble the header as it sees fit, and will
return early if we have nothing to write (in practice, we'll
only have a large mtime if it comes from a commit, but this
makes it also work if you set your system clock ahead such
that time() returns a huge value).
Note that we don't (and never did) handle negative
timestamps (i.e., before 1970). This would probably not be
too hard to support in the same way, but since git does not
support negative timestamps at all, I didn't bother here.
After writing the extended header, we munge the timestamp in
the ustar headers to the maximum-allowable size. This is
wrong, but it's the least-wrong thing we can provide to a
tar implementation that doesn't understand pax headers (it's
also what GNU tar does).
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ustar format has a fixed-length field for the size of
each file entry which is supposed to contain up to 11 bytes
of octal-formatted data plus a NUL or space terminator.
These means that the largest size we can represent is
077777777777, or 1 byte short of 8GB. The correct solution
for a larger file, according to POSIX.1-2001, is to add an
extended pax header, similar to how we handle long
filenames. This patch does that, and writes zero for the
size field in the ustar header (the last bit is not
mentioned by POSIX, but it matches how GNU tar behaves with
--format=pax).
This should be a strict improvement over the current
behavior, which is to die in xsnprintf with a "BUG".
However, there's some interesting history here.
Prior to f2f0267 (archive-tar: use xsnprintf for trivial
formatting, 2015-09-24), we silently overflowed the "size"
field. The extra bytes ended up in the "mtime" field of the
header, which was then immediately written itself,
overwriting our extra bytes. What that means depends on how
many bytes we wrote.
If the size was 64GB or greater, then we actually overflowed
digits into the mtime field, meaning our value was
effectively right-shifted by those lost octal digits. And
this patch is again a strict improvement over that.
But if the size was between 8GB and 64GB, then our 12-byte
field held all of the actual digits, and only our NUL
terminator overflowed. According to POSIX, there should be a
NUL or space at the end of the field. However, GNU tar seems
to be lenient here, and will correctly parse a size up 64GB
(minus one) from the field. So sizes in this range might
have just worked, depending on the implementation reading
the tarfile.
This patch is mostly still an improvement there, as the 8GB
limit is specifically mentioned in POSIX as the correct
limit. But it's possible that it could be a regression
(versus the pre-f2f0267 state) if all of the following are
true:
1. You have a file between 8GB and 64GB.
2. Your tar implementation _doesn't_ know about pax
extended headers.
3. Your tar implementation _does_ parse 12-byte sizes from
the ustar header without a delimiter.
It's probably not worth worrying about such an obscure set
of conditions, but I'm documenting it here just in case.
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ustar format only has room for 11 (or 12, depending on
some implementations) octal digits for the size and mtime of
each file. For values larger than this, we have to add pax
extended headers to specify the real data, and git does not
yet know how to do so.
Before fixing that, let's start off with some test
infrastructure, as designing portable and efficient tests
for this is non-trivial.
We want to use the system tar to check our output (because
what we really care about is interoperability), but we can't
rely on it:
1. being able to read pax headers
2. being able to handle huge sizes or mtimes
3. supporting a "t" format we can parse
So as a prerequisite, we can feed the system tar a reference
tarball to make sure it can handle these features. The
reference tar here was created with:
dd if=/dev/zero seek=64G bs=1 count=1 of=huge
touch -d @68719476737 huge
tar cf - --format=pax |
head -c 2048
using GNU tar. Note that this is not a complete tarfile, but
it's enough to contain the headers we want to examine.
Likewise, we need to convince git that it has a 64GB blob to
output. Running "git add" on that 64GB file takes many
minutes of CPU, and even compressed, the result is 64MB. So
again, I pre-generated that loose object, and then took only
the first 2k of it. That should be enough to generate 2MB of
data before hitting an inflate error, which is plenty for us
to generate the tar header (and then die of SIGPIPE while
streaming the rest out).
The tests are split so that we test as much as we can even
with an uncooperative system tar. This actually catches the
current breakage (which is that we die("BUG") trying to
write the ustar header) on every system, and then on systems
where we can, we go farther and actually verify the result.
Helped-by: Robin H. Johnson <robbat2@gentoo.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is sometimes useful to be able to read exactly N bytes from a
pipe. Doing this portably turns out to be surprisingly difficult
in shell scripts.
We want a solution that:
- is portable
- never reads more than N bytes due to buffering (which
would mean those bytes are not available to the next
program to read from the same pipe)
- handles partial reads by looping until N bytes are read
(or we see EOF)
- is resilient to stray signals giving us EINTR while
trying to read (even though we don't send them, things
like SIGWINCH could cause apparently-random failures)
Some possible solutions are:
- "head -c" is not portable, and implementations may
buffer (though GNU head does not)
- "read -N" is a bash-ism, and thus not portable
- "dd bs=$n count=1" does not handle partial reads. GNU dd
has iflags=fullblock, but that is not portable
- "dd bs=1 count=$n" fixes the partial read problem (all
reads are 1-byte, so there can be no partial response).
It does make a lot of write() calls, but for our tests
that's unlikely to matter. It's fairly portable. We
already use it in our tests, and it's unlikely that
implementations would screw up any of our criteria. The
most unknown one would be signal handling.
- perl can do a sysread() loop pretty easily. On my Linux
system, at least, it seems to restart the read() call
automatically. If that turns out not to be portable,
though, it would be easy for us to handle it.
That makes the perl solution the least bad (because we
conveniently omitted "length of code" as a criterion).
It's also what t9300 is currently using, so we can just pull
the implementation from there.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While it is not recommended, fsck.c says:
Not having a body is not a crime [...]
... which means that we cannot assume that the commit buffer
contains an empty line to separate header from body. A commit
object with only a header without any body, not even without
a blank line after the header, is valid.
So let's tread carefully here. strstr("\n\n") may find nothing
and return NULL.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When there are blank lines at the beginning of a commit message, the
pretty printing machinery already skips them when showing a commit
subject (or the complete commit message). We shall henceforth do the
same when reporting the commit subject after the user called
git reset --hard <commit>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just like we already taught the find_commit_subject() function (to make
it consistent with the code in pretty.c), we now simply skip leading
blank lines of the commit message.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Consistent with the pretty-printing machinery, we skip leading blank
lines (if any) of existing commit messages.
While Git itself only produces commit objects with a single empty line
between commit header and commit message, it is legal to have more than
one blank line (i.e. lines containing only white space, or no
characters) at the beginning of the commit message, and the
pretty-printing code already handles that.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we abort an interactive rebase we do so by calling
`die_abort`, which cleans up after us by removing the rebase
state directory. If the user has requested to use the autostash
feature, though, the state directory may also contain a reference
to the autostash, which will now be deleted.
Fix the issue by trying to re-apply the autostash in `die_abort`.
This will also handle the case where the autostash does not apply
cleanly anymore by recording it in a user-visible stash.
Reported-by: Daniel Hahler <git@thequod.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the callers of this function use struct object_id, so convert it
to use struct object_id in its arguments and internally.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert this function and the git merge-recursive subcommand to use
struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert all but two of the static functions in this file to use struct
object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert struct merge_file_info to use struct object_id. The following
Coccinelle semantic patch was used to implement this, followed by the
transformations in object_id.cocci:
@@
struct merge_file_info o;
@@
- o.sha
+ o.oid.hash
@@
struct merge_file_info *p;
@@
- p->sha
+ p->oid.hash
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the anonymous struct within struct stage_data to use struct
object_id. The following Coccinelle semantic patch was used to
implement this, followed by the transformations in object_id.cocci:
@@
struct stage_data o;
expression E1;
@@
- o.stages[E1].sha
+ o.stages[E1].oid.hash
@@
struct stage_data *p;
expression E1;
@@
- p->stages[E1].sha
+ p->stages[E1].oid.hash
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that this struct's sha1 member is called "oid", update the comment
and the sha1_valid member to be called "oid_valid" instead. The
following Coccinelle semantic patch was used to implement this, followed
by the transformations in object_id.cocci:
@@
struct diff_filespec o;
@@
- o.sha1_valid
+ o.oid_valid
@@
struct diff_filespec *p;
@@
- p->sha1_valid
+ p->oid_valid
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert struct diff_filespec's sha1 member to use a struct object_id
called "oid" instead. The following Coccinelle semantic patch was used
to implement this, followed by the transformations in object_id.cocci:
@@
struct diff_filespec o;
@@
- o.sha1
+ o.oid.hash
@@
struct diff_filespec *p;
@@
- p->sha1
+ p->oid.hash
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply the set of semantic patches from contrib/coccinelle to convert
some leftover places using struct object_id's hash member to instead
use the wrapper functions that take struct object_id natively.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
hashcpy with null_sha1 as the source is equivalent to hashclr. In
addition to being simpler, using hashclr may give the compiler a chance
to optimize better. Convert instances of hashcpy with the source
argument of null_sha1 to hashclr.
This transformation was implemented using the following semantic patch:
@@
expression E1;
@@
-hashcpy(E1, null_sha1);
+hashclr(E1);
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Coccinelle (http://coccinelle.lip6.fr/) is a program which performs
mechanical transformations on C programs using semantic patches. These
semantic patches can be used to implement automatic refactoring and
maintenance tasks.
Add a set of basic semantic patches to convert common patterns related
to the struct object_id transformation, as well as a README.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function works just like sha1_to_hex_r, except that it takes a
pointer to struct object_id instead of a pointer to unsigned char.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff --output=<file> --color=auto" used to show the ANSI color
sequence in the resulting file when the standard output is connected
to a terminal, because --color=auto check always checks the standard
output, not the actual file that receives the output.
We could correct this by using freopen(3) to redirect the standard
output to the specified file, which is in like with how format-patch
used to match the world order, but following the same reasoning as
the earlier "format-patch: explicitly switch off color when writing
to files", let's be more strict by bypassing the "auto" check when
the --output=<file> option is in use.
Strictly speaking, this is a backwards-incompatible change, but
it is highly unlikely that any user would want to see ANSI color
sequences in a file.
The reason this was not caught earlier is most likely that either
--output=<file> is not used, or only when stdout is redirected
anyway.
Users can still give --color=always if they want a colored diff in
the resulting file.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test merges an external tree in as a subtree, makes some commits
on top of it and splits it back out. In the process the added commits
are lost or the rebase aborts with an internal error. The tests are
marked to expect failure so that we don't forget to fix it.
Signed-off-by: David A. Greene <greened@obbligato.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally, ANSI color sequences were supported on Windows only by
overriding the printf() and fprintf() functions, as mentioned in e7821d7
(Add a notice that only certain functions can print color escape codes,
2009-11-27).
As of eac14f8 (Win32: Thread-safe windows console output, 2012-01-14),
however, this is no longer the case, as the ANSI color sequence support
code needed to be replaced with a thread-safe version, one side effect
being that stdout and stderr handled no matter which function is used to
write to it.
So let's just remove the comment that is now obsolete.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is an application of the newly added CodingGuidelines to HEAD and
variants like FETCH_HEAD. It was obtained with:
perl -pi -e "s/'([A-Z_]*HEAD)'/\`\$1\`/g" *.txt
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current practice is:
git/Documentation$ git grep "'HEAD'" | wc -l
24
git/Documentation$ git grep "\`HEAD\`" | wc -l
66
Let's adopt the majority as a guideline.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was obtained with:
perl -pi -e "s/'--'/\`--\`/g" *.txt
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similarly to the previous commit, use backquotes instead of
forward-quotes, for long options.
This was obtained with:
perl -pi -e "s/'(--[a-z][a-z=<>-]*)'/\`\$1\`/g" *.txt
and manual tweak to remove false positive in ascii-art (o'--o'--o' to
describe rewritten history).
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was common in our documentation to surround short option names with
forward quotes, which renders as italic in HTML. Instead, use backquotes
which renders as monospace. This is one more step toward conformance to
Documentation/CodingGuidelines.
This was obtained with:
perl -pi -e "s/'(-[a-z])'/\`\$1\`/g" *.txt
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace spaces with tabs to avoid a warning when further patches change
these lines.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes the fetch flag code consistent with push, where '-' means
deleted ref.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes it easier to change the formatting later. And it makes sure
translators cannot mess up format specifiers and break Git.
There are a couple call sites where the length of the second column is
TRANSPORT_SUMMARY_WIDTH instead of calculated by TRANSPORT_SUMMARY(),
which is enforced now. The result should be the same because these call
sites do not contain characters outside ASCII range.
The two strbuf_addf() calls instead of one is mostly to reduce
diff-noise in a future patch where "ref -> ref" is reformatted
differently.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This documents the ref update status of fetch. The structure of this
output is defined in [1]. The ouput content is refined a bit in [2]
[3] [4].
This patch is a copy from git-push.txt, modified a bit because the
flag '-' means different things in push (delete) and fetch (tag
update).
PS. For code archaeologists, the discussion mentioned in [1] is
probably [5].
[1] 165f390 (git-fetch: more terse fetch output - 2007-11-03)
[2] 6315472 (fetch: report local storage errors ... - 2008-06-26)
[3] f360d84 (builtin-fetch: add --prune option - 2009-11-10)
[4] 0997ada (fetch: describe new refs based on where... - 2012-04-16)
[5] http://thread.gmane.org/gmane.comp.version-control.git/61657
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The completion script (in contrib/) learned to complete "git
status" options.
* tb/complete-status:
completion: add git status
completion: add __git_get_option_value helper
completion: factor out untracked file modes into a variable
"git cherry-pick A" worked on an unborn branch, but "git
cherry-pick A..B" didn't.
* mg/cherry-pick-multi-on-unborn:
cherry-pick: allow to pick to unborn branches
Allow messages that are generated by auto gc during "git push" on
the receiving end to be explicitly passed back to the sending end
over sideband, so that they are shown with "remote: " prefix to
avoid confusing the users.
* lf/receive-pack-auto-gc-to-client:
receive-pack: send auto-gc output over sideband 2
Comments about misbehaving FreeBSD shells have been clarified with
the version number (9.x and before are broken, newer ones are OK).
* em/newer-freebsd-shells-are-fine-with-returns:
rebase: update comment about FreeBSD /bin/sh
"git status" used to say "working directory" when it meant "working
tree".
* lv/status-say-working-tree-not-directory:
Use "working tree" instead of "working directory" for git status
"git update-index --add --chmod=+x file" may be usable as an escape
hatch, but not a friendly thing to force for people who do need to
use it regularly. "git add --chmod=+x file" can be used instead.
* et/add-chmod-x:
add: add --chmod=+x / --chmod=-x options
The git-prompt scriptlet (in contrib/) was not friendly with those
who uses "set -u", which has been fixed.
* vs/prompt-avoid-unset-variable:
git-prompt.sh: Don't error on null ${ZSH,BASH}_VERSION, $short_sha
"git reflog" stopped upon seeing an entry that denotes a branch
creation event (aka "unborn"), which made it appear as if the
reflog was truncated.
* sg/reflog-past-root:
reflog: continue walking the reflog past root commits