Commit Graph

92 Commits

Author SHA1 Message Date
René Scharfe
2947a7930d archive: add --add-file
Allow users to append non-tracked files.  This simplifies the generation
of source packages with a few extra files, e.g. containing version
information.  They get the same access times and user information as
tracked files.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-19 15:56:06 -07:00
brian m. carlson
0b78a1b22a t5000: make hash independent
This test uses a stub of a very large (64 GB) object to test our
generation of tar archives.  In doing so, it uses the object ID of the
object so it can insert it into the database properly.  Look up these
values using test_oid.  Restructure the test slightly to use
test_oid_in_path.

Since we care about the object, not how it is named in a particular hash
algorithm, rename it to "huge-object", which is shorter and more
descriptive.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-01 13:28:18 -07:00
Josh Steadmon
00436bf1b1 archive: initialize archivers earlier
Initialize archivers as soon as possible when running git-archive.
Various non-obvious behavior depends on having the archivers
initialized, such as determining the desired archival format from the
provided filename.

Since 08716b3c11 ("archive: refactor file extension format-guessing",
2011-06-21), archive_format_from_filename() has used the registered
archivers to match filenames (provided via --output) to archival
formats. However, when git-archive is executed with --remote, format
detection happens before the archivers have been registered. This causes
archives from remotes to always be generated as TAR files, regardless of
the actual filename (unless an explicit --format is provided).

This patch fixes that behavior; archival format is determined properly
from the output filename, even when --remote is used.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-26 10:17:59 +09:00
Junio C Hamano
deb9845a0a Merge branch 'ps/test-chmtime-get'
Test cleanup.

* ps/test-chmtime-get:
  t/helper: 'test-chmtime (--get|-g)' to print only the mtime
2018-04-25 13:29:00 +09:00
Paul-Sebastian Ungureanu
decf711fc1 t/helper: 'test-chmtime (--get|-g)' to print only the mtime
Compared to 'test-chmtime -v +0 file' which prints the mtime and
and the file name, 'test-chmtime --get file' displays only the mtime.
If it is used in combination with (+|=|=+|=-|-)seconds, it changes
and prints the new value.

	test-chmtime -v +0 file | sed 's/[^0-9].*$//'

is now equivalent to:

	test-chmtime --get file

Signed-off-by: Paul-Sebastian Ungureanu <ungureanupaulsebastian@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-09 11:33:19 +09:00
Nguyễn Thái Ngọc Duy
c680668d1a t/helper: merge test-genrandom into test-tool
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-27 08:45:47 -07:00
Nguyễn Thái Ngọc Duy
0e496492d2 t/helper: merge test-chmtime into test-tool
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-27 08:45:47 -07:00
Johannes Schindelin
efac8ac84b t0006 & t5000: skip "far in the future" test when time_t is too limited
Git's source code refers to timestamps as unsigned long, which is
ill-defined, as there is no guarantee about the number of bits that
data type has.

In preparation of switching to another data type that is large enough
to hold "far in the future" dates, we need to prepare the t0006-date.sh
script for the case where we *still* cannot format those dates if the
system library uses 32-bit time_t.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-20 22:07:15 -07:00
Johannes Schindelin
a07fb0507f t0006 & t5000: prepare for 64-bit timestamps
Git's source code refers to timestamps as unsigned longs. On 32-bit
platforms, as well as on Windows, unsigned long is not large enough to
capture dates that are "absurdly far in the future".

It is perfectly valid by the C standard, of course, for the `long` data
type to refer to 32-bit integers. That is why the `time_t` data type
exists: so that it can be 64-bit even if `long` is 32-bit. Git's source
code simply uses an incorrect data type for timestamps, is all.

The earlier quick fix 6b9c38e14c (t0006: skip "far in the future" test
when unsigned long is not long enough, 2016-07-11) papered over this
issue simply by skipping the respective test cases on platforms where
they would fail due to the data type in use.

This quick fix, however, tests for *long* to be 64-bit or not. What we
need, though, is a test that says whether *whatever data type we use for
timestamps* is 64-bit or not.

The same quick fix was used to handle the similar problem where Git's
source code uses `unsigned long` to represent size, instead of `size_t`,
conflating the two issues.

So let's just add another prerequisite to test specifically whether
timestamps are represented by a 64-bit data type or not. Later, after we
switch to a larger data type, we can flip that prerequisite to test
`time_t` instead of `long`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-20 22:07:15 -07:00
Jeff King
de95302a4c t5000: extract nongit function to test-lib-functions.sh
This function abstracts the idea of running a command
outside of any repository (which is slightly awkward to do
because even if you make a non-repo directory, git may keep
walking up outside of the trash directory). There are
several scripts that use the same technique, so let's make
the function available for everyone.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-16 09:29:16 -08:00
Junio C Hamano
eb0224c617 archive: read local configuration
Since b9605bc4f2 ("config: only read .git/config from configured
repos", 2016-09-12), we do not read from ".git/config" unless we
know we are in a repository.  "git archive" however didn't do the
repository discovery and instead relied on the old behaviour.

Teach the command to run a "gentle" version of repository discovery
so that local configuration variables are honoured.

[jc: stole tests from peff]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-22 13:55:20 -08:00
Junio C Hamano
29493589e9 archive-tar: huge offset and future timestamps would not work on 32-bit
As we are not yet moving everything to size_t but still using ulong
internally when talking about the size of object, platforms with
32-bit long will not be able to produce tar archive with 4GB+ file,
and cannot grok 077777777777UL as a constant.  Disable the extended
header feature and do not test it on them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-15 10:51:55 -07:00
Jeff King
6e8e0991e5 archive-tar: write extended headers for far-future mtime
The ustar format represents timestamps as seconds since the
epoch, but only has room to store 11 octal digits.  To
express anything larger, we need to use an extended header.
This is exactly the same case we fixed for the size field in
the previous commit, and the solution here follows the same
pattern.

This is even mentioned as an issue in f2f0267 (archive-tar:
use xsnprintf for trivial formatting, 2015-09-24), but since
it only affected things far in the future, it wasn't deemed
worth dealing with. But note that my calculations claiming
thousands of years were off there; because our xsnprintf
produces a NUL byte, we only have until the year 2242 to fix
this.

Given that this is just around the corner (geologically
speaking, anyway), and because it's easy to fix, let's just
make it work. Unlike the previous fix for "size", where we
had to write an individual extended header for each file, we
can write one global header (since we have only one mtime
for the whole archive).

There's a slight bit of trickiness there. We may already be
writing a global header with a "comment" field for the
commit sha1. So we need to write our new field into the same
header. To do this, we push the decision of whether to write
such a header down into write_global_extended_header(),
which will now assemble the header as it sees fit, and will
return early if we have nothing to write (in practice, we'll
only have a large mtime if it comes from a commit, but this
makes it also work if you set your system clock ahead such
that time() returns a huge value).

Note that we don't (and never did) handle negative
timestamps (i.e., before 1970). This would probably not be
too hard to support in the same way, but since git does not
support negative timestamps at all, I didn't bother here.

After writing the extended header, we munge the timestamp in
the ustar headers to the maximum-allowable size. This is
wrong, but it's the least-wrong thing we can provide to a
tar implementation that doesn't understand pax headers (it's
also what GNU tar does).

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 10:26:01 -07:00
Jeff King
d1657b570a archive-tar: write extended headers for file sizes >= 8GB
The ustar format has a fixed-length field for the size of
each file entry which is supposed to contain up to 11 bytes
of octal-formatted data plus a NUL or space terminator.

These means that the largest size we can represent is
077777777777, or 1 byte short of 8GB. The correct solution
for a larger file, according to POSIX.1-2001, is to add an
extended pax header, similar to how we handle long
filenames. This patch does that, and writes zero for the
size field in the ustar header (the last bit is not
mentioned by POSIX, but it matches how GNU tar behaves with
--format=pax).

This should be a strict improvement over the current
behavior, which is to die in xsnprintf with a "BUG".
However, there's some interesting history here.

Prior to f2f0267 (archive-tar: use xsnprintf for trivial
formatting, 2015-09-24), we silently overflowed the "size"
field. The extra bytes ended up in the "mtime" field of the
header, which was then immediately written itself,
overwriting our extra bytes. What that means depends on how
many bytes we wrote.

If the size was 64GB or greater, then we actually overflowed
digits into the mtime field, meaning our value was
effectively right-shifted by those lost octal digits. And
this patch is again a strict improvement over that.

But if the size was between 8GB and 64GB, then our 12-byte
field held all of the actual digits, and only our NUL
terminator overflowed. According to POSIX, there should be a
NUL or space at the end of the field. However, GNU tar seems
to be lenient here, and will correctly parse a size up 64GB
(minus one) from the field. So sizes in this range might
have just worked, depending on the implementation reading
the tarfile.

This patch is mostly still an improvement there, as the 8GB
limit is specifically mentioned in POSIX as the correct
limit. But it's possible that it could be a regression
(versus the pre-f2f0267 state) if all of the following are
true:

  1. You have a file between 8GB and 64GB.

  2. Your tar implementation _doesn't_ know about pax
     extended headers.

  3. Your tar implementation _does_ parse 12-byte sizes from
     the ustar header without a delimiter.

It's probably not worth worrying about such an obscure set
of conditions, but I'm documenting it here just in case.

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 10:25:46 -07:00
Jeff King
e51217e15c t5000: test tar files that overflow ustar headers
The ustar format only has room for 11 (or 12, depending on
some implementations) octal digits for the size and mtime of
each file. For values larger than this, we have to add pax
extended headers to specify the real data, and git does not
yet know how to do so.

Before fixing that, let's start off with some test
infrastructure, as designing portable and efficient tests
for this is non-trivial.

We want to use the system tar to check our output (because
what we really care about is interoperability), but we can't
rely on it:

  1. being able to read pax headers

  2. being able to handle huge sizes or mtimes

  3. supporting a "t" format we can parse

So as a prerequisite, we can feed the system tar a reference
tarball to make sure it can handle these features. The
reference tar here was created with:

  dd if=/dev/zero seek=64G bs=1 count=1 of=huge
  touch -d @68719476737 huge
  tar cf - --format=pax |
  head -c 2048

using GNU tar. Note that this is not a complete tarfile, but
it's enough to contain the headers we want to examine.

Likewise, we need to convince git that it has a 64GB blob to
output. Running "git add" on that 64GB file takes many
minutes of CPU, and even compressed, the result is 64MB. So
again, I pre-generated that loose object, and then took only
the first 2k of it. That should be enough to generate 2MB of
data before hitting an inflate error, which is plenty for us
to generate the tar header (and then die of SIGPIPE while
streaming the rest out).

The tests are split so that we test as much as we can even
with an uncooperative system tar. This actually catches the
current breakage (which is that we die("BUG") trying to
write the ustar header) on every system, and then on systems
where we can, we go farther and actually verify the result.

Helped-by: Robin H. Johnson <robbat2@gentoo.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 10:24:18 -07:00
Junio C Hamano
4762c7b42a Merge branch 'js/t5000-dont-copy-bin-sh'
* js/t5000-dont-copy-bin-sh:
  t5000 on Windows: do not mistake "sh.exe" as "sh"
2014-12-22 12:26:43 -08:00
Johannes Sixt
bba5fccc03 t5000 on Windows: do not mistake "sh.exe" as "sh"
In their effort to emulate POSIX as close as possible, the MSYS tools
and Cygwin treat the file name "foo.exe" as "foo" when the latter is
asked for, but not present, but the former is present.

Following this rule, 'cp /bin/sh a/bin' actually copies the file
/bin/sh.exe, so that we now have a/bin/sh.exe in the repository. This
difference did not matter in the tests in the past because we were only
interested in the equality of contents generated in various ways. But
recently added tests check file names, in particular, the presence of
"a/bin/sh". This test fails on Windows, as we do not have a file by this
name, but "a/bin/sh.exe".

Use test-genrandom to generate the large binary file in the repository
under the expected name.

We could change the guilty line to 'cat /bin/sh >a/bin/sh', but it is
better for test reproducibility to ensure that the test data is the same
across platforms, which test-genrandom can guarantee.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-24 11:34:32 -08:00
Junio C Hamano
b2c45f5b96 Merge branch 'nd/archive-pathspec'
"git archive" learned to filter what gets archived with pathspec.

* nd/archive-pathspec:
  archive: support filtering paths with glob
2014-10-08 13:05:26 -07:00
Nguyễn Thái Ngọc Duy
ed22b4173b archive: support filtering paths with glob
This patch fixes two problems with using :(glob) (or even "*.c"
without ":(glob)").

The first one is we forgot to turn on the 'recursive' flag in struct
pathspec. Without that, tree_entry_interesting() will not mark
potential directories "interesting" so that it can confirm whether
those directories have anything matching the pathspec.

The marking directories interesting has a side effect that we need to
walk inside a directory to realize that there's nothing interested in
there. By that time, 'archive' code has already written the (empty)
directory down. That means lots of empty directories in the result
archive.

This problem is fixed by lazily writing directories down when we know
they are actually needed. There is a theoretical bug in this
implementation: we can't write empty trees/directories that match that
pathspec.

path_exists() is also made stricter in order to detect non-matching
pathspec because when this 'recursive' flag is on, we most likely
match some directories. The easiest way is not consider any
directories "matched".

Noticed-by: Peter Wu <peter@lekensteyn.nl>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-22 12:04:29 -07:00
René Scharfe
21711ca4b2 t5000, t5003: simplify commit
Add the whole directory of test files at once using git add instead of
calling git update-index on each of them and use git commit instead of
the plumbing commands write-tree, update-ref and commit-tree to build
the commit.  This simplifies the code considerably.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-07 14:10:13 -07:00
Junio C Hamano
e56857246a Merge branch 'ep/avoid-test-a-o'
Update tests and scripts to avoid "test ... -a ...", which is often
more error-prone than "test ... && test ...".

Squashed misconversion fix-up into git-submodule.sh updates.

* ep/avoid-test-a-o:
  git-submodule.sh: avoid "echo" path-like values
  git-submodule.sh: avoid "test <cond> -a/-o <cond>"
  t/test-lib-functions.sh: avoid "test <cond> -a/-o <cond>"
  t/t9814-git-p4-rename.sh: avoid "test <cond> -a/-o <cond>"
  t/t5538-push-shallow.sh: avoid "test <cond> -a/-o <cond>"
  t/t5403-post-checkout-hook.sh: avoid "test <cond> -a/-o <cond>"
  t/t5000-tar-tree.sh: avoid "test <cond> -a/-o <cond>"
  t/t4102-apply-rename.sh: avoid "test <cond> -a/-o <cond>"
  t/t0026-eol-config.sh: avoid "test <cond> -a/-o <cond>"
  t/t0025-crlf-auto.sh: avoid "test <cond> -a/-o <cond>"
  t/lib-httpd.sh: avoid "test <cond> -a/-o <cond>"
  git-rebase--interactive.sh: avoid "test <cond> -a/-o <cond>"
  git-mergetool.sh: avoid "test <cond> -a/-o <cond>"
  git-bisect.sh: avoid "test <cond> -a/-o <cond>"
  contrib/examples/git-resolve.sh: avoid "test <cond> -a/-o <cond>"
  contrib/examples/git-repack.sh: avoid "test <cond> -a/-o <cond>"
  contrib/examples/git-merge.sh: avoid "test <cond> -a/-o <cond>"
  contrib/examples/git-commit.sh: avoid "test <cond> -a/-o <cond>"
  contrib/examples/git-clone.sh: avoid "test <cond> -a/-o <cond>"
  check_bindir: avoid "test <cond> -a/-o <cond>"
2014-06-25 12:23:56 -07:00
Junio C Hamano
c651ccc91d Merge branch 'sk/test-cmp-bin'
* sk/test-cmp-bin:
  t5000, t5003: do not use test_cmp to compare binary files
2014-06-16 12:18:51 -07:00
Elia Pinto
d0b30a3d4d t/t5000-tar-tree.sh: avoid "test <cond> -a/-o <cond>"
The construct is error-prone; "test" being built-in in most modern
shells, the reason to avoid "test <cond> && test <cond>" spawning
one extra process by using a single "test <cond> -a <cond>" no
longer exists.

Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-09 15:53:41 -07:00
Stepan Kasal
b93e6e3663 t5000, t5003: do not use test_cmp to compare binary files
test_cmp() is primarily meant to compare text files (and display the
difference for debug purposes).

Raw "cmp" is better suited to compare binary files (tar, zip, etc.).

On MinGW, test_cmp is a shell function mingw_test_cmp that tries to
read both files into environment, stripping CR characters (introduced
in commit 4d715ac0).

This function usually speeds things up, as fork is extremly slow on
Windows.  But no wonder that this function is extremely slow and
sometimes even crashes when comparing large tar or zip files.

Signed-off-by: Stepan Kasal <kasal@ucw.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-04 11:14:25 -07:00
Elia Pinto
f5efd5196c t5000-tar-tree.sh: use the $( ... ) construct for command substitution
The Git CodingGuidelines prefer the $(...) construct for command
substitution instead of using the backquotes `...`.

The backquoted form is the traditional method for command
substitution, and is supported by POSIX.  However, all but the
simplest uses become complicated quickly.  In particular, embedded
command substitutions and/or the use of double quotes require
careful escaping with the backslash character.

The patch was generated by:

for _f in $(find . -name "*.sh")
do
   sed -i 's@`\(.*\)`@$(\1)@g' ${_f}
done

and then carefully proof-read.

Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-30 11:08:10 -07:00
Scott J. Goldman
7671b63211 add uploadarchive.allowUnreachable option
In commit ee27ca4, we started restricting remote git-archive
invocations to only accessing reachable commits. This
matches what upload-pack allows, but does restrict some
useful cases (e.g., HEAD:foo). We loosened this in 0f544ee,
which allows `foo:bar` as long as `foo` is a ref tip.
However, that still doesn't allow many useful things, like:

  1. Commits accessible from a ref, like `foo^:bar`, which
     are reachable

  2. Arbitrary sha1s, even if they are reachable.

We can do a full object-reachability check for these cases,
but it can be quite expensive if the client has sent us the
sha1 of a tree; we have to visit every sub-tree of every
commit in the worst case.

Let's instead give site admins an escape hatch, in case they
prefer the more liberal behavior.  For many sites, the full
object database is public anyway (e.g., if you allow dumb
walker access), or the site admin may simply decide the
security/convenience tradeoff is not worth it.

This patch adds a new config option to disable the
restrictions added in ee27ca4. It defaults to off, meaning
there is no change in behavior by default.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-28 09:55:37 -08:00
Junio C Hamano
aa13132d90 Merge branch 'jk/t5000-gzip-simplify'
Test fix.

* jk/t5000-gzip-simplify:
  t5000: simplify gzip prerequisite checks
2013-12-17 11:46:30 -08:00
Jeff King
96174145fc t5000: simplify gzip prerequisite checks
In t5000, we test the built-in ".tar.gz" config for
git-archive. To make our tests portable, we check that we
have a way to both gzip and gunzip, and we respected
environment variables to point to alternate commands for
doing these operations.

However, the $GZIP variable did not actually do anything, as
changing it would not affect the baked-in value in
archive-tar.c. Moreover, setting the variable $GZIP
influences gzip itself. From the gzip man page:

  The environment variable GZIP can hold a set of default
  options for gzip. These options are interpreted first and
  can be overwritten by explicit command line parameters.

We could rename this variable, and use it to set up custom
config (or even have a Makefile knob to affect the built
binary), but it is not worth the trouble; nobody has ever
reported a problem with the baked-in default, and they can
always change it via config if they need to. Let's just drop
the variable and use "gzip" in the test (keeping the
prerequisite, of course).

While we're at it, we can drop the GUNZIP variable and
prerequisite; it uses "gzip -d", so if we have GZIP, we
will have both.

We can also use test_lazy_prereq for the gzip prerequisite,
which is simpler and behaves more consistently with the rest
of git (e.g., by making output available when the test is
run with "-v").

Noticed-by: Christian Hesse <mail@eworm.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-04 16:11:53 -08:00
John Keeping
925ceccf05 tar-tree: remove deprecated command
"git tar-tree" has been a thin wrapper around "git archive" since commit
fd88d9c (Remove upload-tar and make git-tar-tree a thin wrapper to
git-archive, 2006-09-24), which also made it print a message indicating
that git-tar-tree is deprecated.

Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-11-12 14:10:19 -08:00
René Scharfe
9bf1ac41d2 t5000: test long filenames
Add a file with a long name to the test archive in order to check
entries with pax extended headers.  Also add a check for tar versions
that doen't understand this format.  Those versions should extract the
headers as a regular files.  Add code to check_tar() to interpret the
path header if present, so that our tests work even with those tar
versions.

It's important to use the fallback code only if needed to still be
able to detect git archive errorously creating pax headers as regular
file entries (with a suitable tar version, of course).

The archive used to check for pax header support in tar was generated
using GNU tar 1.26 and its option --format=pax.

Tested successfully on NetBSD 6.1, which has a tar version lacking pax
header support.

Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-20 15:31:46 -07:00
René Scharfe
0a00ee5844 t5000: simplify tar-tree tests
Just compare the archives created by git tar-tree with the ones created
using git archive with the equivalent options, whose contents are
checked already, instead of extracting them again.

Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-20 15:31:45 -07:00
René Scharfe
03d9bc564b t5000: use check_tar for prefix test
Perform the full range of checks against all archived files instead of
looking only at the file type of a few of them.  Also add a test of a
git archive with a prefix ending in with a slash, i.e. adding a full
directory level.

Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-20 15:31:45 -07:00
René Scharfe
deb9c8ed85 t5000: factor out check_tar
Create a helper function that extracts a tar archive and checks its
contents, modelled after check_zip in t5003.

Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-20 15:31:45 -07:00
René Scharfe
1355241bf5 t5000, t5003: create directories for extracted files lazily
Create the directories b and c just before they are needed instead of
up front.  For t5003 it turns out we don't need them at all.  For t5000
it makes the coming modifications easier.

Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-20 15:31:45 -07:00
René Scharfe
c420df7b9b t5000: integrate export-subst tests into regular tests
Instead of creating extra archives for testing substitutions, set the
attribute export-subst and overwrite the marked file with the expected
(expanded) content right between committing and archiving.  Thus
placeholder expansion based on the committed content is performed with
each archive creation and the comparison with the contents of directory
a yields the correct result.  We can then remove the special tests for
export-subst.

Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-20 15:31:10 -07:00
Jeff King
785a042981 archive-tar: use parse_config_key when parsing config
This is fewer lines of code, but more importantly, fixes a
bogus pointer offset. We are looking for "tar." in the
section, but later assume that the dot we found is at offset
9, not 3. This is a holdover from an earlier iteration of
767cf45 which called the section "tarfilter".

As a result, we could erroneously reject some filters with
dots in their name, as well as read uninitialized memory.

Reported by (and test by) René Scharfe.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-23 08:41:50 -08:00
René Scharfe
e9882c80cd t5000, t5003: move ZIP tests into their own script
This makes ZIP specific tweaks easier.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-07 08:47:55 -08:00
René Scharfe
25d3d32363 t0024, t5000: use test_lazy_prereq for UNZIP
This change makes the code smaller and we can put it at the top of
the script, its rightful place as setup code.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-07 08:47:51 -08:00
René Scharfe
ac00128298 t0024, t5000: clear variable UNZIP, use GIT_UNZIP instead
InfoZIP's unzip takes default parameters from the environment variable
UNZIP.  Unset it in the test library and use GIT_UNZIP for specifying
alternate versions of the unzip command instead.

t0024 wasn't even using variable for the actual extraction.  t5000
was, but when setting it to InfoZIP's unzip it would try to extract
from itself (because it treats the contents of $UNZIP as parameters),
which failed of course.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-06 23:37:40 -08:00
René Scharfe
2dd42334de t5000: rationalize unzip tests
Factor out a function for checking the contents of ZIP archives.  It
extracts their contents and compares them to the original files.  This
removes some duplicate code.  Tests that just create archives can lose
their UNZIP prerequisite.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03 10:22:57 -07:00
René Scharfe
c743c21591 archive-zip: streaming for deflated files
After an entry has been streamed out, its CRC and sizes are written as
part of a data descriptor.

For simplicity, we make the buffer for the compressed chunks twice as
big as for the uncompressed ones, to be sure the result fit in even
if deflate makes them bigger.

t5000 verifies output. t1050 makes sure the command always respects
core.bigfilethreshold

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03 10:22:57 -07:00
René Scharfe
2158f883d9 archive-zip: streaming for stored files
Write a data descriptor containing the CRC of the entry and its sizes
after streaming it out.  For simplicity, do that only if we're storing
files (option -0) for now.

t5000 verifies output. t1050 makes sure the command always respects
core.bigfilethreshold

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03 10:22:57 -07:00
Nguyễn Thái Ngọc Duy
5544049def archive-tar: stream large blobs to tar file
t5000 verifies output while t1050 makes sure the command always
respects core.bigfilethreshold

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03 10:22:56 -07:00
Junio C Hamano
0bbaa5c076 Merge branch 'jk/upload-archive-use-start-command'
* jk/upload-archive-use-start-command:
  upload-archive: use start_command instead of fork
2011-12-16 22:33:30 -08:00
Junio C Hamano
7b51c33b37 Merge branch 'jk/maint-1.6.2-upload-archive' into jk/maint-upload-archive
* jk/maint-1.6.2-upload-archive:
  archive: don't let remote clients get unreachable commits

Conflicts:
	archive.c
	archive.h
	builtin-archive.c
	builtin/upload-archive.c
	t/t5000-tar-tree.sh
2011-11-21 15:04:11 -08:00
Jeff King
ee27ca4a78 archive: don't let remote clients get unreachable commits
Usually git is careful not to allow clients to fetch
arbitrary objects from the database; for example, objects
received via upload-pack must be reachable from a ref.
Upload-archive breaks this by feeding the client's tree-ish
directly to get_sha1, which will accept arbitrary hex sha1s,
reflogs, etc.

This is not a problem if all of your objects are publicly
reachable anyway (or at least public to anybody who can run
upload-archive). Or if you are making the repo available by
dumb protocols like http or rsync (in which case the client
can read your whole object db directly).

But for sites which allow access only through smart
protocols, clients may be able to fetch trees from commits
that exist in the server's object database but are not
referenced (e.g., because history was rewound).

This patch tightens upload-archive's lookup to use dwim_ref
rather than get_sha1. This means a remote client can only
fetch the tip of a named ref, not an arbitrary sha1 or
reflog entry.

This also restricts some legitimate requests, too:

  1. Reachable non-tip commits, like:

        git archive --remote=$url v1.0~5

  2. Sub-trees of reachable commits, like:

        git archive --remote=$url v1.7.7:Documentation

Local requests continue to use get_sha1, and are not
restricted at all.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-11-21 14:42:25 -08:00
Jeff King
1bc01efed1 upload-archive: use start_command instead of fork
The POSIX-function fork is not supported on Windows. Use our
start_command API instead, respawning ourselves in a special
"writer" mode to follow the alternate code path.

Remove the NOT_MINGW-prereq for t5000, as git-archive --remote
now works.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-11-21 14:32:40 -08:00
Junio C Hamano
f0c7fd49c0 Revert "upload-archive: use start_command instead of fork"
This reverts commit c09cd77ea2, expecting a
better version to be rerolled soon.
2011-11-15 15:39:33 -08:00
Erik Faye-Lund
c09cd77ea2 upload-archive: use start_command instead of fork
The POSIX-function fork is not supported on Windows. Use our
start_command API instead.

As this is the last call-site that depends on the fork-stub in
compat/mingw.h, remove that as well.

Add an undocumented flag to git-archive that tells it that the
action originated from a remote, so features can be disabled.
Thanks to Jeff King for work on this part.

Remove the NOT_MINGW-prereq for t5000, as git-archive --remote
now works.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-30 18:45:21 -07:00
Johannes Sixt
1b57e56c61 Skip archive --remote tests on Windows
These depend on a working git-upload-archive, which is broken on Windows,
because it depends on fork().

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-03 10:16:20 -07:00