Commit Graph

711 Commits

Author SHA1 Message Date
Jeff King
45014beac0 Merge branch 'dk/gc-idx-wo-pack'
Having a leftover .idx file without corresponding .pack file in
the repository hurts performance; "git gc" learned to prune them.

* dk/gc-idx-wo-pack:
  gc: remove garbage .idx files from pack dir
  t5304: test cleaning pack garbage
  prepare_packed_git(): refactor garbage reporting in pack directory
2015-11-20 06:55:34 -05:00
Junio C Hamano
808d119263 Merge branch 'js/misc-fixes'
Various compilation fixes and squelching of warnings.

* js/misc-fixes:
  Correct fscanf formatting string for I64u values
  Silence GCC's "cast of pointer to integer of a different size" warning
  Squelch warning about an integer overflow
2015-10-30 13:07:00 -07:00
Johannes Schindelin
56a1a3ab44 Silence GCC's "cast of pointer to integer of a different size" warning
When calculating hashes from pointers, it actually makes sense to cut
off the most significant bits. In that case, said warning does not make
a whole lot of sense.

So let's just work around it by casting the pointer first to intptr_t
and then casting up/down to the final integral type.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-26 13:24:03 -07:00
Junio C Hamano
78891795df Merge branch 'jk/war-on-sprintf'
Many allocations that is manually counted (correctly) that are
followed by strcpy/sprintf have been replaced with a less error
prone constructs such as xstrfmt.

Macintosh-specific breakage was noticed and corrected in this
reroll.

* jk/war-on-sprintf: (70 commits)
  name-rev: use strip_suffix to avoid magic numbers
  use strbuf_complete to conditionally append slash
  fsck: use for_each_loose_file_in_objdir
  Makefile: drop D_INO_IN_DIRENT build knob
  fsck: drop inode-sorting code
  convert strncpy to memcpy
  notes: document length of fanout path with a constant
  color: add color_set helper for copying raw colors
  prefer memcpy to strcpy
  help: clean up kfmclient munging
  receive-pack: simplify keep_arg computation
  avoid sprintf and strcpy with flex arrays
  use alloc_ref rather than hand-allocating "struct ref"
  color: add overflow checks for parsing colors
  drop strcpy in favor of raw sha1_to_hex
  use sha1_to_hex_r() instead of strcpy
  daemon: use cld->env_array when re-spawning
  stat_tracking_info: convert to argv_array
  http-push: use an argv_array for setup_revisions
  fetch-pack: use argv_array for index-pack / unpack-objects
  ...
2015-10-20 15:24:01 -07:00
Junio C Hamano
db5adf24bf Merge branch 'js/clone-dissociate'
"git clone --dissociate" runs a big "git repack" process at the
end, and it helps to close file descriptors that are open on the
packs and their idx files before doing so on filesystems that
cannot remove a file that is still open.

* js/clone-dissociate:
  clone --dissociate: avoid locking pack files
  sha1_file.c: add a function to release all packs
  sha1_file: consolidate code to close a pack's file descriptor
  t5700: demonstrate a Windows file locking issue with `git clone --dissociate`
2015-10-15 15:43:49 -07:00
Johannes Schindelin
38849a8116 sha1_file.c: add a function to release all packs
On Windows, files that are in use cannot be removed or renamed. That
means that we have to release pack files when we are about to, say,
repack them. Let's introduce a convenient function to close all the
pack files and their idx files.

While at it, we consolidate the close windows/close fd/close index
stanza in `free_pack_by_name()` into the `close_pack()` function that
is used by the new `close_all_packs()` function to avoid repeated code.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-07 10:47:10 -07:00
Johannes Schindelin
71fe5d7fb0 sha1_file: consolidate code to close a pack's file descriptor
There was a lot of repeated code to close the file descriptor of
a given pack. Let's just refactor this code into a single function.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05 14:43:58 -07:00
Jeff King
c7ab0ba340 avoid sprintf and strcpy with flex arrays
When we are allocating a struct with a FLEX_ARRAY member, we
generally compute the size of the array and then sprintf or
strcpy into it. Normally we could improve a dynamic allocation
like this by using xstrfmt, but it doesn't work here; we
have to account for the size of the rest of the struct.

But we can improve things a bit by storing the length that
we use for the allocation, and then feeding it to xsnprintf
or memcpy, which makes it more obvious that we are not
writing more than the allocated number of bytes.

It would be nice if we had some kind of helper for
allocating generic flex arrays, but it doesn't work that
well:

 - the call signature is a little bit unwieldy:

      d = flex_struct(sizeof(*d), offsetof(d, path), fmt, ...);

   You need offsetof here instead of just writing to the
   end of the base size, because we don't know how the
   struct is packed (partially this is because FLEX_ARRAY
   might not be zero, though we can account for that; but
   the size of the struct may actually be rounded up for
   alignment, and we can't know that).

 - some sites do clever things, like over-allocating because
   they know they will write larger things into the buffer
   later (e.g., struct packed_git here).

So we're better off to just write out each allocation (or
add type-specific helpers, though many of these are one-off
allocations anyway).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05 11:08:05 -07:00
Jeff King
d4b3d11a03 write_loose_object: convert to strbuf
When creating a loose object tempfile, we use a fixed
PATH_MAX-sized buffer, and strcpy directly into it. This
isn't buggy, because we do a rough check of the size, but
there's no verification that our guesstimate of the required
space is enough (in fact, it's several bytes too big for the
current naming scheme).

Let's switch to a strbuf, which makes this much easier to
verify. The allocation overhead should be negligible, since
we are replacing a static buffer with a static strbuf, and
we'll only need to allocate on the first call.

While we're here, we can also document a subtle interaction
with mkstemp that would be easy to overlook.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05 11:08:05 -07:00
Jeff King
ac5190cc48 sha1_get_pack_name: use a strbuf
We do some manual memory computation here, and there's no
check that our 60 is not overflowed by the raw sprintf (it
isn't, because the "which" parameter is never longer than
"pack"). We can simplify this greatly with a strbuf.

Technically the end result is not identical, as the original
took care not to rewrite the object directory on each call
for performance reasons.  We could do that here, too (by
saving the baselen and resetting to it), but it's not worth
the complexity; this function is not called a lot (generally
once per packfile that we open).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Jeff King
9ae97018fb use strip_suffix and xstrfmt to replace suffix
When we want to convert "foo.pack" to "foo.idx", we do it by
duplicating the original string and then munging the bytes
in place. Let's use strip_suffix and xstrfmt instead, which
has several advantages:

  1. It's more clear what the intent is.

  2. It does not implicitly rely on the fact that
     strlen(".idx") <= strlen(".pack") to avoid an overflow.

  3. We communicate the assumption that the input file ends
     with ".pack" (and get a run-time check that this is so).

  4. We drop calls to strcpy, which makes auditing the code
     base easier.

Likewise, we can do this to convert ".pack" to ".bitmap",
avoiding some manual memory computation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Jeff King
48bcc1c3cc add_packed_git: convert strcpy into xsnprintf
We have the path "foo.idx", and we create a buffer big
enough to hold "foo.pack" and "foo.keep", and then strcpy
straight into it. This isn't a bug (we have enough space),
but it's very hard to tell from the strcpy that this is so.

Let's instead use strip_suffix to take off the ".idx",
record the size of our allocation, and use xsnprintf to make
sure we don't violate our assumptions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Jeff King
ef1286d3c0 use xsnprintf for generating git object headers
We generally use 32-byte buffers to format git's "type size"
header fields. These should not generally overflow unless
you can produce some truly gigantic objects (and our types
come from our internal array of constant strings). But it is
a good idea to use xsnprintf to make sure this is the case.

Note that we slightly modify the interface to
write_sha1_file_prepare, which nows uses "hdrlen" as an "in"
parameter as well as an "out" (on the way in it stores the
allocated size of the header, and on the way out it returns
the ultimate size of the header).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Junio C Hamano
f0bc854623 Sync with 2.5.2 2015-09-09 14:30:35 -07:00
Junio C Hamano
3d3caf0b78 Sync with 2.4.9 2015-09-04 10:43:23 -07:00
Junio C Hamano
ef0e938a1a Sync with 2.3.9 2015-09-04 10:34:19 -07:00
Junio C Hamano
8267cd11d6 Sync with 2.2.3 2015-09-04 10:29:28 -07:00
Jeff King
5015f01c12 read_info_alternates: handle paths larger than PATH_MAX
This function assumes that the relative_base path passed
into it is no larger than PATH_MAX, and writes into a
fixed-size buffer. However, this path may not have actually
come from the filesystem; for example, add_submodule_odb
generates a path using a strbuf and passes it in. This is
hard to trigger in practice, though, because the long
submodule directory would have to exist on disk before we
would try to open its info/alternates file.

We can easily avoid the bug, though, by simply creating the
filename on the heap.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-04 09:36:51 -07:00
Junio C Hamano
cbcd3dcaa8 Merge branch 'cb/open-noatime-clear-errno' into maint
When trying to see that an object does not exist, a state errno
leaked from our "first try to open a packfile with O_NOATIME and
then if it fails retry without it" logic on a system that refuses
O_NOATIME.  This confused us and caused us to die, saying that the
packfile is unreadable, when we should have just reported that the
object does not exist in that packfile to the caller.

* cb/open-noatime-clear-errno:
  git_open_noatime: return with errno=0 on success
2015-09-03 19:17:49 -07:00
Junio C Hamano
9b8d731995 Merge branch 'cb/open-noatime-clear-errno'
When trying to see that an object does not exist, a state errno
leaked from our "first try to open a packfile with O_NOATIME and
then if it fails retry without it" logic on a system that refuses
O_NOATIME.  This confused us and caused us to die, saying that the
packfile is unreadable, when we should have just reported that the
object does not exist in that packfile to the caller.

* cb/open-noatime-clear-errno:
  git_open_noatime: return with errno=0 on success
2015-08-25 14:57:10 -07:00
Junio C Hamano
8c9155e031 Merge branch 'jk/git-path'
git_path() and mkpath() are handy helper functions but it is easy
to misuse, as the callers need to be careful to keep the number of
active results below 4.  Their uses have been reduced.

* jk/git-path:
  memoize common git-path "constant" files
  get_repo_path: refactor path-allocation
  find_hook: keep our own static buffer
  refs.c: remove_empty_directories can take a strbuf
  refs.c: avoid git_path assignment in lock_ref_sha1_basic
  refs.c: avoid repeated git_path calls in rename_tmp_log
  refs.c: simplify strbufs in reflog setup and writing
  path.c: drop git_path_submodule
  refs.c: remove extra git_path calls from read_loose_refs
  remote.c: drop extraneous local variable from migrate_file
  prefer mkpathdup to mkpath in assignments
  prefer git_pathdup to git_path in some possibly-dangerous cases
  add_to_alternates_file: don't add duplicate entries
  t5700: modernize style
  cache.h: complete set of git_path_submodule helpers
  cache.h: clarify documentation for git_path, et al
2015-08-19 14:48:56 -07:00
Junio C Hamano
51a22ce147 Merge branch 'jc/finalize-temp-file'
Long overdue micro clean-up.

* jc/finalize-temp-file:
  sha1_file.c: rename move_temp_to_file() to finalize_object_file()
2015-08-19 14:48:55 -07:00
Junio C Hamano
0a489b0680 prepare_packed_git(): refactor garbage reporting in pack directory
The hook to report "garbage" files in $GIT_OBJECT_DIRECTORY/pack/
could be generic but is too specific to count-object's needs.

Move the part to produce human-readable messages to count-objects,
and refine the interface to callback with the "bits" with values
defined in the cache.h header file, so that other callers (e.g.
prune) can later use the same mechanism to enumerate different
kinds of garbage files and do something intelligent about them,
other than reporting in textual messages.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-17 09:14:59 -07:00
Clemens Buchacher
dff6f280df git_open_noatime: return with errno=0 on success
In read_sha1_file_extended we die if read_object fails with a fatal
error. We detect a fatal error if errno is non-zero and is not
ENOENT. If the object could not be read because it does not exist,
this is not considered a fatal error and we want to return NULL.

Somewhere down the line, read_object calls git_open_noatime to open
a pack index file, for example. We first try open with O_NOATIME.
If O_NOATIME fails with EPERM, we retry without O_NOATIME. When the
second open succeeds, errno is however still set to EPERM from the
first attempt. When we finally determine that the object does not
exist, read_object returns NULL and read_sha1_file_extended dies
with a fatal error:

    fatal: failed to read object <sha1>: Operation not permitted

Fix this by resetting errno to zero before we call open again.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Clemens Buchacher <clemens.buchacher@intel.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-12 13:56:19 -07:00
Jeff King
77b9b1d13a add_to_alternates_file: don't add duplicate entries
The add_to_alternates_file function blindly uses
hold_lock_file_for_append to copy the existing contents, and
then adds the new line to it. This has two minor problems:

  1. We might add duplicate entries, which are ugly and
     inefficient.

  2. We do not check that the file ends with a newline, in
     which case we would bogusly append to the final line.
     This is quite unlikely in practice, though, as we call
     this function only from git-clone, so presumably we are
     the only writers of the file (and we always add a
     newline).

Instead of using hold_lock_file_for_append, let's copy the
file line by line, which ensures all records are properly
terminated. If we see an extra line, we can simply abort the
update (there is no point in even copying the rest, as we
know that it would be identical to the original).

As a bonus, we also get rid of some calls to the
static-buffer mkpath and git_path functions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 15:15:42 -07:00
Junio C Hamano
cb5add5868 sha1_file.c: rename move_temp_to_file() to finalize_object_file()
Since 5a688fe4 ("core.sharedrepository = 0mode" should set, not
loosen, 2009-03-25), we kept reminding ourselves:

    NEEDSWORK: this should be renamed to finalize_temp_file() as
    "moving" is only a part of what it does, when no patch between
    master to pu changes the call sites of this function.

without doing anything about it.  Let's do so.

The purpose of this function was not to move but to finalize.  The
detail of the primarily implementation of finalizing was to link the
temporary file to its final name and then to unlink, which wasn't
even "moving".  The alternative implementation did "move" by calling
rename(2), which is a fun tangent.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 11:10:37 -07:00
Junio C Hamano
7c696007ca Merge branch 'jk/fix-refresh-utime' into maint
Fix a small bug in our use of umask() return value.

* jk/fix-refresh-utime:
  check_and_freshen_file: fix reversed success-check
2015-07-27 12:21:40 -07:00
Junio C Hamano
de62fe8c42 Merge branch 'jk/index-pack-reduce-recheck' into maint
Disable "have we lost a race with competing repack?" check while
receiving a huge object transfer that runs index-pack.

* jk/index-pack-reduce-recheck:
  index-pack: avoid excessive re-reading of pack directory
2015-07-27 12:21:38 -07:00
Junio C Hamano
1f9e0a5348 Merge branch 'jk/fix-refresh-utime'
Fix a small bug in our use of umask() return value.

* jk/fix-refresh-utime:
  check_and_freshen_file: fix reversed success-check
2015-07-10 14:26:15 -07:00
Junio C Hamano
c07173f215 Merge branch 'jk/maint-for-each-packed-object'
The for_each_packed_object() API function did not iterate over
objects in a packfile that hasn't been used yet.

* jk/maint-for-each-packed-object:
  for_each_packed_object: automatically open pack index
2015-07-09 14:31:43 -07:00
Jeff King
3096b2ecdb check_and_freshen_file: fix reversed success-check
When we want to write out a loose object file, we have
always first made sure we don't already have the object
somewhere. Since 33d4221 (write_sha1_file: freshen existing
objects, 2014-10-15), we also update the timestamp on the
file, so that a simultaneous prune knows somebody is
likely to reference it soon.

If our utime() call fails, we treat this the same as not
having the object in the first place; the safe thing to do
is write out another copy. However, the loose-object check
accidentally inverts the utime() check; it returns failure
_only_ when the utime() call actually succeeded. Thus it was
failing to protect us there, and in the normal case where
utime() succeeds, it caused us to pointlessly write out and
link the object.

This passed our freshening tests, because writing out the
new object is certainly _one_ way of updating its utime. So
the normal case was inefficient, but not wrong.

While we're here, let's also drop a comment in front of the
check_and_freshen functions, making a note of their return
type (since it is not our usual "0 for success, -1 for
error").

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-08 15:58:28 -07:00
Junio C Hamano
c5baf18a40 Merge branch 'jk/diagnose-config-mmap-failure' into maint
The configuration reader/writer uses mmap(2) interface to access
the files; when we find a directory, it barfed with "Out of memory?".

* jk/diagnose-config-mmap-failure:
  xmmap(): drop "Out of memory?"
  config.c: rewrite ENODEV into EISDIR when mmap fails
  config.c: avoid xmmap error messages
  config.c: fix mmap leak when writing config
  read-cache.c: drop PROT_WRITE from mmap of index
2015-06-25 11:02:11 -07:00
Junio C Hamano
712b351bd3 Merge branch 'jk/index-pack-reduce-recheck'
Disable "have we lost a race with competing repack?" check while
receiving a huge object transfer that runs index-pack.

* jk/index-pack-reduce-recheck:
  index-pack: avoid excessive re-reading of pack directory
2015-06-24 12:21:54 -07:00
Jeff King
f813e9ea5f for_each_packed_object: automatically open pack index
When for_each_packed_object is called, we call
prepare_packed_git() to make sure we have the actual list of
packs. But the latter does not actually open the pack
indices, meaning that pack->nr_objects may simply be 0 if
the pack has not otherwise been used since the program
started.

In practice, this didn't come up for the current callers,
because they iterate the packed objects only after iterating
all reachable objects (so for it to matter you would have to
have a pack consisting only of unreachable objects). But it
is a dangerous and confusing interface that should be fixed
for future callers.

Note that we do not end the iteration when a pack cannot be
opened, but we do return an error. That lets you complete
the iteration even in actively-repacked repository where an
.idx file may racily go away, but it also lets callers know
that they may not have gotten the complete list (which the
current reachability-check caller does care about).

We have to tweak one of the prune tests due to the changed
return value; an earlier test creates bogus .idx files and
does not clean them up. Having to make this tweak is a good
thing; it means we will not prune in a broken repository,
and the test confirms that we do not negatively impact a
more lenient caller, count-objects.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22 14:53:58 -07:00
Junio C Hamano
070d276cc1 Merge branch 'jh/filter-empty-contents' into maint
The clean/smudge interface did not work well when filtering an
empty contents (failed and then passed the empty input through).
It can be argued that a filter that produces anything but empty for
an empty input is nonsense, but if the user wants to do strange
things, then why not?

* jh/filter-empty-contents:
  sha1_file: pass empty buffer to index empty file
2015-06-16 14:33:44 -07:00
Junio C Hamano
dee47925c1 Merge branch 'jk/diagnose-config-mmap-failure'
The configuration reader/writer uses mmap(2) interface to access
the files; when we find a directory, it barfed with "Out of memory?".

* jk/diagnose-config-mmap-failure:
  xmmap(): drop "Out of memory?"
  config.c: rewrite ENODEV into EISDIR when mmap fails
  config.c: avoid xmmap error messages
  config.c: fix mmap leak when writing config
  read-cache.c: drop PROT_WRITE from mmap of index
2015-06-11 09:29:55 -07:00
Jeff King
0eeb077be7 index-pack: avoid excessive re-reading of pack directory
Since 45e8a74 (has_sha1_file: re-check pack directory before
giving up, 2013-08-30), we spend extra effort for
has_sha1_file to give the right answer when somebody else is
repacking. Usually this effort does not matter, because
after finding that the object does not exist, the next step
is usually to die().

However, some code paths make a large number of
has_sha1_file checks which are _not_ expected to return 1.
The collision test in index-pack.c is such a case. On a
local system, this can cause a performance slowdown of
around 5%. But on a system with high-latency system calls
(like NFS), it can be much worse.

This patch introduces a "quick" flag to has_sha1_file which
callers can use when they would prefer high performance at
the cost of false negatives during repacks. There may be
other code paths that can use this, but the index-pack one
is the most obviously critical, so we'll start with
switching that one.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-09 12:26:35 -07:00
Junio C Hamano
3c91e9966a Merge branch 'jk/sha1-file-reduce-useless-warnings' into maint
* jk/sha1-file-reduce-useless-warnings:
  sha1_file: squelch "packfile cannot be accessed" warnings
2015-06-05 12:00:07 -07:00
Junio C Hamano
152722f155 Merge branch 'jh/filter-empty-contents'
The clean/smudge interface did not work well when filtering an
empty contents (failed and then passed the empty input through).
It can be argued that a filter that produces anything but empty for
an empty input is nonsense, but if the user wants to do strange
things, then why not?

* jh/filter-empty-contents:
  sha1_file: pass empty buffer to index empty file
2015-06-01 12:45:11 -07:00
Junio C Hamano
9ca0aaf6de xmmap(): drop "Out of memory?"
We show that message with die_errno(), but the OS is ought to know
why mmap(2) failed much better than we do.  There is no reason for
us to say "Out of memory?" here.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-28 11:35:25 -07:00
Jeff King
1570856b51 config.c: avoid xmmap error messages
The config-writing code uses xmmap to map the existing
config file, which will die if the map fails. This has two
downsides:

  1. The error message is not very helpful, as it lacks any
     context about the file we are mapping:

       $ mkdir foo
       $ git config --file=foo some.key value
       fatal: Out of memory? mmap failed: No such device

  2. We normally do not die in this code path; instead, we'd
     rather report the error and return an appropriate exit
     status (which is part of the public interface
     documented in git-config.1).

This patch introduces a "gentle" form of xmmap which lets us
produce our own error message. We do not want to use mmap
directly, because we would like to use the other
compatibility elements of xmmap (e.g., handling 0-length
maps portably).

The end result is:

    $ git.compile config --file=foo some.key value
    error: unable to mmap 'foo': No such device
    $ echo $?
    3

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-28 11:33:18 -07:00
Junio C Hamano
1e6c8babf8 Merge branch 'jc/hash-object' into maint
"hash-object --literally" introduced in v2.2 was not prepared to
take a really long object type name.

* jc/hash-object:
  write_sha1_file(): do not use a separate sha1[] array
  t1007: add hash-object --literally tests
  hash-object --literally: fix buffer overrun with extra-long object type
  git-hash-object.txt: document --literally option
2015-05-26 13:49:25 -07:00
Junio C Hamano
3b7d373ae2 Merge branch 'kn/cat-file-literally'
Add the "--allow-unknown-type" option to "cat-file" to allow
inspecting loose objects of an experimental or a broken type.

* kn/cat-file-literally:
  t1006: add tests for git cat-file --allow-unknown-type
  cat-file: teach cat-file a '--allow-unknown-type' option
  cat-file: make the options mutually exclusive
  sha1_file: support reading from a loose object of unknown type
2015-05-19 13:17:58 -07:00
Jim Hill
f6a1e1e288 sha1_file: pass empty buffer to index empty file
`git add` of an empty file with a filter pops complaints from
`copy_fd` about a bad file descriptor.

This traces back to these lines in sha1_file.c:index_core:

	if (!size) {
		ret = index_mem(sha1, NULL, size, type, path, flags);

The problem here is that content to be added to the index can be
supplied from an fd, or from a memory buffer, or from a pathname. This
call is supplying a NULL buffer pointer and a zero size.

Downstream logic takes the complete absence of a buffer to mean the
data is to be found elsewhere -- for instance, these, from convert.c:

	if (params->src) {
		write_err = (write_in_full(child_process.in, params->src, params->size) < 0);
	} else {
		write_err = copy_fd(params->fd, child_process.in);
	}

~If there's a buffer, write from that, otherwise the data must be coming
from an open fd.~

Perfectly reasonable logic in a routine that's going to write from
either a buffer or an fd.

So change `index_core` to supply an empty buffer when indexing an empty
file.

There's a patch out there that instead changes the logic quoted above to
take a `-1` fd to mean "use the buffer", but it seems to me that the
distinction between a missing buffer and an empty one carries intrinsic
semantics, where the logic change is adapting the code to handle
incorrect arguments.

Signed-off-by: Jim Hill <gjthill@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-18 10:15:20 -07:00
Junio C Hamano
ebb464f0cb Merge branch 'jk/prune-mtime' into maint
Access to objects in repositories that borrow from another one on a
slow NFS server unnecessarily got more expensive due to recent code
becoming more cautious in a naive way not to lose objects to pruning.

* jk/prune-mtime:
  sha1_file: only freshen packs once per run
  sha1_file: freshen pack objects before loose
  reachable: only mark local objects as recent
2015-05-13 14:05:50 -07:00
Junio C Hamano
051086b947 Merge branch 'jc/hash-object'
"hash-object --literally" introduced in v2.2 was not prepared to
take a really long object type name.

* jc/hash-object:
  write_sha1_file(): do not use a separate sha1[] array
  t1007: add hash-object --literally tests
  hash-object --literally: fix buffer overrun with extra-long object type
  git-hash-object.txt: document --literally option
2015-05-11 14:23:59 -07:00
Junio C Hamano
cedeffeee0 Merge branch 'jk/sha1-file-reduce-useless-warnings'
* jk/sha1-file-reduce-useless-warnings:
  sha1_file: squelch "packfile cannot be accessed" warnings
2015-05-11 14:23:41 -07:00
Junio C Hamano
68a2e6a2c8 Merge branch 'nd/multiple-work-trees'
A replacement for contrib/workdir/git-new-workdir that does not
rely on symbolic links and make sharing of objects and refs safer
by making the borrowee and borrowers aware of each other.

* nd/multiple-work-trees: (41 commits)
  prune --worktrees: fix expire vs worktree existence condition
  t1501: fix test with split index
  t2026: fix broken &&-chain
  t2026 needs procondition SANITY
  git-checkout.txt: a note about multiple checkout support for submodules
  checkout: add --ignore-other-wortrees
  checkout: pass whole struct to parse_branchname_arg instead of individual flags
  git-common-dir: make "modules/" per-working-directory directory
  checkout: do not fail if target is an empty directory
  t2025: add a test to make sure grafts is working from a linked checkout
  checkout: don't require a work tree when checking out into a new one
  git_path(): keep "info/sparse-checkout" per work-tree
  count-objects: report unused files in $GIT_DIR/worktrees/...
  gc: support prune --worktrees
  gc: factor out gc.pruneexpire parsing code
  gc: style change -- no SP before closing parenthesis
  checkout: clean up half-prepared directories in --to mode
  checkout: reject if the branch is already checked out elsewhere
  prune: strategies for linked checkouts
  checkout: support checking out into a new working directory
  ...
2015-05-11 14:23:39 -07:00
Karthik Nayak
46f034483e sha1_file: support reading from a loose object of unknown type
Update sha1_loose_object_info() to optionally allow it to read
from a loose object file of unknown/bogus type; as the function
usually returns the type of the object it read in the form of enum
for known types, add an optional "typename" field to receive the
name of the type in textual form and a flag to indicate the reading
of a loose object file of unknown/bogus type.

Add parse_sha1_header_extended() which acts as a wrapper around
parse_sha1_header() allowing more information to be obtained.

Add unpack_sha1_header_to_strbuf() to unpack sha1 headers of
unknown/corrupt objects which have a unknown sha1 header size to
a strbuf structure. This was written by Junio C Hamano but tested
by me.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Hepled-by: Jeff King <peff@peff.net>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-06 13:35:48 -07:00
Junio C Hamano
e3b199aef1 Merge branch 'jk/prune-mtime'
Access to objects in repositories that borrow from another one on a
slow NFS server unnecessarily got more expensive due to recent code
becoming more cautious in a naive way not to lose objects to pruning.

* jk/prune-mtime:
  sha1_file: only freshen packs once per run
  sha1_file: freshen pack objects before loose
  reachable: only mark local objects as recent
2015-05-05 21:00:37 -07:00