Commit Graph

43954 Commits

Author SHA1 Message Date
Junio C Hamano
7725bebe21 Merge branch 'sb/submodule-parallel-fetch'
Fix recently introduced codepaths that are involved in parallel
submodule operations, which gave up on reading too early, and
could have wasted CPU while attempting to write under a corner
case condition.

* sb/submodule-parallel-fetch:
  hoist out handle_nonblock function for xread and xwrite
  xwrite: poll on non-blocking FDs
  xread: retry after poll on EAGAIN/EWOULDBLOCK
2016-07-19 13:22:15 -07:00
Junio C Hamano
e0e56cbf7f Merge branch 'lf/recv-sideband-cleanup'
Code simplification.

* lf/recv-sideband-cleanup:
  sideband.c: small optimization of strbuf usage
  sideband.c: refactor recv_sideband()
2016-07-19 13:22:14 -07:00
Junio C Hamano
7418a6b1a0 Merge branch 'dk/blame-move-no-reason-for-1-line-context'
"git blame -M" missed a single line that was moved within the file.

* dk/blame-move-no-reason-for-1-line-context:
  blame: require 0 context lines while finding moved lines with -M
2016-07-19 13:22:13 -07:00
Junio C Hamano
dc21164e66 Merge branch 'nd/connect-ssh-command-config'
A new configuration variable core.sshCommand has been added to
specify what value for GIT_SSH_COMMAND to use per repository.

* nd/connect-ssh-command-config:
  connect: read $GIT_SSH_COMMAND from config file
2016-07-19 13:22:12 -07:00
René Scharfe
508a285cea submodule-config: use explicit empty string instead of strbuf in config_from()
Use a string constant instead of an empty strbuf to shorten the code
and make it easier to read.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-19 12:16:57 -07:00
René Scharfe
8109984d61 use strbuf_addbuf() for appending a strbuf to another
Use strbuf_addbuf() where possible; it's shorter and more efficient.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-19 11:48:35 -07:00
John Keeping
9ec26e7977 difftool: fix argument handling in subdirs
When in a subdirectory of a repository, path arguments should be
interpreted relative to the current directory not the root of the
working tree.

The Git::repository object passed into setup_dir_diff() is configured to
handle this correctly but we create a new Git::repository here without
setting the WorkingSubdir argument.  By simply using the existing
repository, path arguments are handled relative to the current
directory.

Reported-by: Bernhard Kirchen <bernhard.kirchen@rwth-aachen.de>
Signed-off-by: John Keeping <john@keeping.me.uk>
Acked-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-19 11:12:27 -07:00
Eric Wong
cec9264f17 git-svn: document svn.authorsProg in config
This has always been supported since we read config variables
based on the command-line option parser.  Document it explicitly
since users usually want to maintain the same program across
invocations.

Signed-off-by: Eric Wong <e@80x24.org>
2016-07-19 10:07:19 +00:00
Johannes Schindelin
90cf590f53 fsck: optionally show more helpful info for broken links
When reporting broken links between commits/trees/blobs, it would be
quite helpful at times if the user would be told how the object is
supposed to be reachable.

With the new --name-objects option, git-fsck will try to do exactly
that: name the objects in a way that shows how they are reachable.

For example, when some reflog got corrupted and a blob is missing that
should not be, the user might want to remove the corresponding reflog
entry. This option helps them find that entry: `git fsck` will now
report something like this:

	broken link from    tree b5eb6ff...  (refs/stash@{<date>}~37:)
	              to    blob ec5cf80...

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 15:15:59 -07:00
Mike Hommey
c66b470082 t/t8003-blame-corner-cases.sh: Use here documents
Somehow, this test was using:

{
	echo A
	echo B
} > file

block to feed file contents. This changes those to the form most common
in git test scripts:

cat >file <<-\EOF
A
B
EOF

Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 14:33:38 -07:00
Mike Hommey
3b75ee9327 blame: allow to blame paths freshly added to the index
When blaming files, changes in the work tree are taken into account
and displayed as being "Not Committed Yet".

However, when blaming a file that is not known to the current HEAD,
git blame fails with `no such path 'foo' in HEAD`, even when the file
was git add'ed.

Allowing such a blame is useful when the new file added to the index
(not yet committed) was created by renaming an existing file.  It
also is useful when the new file was created from pieces already in
HEAD, moved or copied from other files and blaming with copy
detection (i.e. "-C").

Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 14:33:38 -07:00
Nguyễn Thái Ngọc Duy
6d6a782fbf cache-tree: do not generate empty trees as a result of all i-t-a subentries
If a subdirectory contains nothing but i-t-a entries, we generate an
empty tree object and add it to its parent tree. Which is wrong. Such
a subdirectory should not be added.

Note that this has a cascading effect. If subdir 'a/b/c' contains
nothing but i-t-a entries, we ignore it. But then if 'a/b' contains
only (the non-existing) 'a/b/c', then we should ignore 'a/b' while
building 'a' too. And it goes all the way up to top directory.

Noticed-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 13:45:33 -07:00
Nguyễn Thái Ngọc Duy
c041d54a74 cache-tree.c: fix i-t-a entry skipping directory updates sometimes
Commit 3cf773e (cache-tree: fix writing cache-tree when CE_REMOVE is
present - 2012-12-16) skips i-t-a entries when building trees objects
from the index. Unfortunately it may skip too much.

The code in question checks if an entry is an i-t-a one, then no tree
entry will be written. But it does not take into account that
directories can also be written with the same code. Suppose we have
this in the index.

    a-file
    subdir/file1
    subdir/file2
    subdir/file3
    the-last-file

We write an entry for a-file as normal and move on to subdir/file1,
where we realize the entry name for this level is simply just
"subdir", write down an entry for "subdir" then jump three items ahead
to the-last-file.

That is what happens normally when the first file in subdir is not an
i-t-a entry. If subdir/file1 is an i-t-a, because of the broken
condition in this code, we still think "subdir" is an i-t-a file and
not writing "subdir" down and jump to the-last-file. The result tree
now only has two items: a-file and the-last-file. subdir should be
there too (even though it only records two sub-entries, file2 and
file3).

If the i-t-a entry is subdir/file2 or subdir/file3, this is not a
problem because we jump over them anyway. Which may explain why the
bug is hidden for nearly four years.

Fix it by making sure we only skip i-t-a entries when the entry in
question is actual an index entry, not a directory.

Reported-by: Yuri Kanivetsky <yuri.kanivetsky@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 13:45:33 -07:00
Nguyễn Thái Ngọc Duy
378932d3c3 test-lib.sh: introduce and use $EMPTY_BLOB
Similar to $EMPTY_TREE this makes it easier to recognize this special
SHA-1 and change hash later.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 13:45:32 -07:00
Nguyễn Thái Ngọc Duy
f9e7d9f8c3 test-lib.sh: introduce and use $EMPTY_TREE
This is a special SHA1. Let's keep it at one place, easier to replace
later when the hash change comes, easier to recognize.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 13:45:32 -07:00
Johannes Schindelin
1cd772cc41 fsck: give the error function a chance to see the fsck_options
We will need this in the next commit, where fsck will be taught to
optionally name the objects when reporting issues about them.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 11:35:00 -07:00
Johannes Schindelin
7b35efd734 fsck_walk(): optionally name objects on the go
If fsck_options->name_objects is initialized, and if it already has
name(s) for the object(s) that are to be the starting point(s) for
fsck_walk(), then that function will now add names for the objects
that were walked.

This will be highly useful for teaching git-fsck to identify root causes
for broken links, which is the task for the next patch in this series.

Note that this patch opts for decorating the objects with plain strings
instead of full-blown structs (à la `struct rev_name` in the code of
the `git name-rev` command), for several reasons:

- the code is much simpler than if it had to work with structs that
  describe arbitrarily long names such as "master~14^2~5:builtin/am.c",

- the string processing is actually quite light-weight compared to the
  rest of fsck's operation,

- the caller of fsck_walk() is expected to provide names for the
  starting points, and using plain and simple strings is just the
  easiest way to do that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 11:35:00 -07:00
Johannes Schindelin
993a21b0a0 fsck: refactor how to describe objects
In many places, we refer to objects via their SHA-1s. Let's abstract
that into a function.

For the moment, it does nothing else than what we did previously: print
out the 40-digit hex string. But that will change over the course of the
next patches.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 11:35:00 -07:00
Eric Wong
a9b02de8b7 configure.ac: stronger test for pthread linkage
We need to test linkage of pthread_create and pthread_join,
as pthread_mutex_* and pthread_key_* functions do not need
extra linkage under FreeBSD 10.3, leading to a false-positive
of the empty case.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 11:22:35 -07:00
Eric Wong
49c58d86ce daemon: ignore ENOTSOCK from setsockopt
In inetd mode, we are not guaranteed stdin or stdout is a
socket; callers could filter the data through a pipe
or be testing with regular files.

This prevents t5802 from polluting syslog.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 11:09:52 -07:00
Eric Wong
ecba19531a list: avoid incompatibility with *BSD sys/queue.h
The OS X build pulls in sys/queue.h, which pollutes the preprocessor
namespace with a macro generically named LIST_HEAD, and clashes with
the name we use here.

ref: http://mid.gmane.org/FB76544F-16F7-45CA-9649-FD62EE44B0DE@gmail.com

Reported-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 11:06:51 -07:00
Junio C Hamano
29493589e9 archive-tar: huge offset and future timestamps would not work on 32-bit
As we are not yet moving everything to size_t but still using ulong
internally when talking about the size of object, platforms with
32-bit long will not be able to produce tar archive with 4GB+ file,
and cannot grok 077777777777UL as a constant.  Disable the extended
header feature and do not test it on them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-15 10:51:55 -07:00
Junio C Hamano
82246e075e Sync with 2.9.2
* maint:
  Git 2.9.2
  t0006: skip "far in the future" test when unsigned long is not long enough
2016-07-15 10:49:23 -07:00
Junio C Hamano
e634160bf4 Git 2.9.2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-15 10:48:16 -07:00
Junio C Hamano
33eacd3ff4 Merge branch 'jk/tzoffset-fix' into maint
Skip tests that are unrunnable on platforms without 64-bit long
to avoid unnecessary test failures.

* jk/tzoffset-fix:
  t0006: skip "far in the future" test when unsigned long is not long enough
2016-07-15 09:43:42 -07:00
Jeff King
6b9c38e14c t0006: skip "far in the future" test when unsigned long is not long enough
Git's source code refers to timestamps as unsigned longs.  On 32-bit
platforms, as well as on Windows, unsigned long is not large enough
to capture dates that are "absurdly far in the future".

While we can fix this issue properly by replacing unsigned long with
a larger type, we want to be a bit more conservative and just skip
those tests on the maint track.

Signed-off-by: Jeff King <peff@peff.net>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-15 09:05:53 -07:00
Stefan Beller
3ac870300a add a test for push options
The functions `mk_repo_pair` as well as `test_refs` are borrowed from
t5543-atomic-push, with additional hooks installed.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-14 15:50:41 -07:00
Stefan Beller
f6a4e61fbb push: accept push options
This implements everything that is required on the client side to make use
of push options from the porcelain push command.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-14 15:50:41 -07:00
Stefan Beller
c714e45f87 receive-pack: implement advertising and receiving push options
The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.

Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-14 15:50:40 -07:00
Stefan Beller
77a9745d19 push options: {pre,post}-receive hook learns about push options
The environment variable GIT_PUSH_OPTION_COUNT is set to the number of
push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted
option.

The code is not executed as the push options are set to NULL, nor is the
new capability advertised.

There was some discussion back and forth how to present these push options
to the user as there are some ways to do it:

Keep all options in one environment variable
============================================
+ easiest way to implement in Git
- This would make things hard to parse correctly in the hook.

Put the options in files instead,
filenames are in GIT_PUSH_OPTION_FILES
======================================
+ After a discussion about environment variables and shells, we may not
  want to put user data into an environment variable (see [1] for example).
+ We could transmit binaries, i.e. we're not bound to C strings as
  we are when using environment variables to the user.
+ Maybe easier to parse than constructing environment variable names
  GIT_PUSH_OPTION_{0,1,..} yourself
- cleanup of the temporary files is hard to do reliably
- we have race conditions with multiple clients pushing, hence we'd need
  to use mkstemp. That's not too bad, but still.

Use environment variables, but restrict to key/value pairs
==========================================================
(When the user pushes a push option `foo=bar`, we'd
GIT_PUSH_OPTION_foo=bar)
+ very easy to parse for a simple model of push options
- it's not sufficient for more elaborate models, e.g.
  it doesn't allow doubles (e.g. cc=reviewer@email)

Present the options in different environment variables
======================================================
(This is implemented)
* harder to parse as a user, but we have a sample hook for that.
- doesn't allow binary files
+ allows the same option twice, i.e. is not restrictive about
  options, except for binary files.
+ doesn't clutter a remote directory with (possibly stale)
  temporary files

As we first want to focus on getting simple strings to work
reliably, we go with the last option for now. If we want to
do transmission of binaries later, we can just attach a
'side-channel', e.g. "any push option that contains a '\0' is
put into a file instead of the environment variable and we'd
have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..}
environment variables".

[1] 'Shellshock' https://lwn.net/Articles/614218/

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-14 15:50:17 -07:00
Junio C Hamano
16726cfa0c diff: document diff-filter exclusion
In v1.8.5 days, 7f2ea5f0 (diff: allow lowercase letter to specify
what change class to exclude, 2013-07-17) taught the "--diff-filter"
mechanism to take lowercase letters as exclusion, but we forgot to
document it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-14 12:17:47 -07:00
Junio C Hamano
75676c8c8b Merge branch 'jk/upload-pack-hook'
A hot-fix to make a test working in mingw again.

* jk/upload-pack-hook:
  mingw: fix regression in t1308-config-set
2016-07-14 10:38:57 -07:00
Johannes Schindelin
b738396cfd mingw: fix regression in t1308-config-set
When we tried to fix in 58461bd (t1308: do not get fooled by symbolic
links to the source tree, 2016-06-02) an obscure case where the user
cd's into Git's source code via a symbolic link, a regression was
introduced that affects all test runs on Windows.

The original patch introducing the test case in question was careful to
use `$(pwd)` instead of `$PWD`.

This was done to account for the fact that Git's test suite uses shell
scripting even on Windows, where the shell's Unix-y paths are
incompatible with the main Git executable's idea of paths: it only
accepts Windows paths.

It is an awkward but necessary thing, then, to use `$(pwd)` (which gives
us a Windows path) when interacting with the Git executable and `$PWD`
(which gives the shell's idea of the current working directory in Unix-y
form) for shell scripts, including the test suite itself.

Obviously this broke the use case of the Git maintainer when changing
the working directory into Git's source code directory via a symlink,
i.e. when `$(pwd)` does not agree with `$PWD`.

However, we must not fix that use case at the expense of regressing
another use case.

Let's special-case Windows here, even if it is ugly, for lack of a more
elegant solution.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-14 10:38:28 -07:00
Jeff King
882d49ca5c push: anonymize URL in status output
Commit 47abd85 (fetch: Strip usernames from url's before
storing them, 2009-04-17) taught fetch to anonymize URLs.
The primary purpose there was to avoid sticking passwords in
merge-commit messages, but as a side effect, we also avoid
printing them to stderr.

The push side does not have the merge-commit problem, but it
probably should avoid printing them to stderr. We can reuse
the same anonymizing function.

Note that for this to come up, the credentials would have to
appear either on the command line or in a git config file,
neither of which is particularly secure. So people _should_
be switching to using credential helpers instead, which
makes this problem go away. But that's no excuse not to
improve the situation for people who for whatever reason end
up using credentials embedded in the URL.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-14 09:23:20 -07:00
Junio C Hamano
79ed43c28f Fifth batch of topics for 2.10
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-13 11:30:25 -07:00
Junio C Hamano
7a23f7367d Merge branch 'jk/big-and-future-archive-tar'
"git archive" learned to handle files that are larger than 8GB and
commits far in the future than expressible by the traditional US-TAR
format.

* jk/big-and-future-archive-tar:
  archive-tar: drop return value
  archive-tar: write extended headers for far-future mtime
  archive-tar: write extended headers for file sizes >= 8GB
  t5000: test tar files that overflow ustar headers
  t9300: factor out portable "head -c" replacement
2016-07-13 11:24:18 -07:00
Junio C Hamano
42bd66816b Merge branch 'nd/ita-cleanup'
Git does not know what the contents in the index should be for a
path added with "git add -N" yet, so "git grep --cached" should not
show hits (or show lack of hits, with -L) in such a path, but that
logic does not apply to "git grep", i.e. searching in the working
tree files.  But we did so by mistake, which has been corrected.

* nd/ita-cleanup:
  grep: fix grepping for "intent to add" files
  t7810-grep.sh: fix a whitespace inconsistency
  t7810-grep.sh: fix duplicated test name
2016-07-13 11:24:18 -07:00
Junio C Hamano
5eb1e9f1a0 Merge branch 'ps/rebase-i-auto-unstash-upon-abort'
"git rebase -i --autostash" did not restore the auto-stashed change
when the operation was aborted.

* ps/rebase-i-auto-unstash-upon-abort:
  rebase -i: restore autostash on abort
2016-07-13 11:24:17 -07:00
Junio C Hamano
6c35952a08 Merge branch 'js/t3404-grammo-fix'
Grammofix.

* js/t3404-grammo-fix:
  t3404: fix a grammo (commands are ran -> commands are run)
2016-07-13 11:24:16 -07:00
Junio C Hamano
c510926691 Merge branch 'js/sign-empty-commit-fix'
"git commit --amend --allow-empty-message -S" for a commit without
any message body could have misidentified where the header of the
commit object ends.

* js/sign-empty-commit-fix:
  commit -S: avoid invalid pointer with empty message
2016-07-13 11:24:15 -07:00
Junio C Hamano
ce18123cec Merge branch 'mm/doc-tt'
More mark-up updates to typeset strings that are expected to
literally typed by the end user in fixed-width font.

* mm/doc-tt:
  doc: typeset HEAD and variants as literal
  CodingGuidelines: formatting HEAD in documentation
  doc: typeset long options with argument as literal
  doc: typeset '--' as literal
  doc: typeset long command-line options as literal
  doc: typeset short command-line options as literal
  Documentation/git-mv.txt: fix whitespace indentation
2016-07-13 11:24:14 -07:00
Junio C Hamano
fc8a3a6072 Merge branch 'dg/subtree-rebase-test'
Add a test to specify the desired behaviour that currently is not
available in "git rebase -Xsubtree=...".

* dg/subtree-rebase-test:
  contrib/subtree: Add a test for subtree rebase that loses commits
2016-07-13 11:24:13 -07:00
Junio C Hamano
7aa46d2bc8 Merge branch 'nd/doc-new-command'
Typofix in a doc.

* nd/doc-new-command:
  new-command.txt: correct the command description file
2016-07-13 11:24:12 -07:00
Junio C Hamano
97865e83c7 Merge branch 'ew/gc-auto-pack-limit-fix'
"gc.autoPackLimit" when set to 1 should not trigger a repacking
when there is only one pack, but the code counted poorly and did
so.

* ew/gc-auto-pack-limit-fix:
  gc: fix off-by-one error with gc.autoPackLimit
2016-07-13 11:24:12 -07:00
Junio C Hamano
67166a8da6 Merge branch 'ah/unpack-trees-advice-messages'
Grammofix.

* ah/unpack-trees-advice-messages:
  unpack-trees: fix English grammar in do-this-before-that messages
2016-07-13 11:24:11 -07:00
Junio C Hamano
2703572b3a Merge branch 'va/i18n-even-more'
More markings of messages for i18n, with updates to various tests
to pass GETTEXT_POISON tests.

One patch from the original submission dropped due to conflicts
with jk/upload-pack-hook, which is still in flux.

* va/i18n-even-more: (38 commits)
  t5541: become resilient to GETTEXT_POISON
  i18n: branch: mark comment when editing branch description for translation
  i18n: unmark die messages for translation
  i18n: submodule: escape shell variables inside eval_gettext
  i18n: submodule: join strings marked for translation
  i18n: init-db: join message pieces
  i18n: remote: allow translations to reorder message
  i18n: remote: mark URL fallback text for translation
  i18n: standardise messages
  i18n: sequencer: add period to error message
  i18n: merge: change command option help to lowercase
  i18n: merge: mark messages for translation
  i18n: notes: mark options for translation
  i18n: notes: mark strings for translation
  i18n: transport-helper.c: change N_() call to _()
  i18n: bisect: mark strings for translation
  t5523: use test_i18ngrep for negation
  t4153: fix negated test_i18ngrep call
  t9003: become resilient to GETTEXT_POISON
  tests: unpack-trees: update to use test_i18n* functions
  ...
2016-07-13 11:24:10 -07:00
Nguyễn Thái Ngọc Duy
ec9d224903 fsck: use streaming interface for large blobs in pack
For blobs, we want to make sure the on-disk data is not corrupted
(i.e. can be inflated and produce the expected SHA-1). Blob content is
opaque, there's nothing else inside to check for.

For really large blobs, we may want to avoid unpacking the entire blob
in memory, just to check whether it produces the same SHA-1. On 32-bit
systems, we may not have enough virtual address space for such memory
allocation. And even on 64-bit where it's not a problem, allocating a
lot more memory could result in kicking other parts of systems to swap
file, generating lots of I/O and slowing everything down.

For this particular operation, not unpacking the blob and letting
check_sha1_signature, which supports streaming interface, do the job
is sufficient. check_sha1_signature() is not shown in the diff,
unfortunately. But if will be called when "data_valid && !data" is
false.

We will call the callback function "fn" with NULL as "data". The only
callback of this function is fsck_obj_buffer(), which does not touch
"data" at all if it's a blob.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-13 09:15:29 -07:00
Nguyễn Thái Ngọc Duy
af92a645d3 pack-objects: do not truncate result in-pack object size on 32-bit systems
A typical diff will not show what's going on and you need to see full
functions. The core code is like this, at the end of of write_one()

	e->idx.offset = *offset;
	size = write_object(f, e, *offset);
	if (!size) {
		e->idx.offset = recursing;
		return WRITE_ONE_BREAK;
	}
	written_list[nr_written++] = &e->idx;

	/* make sure off_t is sufficiently large not to wrap */
	if (signed_add_overflows(*offset, size))
		die("pack too large for current definition of off_t");
	*offset += size;

Here we can see that the in-pack object size is returned by
write_object (or indirectly by write_reuse_object). And it's used to
calculate object offsets, which end up in the pack index file,
generated at the end.

If "size" overflows (on 32-bit sytems, unsigned long is 32-bit while
off_t can be 64-bit), we got wrong offsets and produce incorrect .idx
file, which may make it look like the .pack file is corrupted.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-13 09:15:17 -07:00
Nguyễn Thái Ngọc Duy
da49a7da3a index-pack: correct "offset" type in unpack_entry_data()
unpack_entry_data() receives an off_t value from unpack_raw_entry(),
which could be larger than unsigned long on 32-bit systems with large
file support. Correct the type so truncation does not happen. This
only affects bad object reporting though.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-13 09:15:08 -07:00
Nguyễn Thái Ngọc Duy
fd3e67474c index-pack: report correct bad object offsets even if they are large
Use the right type for offsets in this case, off_t, which makes a
difference on 32-bit systems with large file support, and change
formatting code accordingly.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-13 09:14:47 -07:00