Commit Graph

2341 Commits

Author SHA1 Message Date
Nguyễn Thái Ngọc Duy
efd71f8913 t/helper: add an empty test-tool program
This will become an umbrella program that absorbs most [1] t/helper
programs in. By having a single executable binary we reduce disk usage
(libgit.a is replicated by every t/helper program) and shorten link
time a bit.

Running "make --jobs=1; du -sh t/helper" with ccache fully populated,
it takes 27 seconds and 277MB at the beginning of this series, 17
seconds and 42MB at the end.

[1] There are a couple programs that will not become part of
    test-tool: test-line-buffer and test-svn-fe have extra
    dependencies and test-fake-ssh's program name has to be a single
    word for some ssh tests.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-27 08:45:13 -07:00
Junio C Hamano
ae1644b08e Merge branch 'ab/perl-fixes'
Clean-up to various pieces of Perl code we have.

* ab/perl-fixes:
  perl Git::LoadCPAN: emit better errors under NO_PERL_CPAN_FALLBACKS
  Makefile: add NO_PERL_CPAN_FALLBACKS knob
  perl: move the perl/Git/FromCPAN tree to perl/FromCPAN
  perl: generalize the Git::LoadCPAN facility
  perl: move CPAN loader wrappers to another namespace
  perl: update our copy of Mail::Address
  perl: update our ancient copy of Error.pm
  git-send-email: unconditionally use Net::{SMTP,Domain}
  Git.pm: hard-depend on the File::{Temp,Spec} modules
  gitweb: hard-depend on the Digest::MD5 5.8 module
  Git.pm: add the "use warnings" pragma
  Git.pm: remove redundant "use strict" from sub-package
  perl: *.pm files should not have the executable bit
2018-03-15 15:00:46 -07:00
Ævar Arnfjörð Bjarmason
ad874608d8 Makefile: optionally symlink libexec/git-core binaries to bin/git
Add a INSTALL_SYMLINKS option which if enabled, changes the default
hardlink installation method to one where the relevant binaries in
libexec/git-core are symlinked back to ../../bin, instead of being
hardlinked.

This new option also overrides the behavior of the existing
NO_*_HARDLINKS variables which in some cases would produce symlinks
within to libexec/, e.g. "git-add" symlinked to "git" which would be
copy of the "git" found in bin/, now "git-add" in libexec/ is always
going to be symlinked to the "git" found in the bin/ directory.

This option is being added because:

 1) I think it makes what we're doing a lot more obvious. E.g. I'd
    never noticed that the libexec binaries were really just hardlinks
    since e.g. ls(1) won't show that in any obvious way. You need to
    start stat(1)-ing things and look at the inodes to see what's
    going on.

 2) Some tools have very crappy support for hardlinks, e.g. the Git
    shipped with GitLab is much bigger than it should be because
    they're using a chef module that doesn't know about hardlinks, see
    https://github.com/chef/omnibus/issues/827

    I've also ran into other related issues that I think are explained
    by this, e.g. compiling git with debugging and rpm refusing to
    install a ~200MB git package with 2GB left on the FS, I think that
    was because it doesn't consider hardlinks, just the sum of the
    byte size of everything in the package.

As for the implementation, the "../../bin" noted above will vary given
some given some values of "../.." and "bin" depending on the depth of
the gitexecdir relative to the destdir, and the "bindir" target,
e.g. setting "bindir=/tmp/git/binaries gitexecdir=foo/bar/baz" will do
the right thing and produce this result:

    $ file /tmp/git/foo/bar/baz/git-add
    /tmp/git/foo/bar/baz/git-add: symbolic link to ../../../binaries/git

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15 12:44:59 -07:00
Ævar Arnfjörð Bjarmason
a4d79b99a0 Makefile: add a gitexecdir_relative variable
This variable will be e.g. "libexec/git-core" if
gitexecdir=/tmp/git/libexec/git-core is given. It'll be used by a
subsequent change.

This is stolen from the yet-to-be integrated (needs resubmission)
"Makefile: add Perl runtime prefix support" patch on the mailing
list. See
<20180108030239.92036-3-dnj@google.com> (https://public-inbox.org/git/20180108030239.92036-3-dnj@google.com/).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15 12:44:58 -07:00
Ævar Arnfjörð Bjarmason
7bc506d038 Makefile: fix broken bindir_relative variable
Change the bindir_relative variable to work like the other *_relative
variables, which are computed as a function of the absolute
path. Before this change, supplying e.g. bindir=/tmp/git/binaries to
the Makefile would yield a bindir_relative of just "bin", as opposed
to "binaries".

This logic was originally added back in 026fa0d5ad ("Move computation
of absolute paths from Makefile to runtime (in preparation for
RUNTIME_PREFIX)", 2009-01-18), then later in 971f85388f ("Makefile:
make mandir, htmldir and infodir absolute", 2013-02-24) when
more *_relative variables were added those new variables didn't have
this bug, but bindir_relative was never fixed.

There is a small change in behavior here, which is that setting
bindir_relative as an argument to the Makefile won't work anymore, I
think that's fine, since this was always intended as an internal
variable (e.g. INSTALL documents bindir=*, not bindir_relative=*).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15 12:44:58 -07:00
Brandon Williams
72d0ea0056 ls-refs: introduce ls-refs server command
Introduce the ls-refs server command.  In protocol v2, the ls-refs
command is used to request the ref advertisement from the server.  Since
it is a command which can be requested (as opposed to mandatory in v1),
a client can sent a number of parameters in its request to limit the ref
advertisement based on provided ref-prefixes.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15 12:01:08 -07:00
Brandon Williams
ed10cb952d serve: introduce git-serve
Introduce git-serve, the base server for protocol version 2.

Protocol version 2 is intended to be a replacement for Git's current
wire protocol.  The intention is that it will be a simpler, less
wasteful protocol which can evolve over time.

Protocol version 2 improves upon version 1 by eliminating the initial
ref advertisement.  In its place a server will export a list of
capabilities and commands which it supports in a capability
advertisement.  A client can then request that a particular command be
executed by providing a number of capabilities and command specific
parameters.  At the completion of a command, a client can request that
another command be executed or can terminate the connection by sending a
flush packet.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15 12:01:08 -07:00
Ævar Arnfjörð Bjarmason
e6c531b808 Makefile: make USE_LIBPCRE=YesPlease mean v2, not v1
Change the USE_LIBPCRE flag from being an alias for USE_LIBPCRE1 to
being an alias for USE_LIBPCRE2.

When support for v2 was added in my 94da9193a6 ("grep: add support for
PCRE v2", 2017-06-01) the existing USE_LIBPCRE flag was left as
meaning v1, with a note that this would likely change in a future
release. That optional support for v2 first made it into Git version
2.14.0.

The PCRE v2 support has been shown to be stable, and the upstream PCRE
project is highly encouraging downstream users to move to v2, so it
makes sense to give packagers of Git who haven't heard the news about
PCRE v2 a further nudge to move to v2.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14 15:27:50 -07:00
Brandon Williams
74e7002961 test-pkt-line: introduce a packet-line test helper
Introduce a packet-line test helper which can either pack or unpack an
input stream into packet-lines and writes out the result to stdout.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14 14:15:07 -07:00
Brandon Williams
a3d6b53e92 upload-pack: convert to a builtin
In order to allow for code sharing with the server-side of fetch in
protocol-v2 convert upload-pack to be a builtin.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14 14:15:06 -07:00
Junio C Hamano
f655707194 Merge branch 'ab/simplify-perl-makefile'
Hotfix for a topic already in 'master'.

* ab/simplify-perl-makefile:
  Makefile: generate Git(3pm) as dependency of the 'doc' and 'man' targets
2018-03-06 14:54:04 -08:00
Ævar Arnfjörð Bjarmason
1aca69c019 perl Git::LoadCPAN: emit better errors under NO_PERL_CPAN_FALLBACKS
Before my 20d2a30f8f ("Makefile: replace perl/Makefile.PL with simple
make rules", 2017-12-10) on an OS package that removed the
private-Error.pm copy we carried around manually removing the OS's
Error.pm would yield:

    $ git add -p
    Can't locate Error.pm in @INC (you may need to install the Error module) [...]

Now, before this change we'll instead emit this more cryptic error:

    $ git add -p
    BUG: '/usr/share/perl5/Git/FromCPAN' should be a directory! at /usr/share/perl5/Git/Error.pm line 36.

This is a confusing error. Now if the new NO_PERL_CPAN_FALLBACKS
option is specified and we can't find the module we'll instead emit:

    $ /tmp/git/bin/git add -p
    BUG: The 'Error' module is not here, but NO_PERL_CPAN_FALLBACKS was set!

    [...]

Where [...] is the lengthy explanation seen in the change below, which
explains what the potential breakage is, and how to fix it.

The reason for checking @@NO_PERL_CPAN_FALLBACKS@@] against the empty
string in Perl is as opposed to checking for a boolean value is that
that's (as far as I can tell) make's idea of a string that's set, and
e.g. NO_PERL_CPAN_FALLBACKS=0 is enough to set NO_PERL_CPAN_FALLBACKS.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-05 10:52:28 -08:00
Todd Zullinger
075321c007 Makefile: add NO_PERL_CPAN_FALLBACKS knob
We include some perl modules which are not part of the core perl
install, as a convenience.  This allows us to rely on those modules in
our perl-based tools and scripts without requiring users to install the
modules from CPAN or their operating system packages.

Users whose operating system provides these modules and packagers of Git
often don't want to ship or use these bundled modules.  Allow these
users to set NO_PERL_CPAN_FALLBACKS to avoid installing the bundled
modules.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-05 10:52:28 -08:00
Ævar Arnfjörð Bjarmason
382029fc00 perl: move the perl/Git/FromCPAN tree to perl/FromCPAN
Move the CPAN modules that have lived under perl/Git/FromCPAN since my
20d2a30f8f ("Makefile: replace perl/Makefile.PL with simple make
rules", 2017-12-10) to perl/FromCPAN.

A subsequent change will teach the Makefile to only install these
copies of CPAN modules if a flag that distro packagers would like to
set isn't set. Due to how the wildcard globbing is being done it's
much easier to accomplish that if they're moved to their own
directory.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-05 10:52:28 -08:00
Junio C Hamano
0996727879 Merge branch 'tz/do-not-clean-spec-file'
We no longer create any *.spec file, so "make clean" should not
remove it.

* tz/do-not-clean-spec-file:
  Makefile: remove *.spec from clean target
2018-02-28 13:37:58 -08:00
Junio C Hamano
14599b48c0 Merge branch 'rj/sparse-updates'
Devtool update.

* rj/sparse-updates:
  Makefile: suppress a sparse warning for pack-revindex.c
  config.mak.uname: remove SPARSE_FLAGS setting for cygwin
2018-02-27 10:33:55 -08:00
Todd Zullinger
4321bdcabb Makefile: remove *.spec from clean target
Support for generating an rpm was dropped in ab214331cf ("Makefile: stop
pretending to support rpmbuild", 2016-04-04).  We don't generate any
*.spec files so there is no need to clean them up.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-17 13:27:57 -08:00
Junio C Hamano
a66b51c624 Merge branch 'ab/sha1dc-build'
Push the submodule version of collision-detecting SHA-1 hash
implementation a bit harder on builders.

* ab/sha1dc-build:
  sha1dc_git.h: re-arrange an ifdef chain for a subsequent change
  Makefile: under "make dist", include the sha1collisiondetection submodule
  Makefile: don't error out under DC_SHA1_EXTERNAL if DC_SHA1_SUBMODULE=auto
2018-02-15 14:55:40 -08:00
SZEDER Gábor
2530afd351 Makefile: generate Git(3pm) as dependency of the 'doc' and 'man' targets
Since commit 20d2a30f8f (Makefile: replace perl/Makefile.PL with
simple make rules, 2017-12-10), the Git(3pm) man page is only
generated as an indirect dependency of the 'install-doc' and
'install-man' Makefile targets.  Consequently, if someone runs 'make
man && sudo make install-man' (or their 'doc' counterparts), then
Git(3pm) will be generated as root, and the resulting root-owned files
and directories will in turn cause the next user-run 'make clean' to
fail.  This was not an issue in the past, because Git(3pm) was
generated when 'make all' descended into 'perl/', which is usually not
run as root.

List Git(3pm) as a dependency of the 'doc' and 'man' Makefile targets,
too, so it gets generated by targets that are usually built as
ordinary users.

While at it, add 'install-man-perl' to the list of .PHONY targets.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-15 10:05:32 -08:00
Junio C Hamano
867622398f Merge branch 'gs/retire-mru'
Retire mru API as it does not give enough abstraction over
underlying list API to be worth it.

* gs/retire-mru:
  mru: Replace mru.[ch] with list.h implementation
2018-02-13 13:39:06 -08:00
Junio C Hamano
f3d618d2bf Merge branch 'jh/fsck-promisors'
In preparation for implementing narrow/partial clone, the machinery
for checking object connectivity used by gc and fsck has been
taught that a missing object is OK when it is referenced by a
packfile specially marked as coming from trusted repository that
promises to make them available on-demand and lazily.

* jh/fsck-promisors:
  gc: do not repack promisor packfiles
  rev-list: support termination at promisor objects
  sha1_file: support lazily fetching missing objects
  introduce fetch-object: fetch one promisor object
  index-pack: refactor writing of .keep files
  fsck: support promisor objects as CLI argument
  fsck: support referenced promisor objects
  fsck: support refs pointing to promisor objects
  fsck: introduce partialclone extension
  extension.partialclone: introduce partial clone extension
2018-02-13 13:39:03 -08:00
Junio C Hamano
ed1b87ef91 Merge branch 'ab/simplify-perl-makefile'
The build procedure for perl/ part has been greatly simplified by
weaning ourselves off of MakeMaker.

* ab/simplify-perl-makefile:
  perl: treat PERLLIB_EXTRA as an extra path again
  perl: avoid *.pmc and fix Error.pm further
  Makefile: replace perl/Makefile.PL with simple make rules
2018-02-13 13:39:03 -08:00
Ramsay Jones
54360a1956 Makefile: suppress a sparse warning for pack-revindex.c
Sparse has, for a long time, been issuing the following warning against
the pack-revindex.c file:

      SP pack-revindex.c
  pack-revindex.c:64:23: warning: memset with byte count of 262144

This results from a unconditional check, with a hard-coded limit, which
is really only appropriate for the kernel source code. (The check is for
a 'large' byte count in a call to memcpy(), memset(), copy_from_user()
and copy_to_user() functions).

A recent release of sparse (v0.5.1) has introduced some options to allow
this check to be turned off (-Wno-memcpy-max-count) or to specify the
actual limit used (-fmemcpy-max-count=COUNT), rather than a hard-coded
limit of 100000.

In order to suppress the warning, add a target for pack-revindex.sp that
adds the '-Wno-memcpy-max-count' option to the SPARSE_FLAGS variable.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-12 12:19:39 -08:00
Gargi Sharma
ec2dd32c70 mru: Replace mru.[ch] with list.h implementation
Replace the custom calls to mru.[ch] with calls to list.h. This patch is
the final step in removing the mru API completely and inlining the logic.
This patch leads to significant code reduction and the mru API hence, is
not a useful abstraction anymore.

Signed-off-by: Gargi Sharma <gs051095@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-24 09:52:16 -08:00
Junio C Hamano
a09a5e6c36 Merge branch 'ab/dc-sha1-loose-ends'
Tying loose ends for the recent integration work of
collision-detecting SHA-1 implementation.

* ab/dc-sha1-loose-ends:
  Makefile: NO_OPENSSL=1 should no longer imply BLK_SHA1=1
2018-01-09 14:32:53 -08:00
Junio C Hamano
07b747d324 Merge branch 'jk/test-suite-tracing'
Assorted fixes around running tests with "-x" tracing option.

* jk/test-suite-tracing:
  t/Makefile: introduce TEST_SHELL_PATH
  test-lib: make "-x" work with "--verbose-log"
  t5615: avoid re-using descriptor 4
  test-lib: silence "-x" cleanup under bash
2018-01-05 13:28:09 -08:00
Jonathan Nieder
7a7bfc7adc perl: treat PERLLIB_EXTRA as an extra path again
PERLLIB_EXTRA was introduced in v1.9-rc0~88^2 (2013-11-15) as a way
for packagers to add additional directories such as the location of
Subversion's perl bindings to Git's perl path.  Since 20d2a30f
(Makefile: replace perl/Makefile.PL with simple make rules,
2012-12-10) setting that variable breaks perl-based commands instead:

 $ PATH=$HOME/opt/git/bin:$PATH
 $ make install prefix=$HOME/opt/git PERLLIB_EXTRA=anextralibdir
[...]
 $ head -2 $HOME/opt/git/libexec/git-core/git-add--interactive
 #!/usr/bin/perl
 use lib (split(/:/, $ENV{GITPERLLIB} || ":helloiamanextrainstlibdir" || "/usr/local/google/home/jrn/opt/git/share/perl5"));
 $ git add -p
 Empty compile time value given to use lib at /home/jrn/opt/git/libexec/git-core/git-add--interactive line 2.

Removing the spurious ":" at the beginning of ":$PERLLIB_EXTRA" avoids
the "Empty compile time value" error but with that tweak the problem
still remains: PERLLIB_EXTRA ends up replacing instead of
supplementing the perllibdir that would be passed to 'use lib' if
PERLLIB_EXTRA were not set.

The intent was to simplify, as the commit message to 20d2a30f
explains:

| The scripts themselves will 'use lib' the target directory, but if
| INSTLIBDIR is set it overrides it. It doesn't have to be this way,
| it could be set in addition to INSTLIBDIR, but my reading of
| [v1.9-rc0~88^2] is that this is the desired behavior.

Restore the previous code structure to make PERLLIB_EXTRA work again.

Reproducing this problem requires an invocation of "make install"
instead of running bin-wrappers/git in place, since the latter sets
the GITPERLLIB environment variable, avoiding trouble.

Reported-by: Jonathan Koren <jdkoren@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-03 12:38:37 -08:00
Junio C Hamano
58d1772c85 Merge branch 'js/enhanced-version-info'
"git version --build-options" learned to report the host CPU and
the exact commit object name the binary was built from.

* js/enhanced-version-info:
  version --build-options: report commit, too, if possible
  version --build-options: also report host CPU
2017-12-28 14:08:47 -08:00
Ævar Arnfjörð Bjarmason
edb6a17c36 Makefile: NO_OPENSSL=1 should no longer imply BLK_SHA1=1
Use the collision detecting SHA-1 implementation by default even when
NO_OPENSSL is set.

Setting NO_OPENSSL=UnfortunatelyYes has implied BLK_SHA1=1 ever since
the former was introduced in dd53c7ab29 (Support for NO_OPENSSL,
2005-07-29).  That implication should have been removed when the
default SHA-1 implementation changed from OpenSSL to DC_SHA1 in
e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17).  Finish
what that commit started by removing the BLK_SHA1 fallback setting so
the default DC_SHA1 implementation will be used.

Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-28 11:55:56 -08:00
Ævar Arnfjörð Bjarmason
805a378649 perl: avoid *.pmc and fix Error.pm further
The previous round tried to use *.pmc files but it confused RPM
dependency analysis on some distros.  Install them as plain
vanilla *.pm files instead.

Also "local @_" construct did not properly work when goto &sub
is used until recent versions of Perl.  Avoid it (and we do not
need to localize it here anyway).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-28 10:35:21 -08:00
Junio C Hamano
eacf669cec Merge branch 'jt/decorate-api'
A few structures and variables that are implementation details of
the decorate API have been renamed and then the API got documented
better.

* jt/decorate-api:
  decorate: clean up and document API
2017-12-27 11:16:26 -08:00
Junio C Hamano
61061abba7 Merge branch 'jh/object-filtering'
In preparation for implementing narrow/partial clone, the object
walking machinery has been taught a way to tell it to "filter" some
objects from enumeration.

* jh/object-filtering:
  rev-list: support --no-filter argument
  list-objects-filter-options: support --no-filter
  list-objects-filter-options: fix 'keword' typo in comment
  pack-objects: add list-objects filtering
  rev-list: add list-objects filtering support
  list-objects: filter objects in traverse_commit_list
  oidset: add iterator methods to oidset
  oidmap: add oidmap iterator methods
  dir: allow exclusions from blob in addition to file
2017-12-27 11:16:21 -08:00
Junio C Hamano
66d3f19324 Merge branch 'tg/worktree-create-tracking'
The way "git worktree add" determines what branch to create from
where and checkout in the new worktree has been updated a bit.

* tg/worktree-create-tracking:
  add worktree.guessRemote config option
  worktree: add --guess-remote flag to add subcommand
  worktree: make add <path> <branch> dwim
  worktree: add --[no-]track option to the add subcommand
  worktree: add can be created from any commit-ish
  checkout: factor out functions to new lib file
2017-12-19 11:33:57 -08:00
Johannes Schindelin
ed32b788c0 version --build-options: report commit, too, if possible
In particular when local tags are used (or tags that are pushed to some
fork) to build Git, it is very hard to figure out from which particular
revision a particular Git executable was built. It gets worse when those
tags are deleted, or even updated.

Let's just report an exact, unabbreviated commit name in our build
options.

We need to be careful, though, to report when the current commit cannot
be determined, e.g. when building from a tarball without any associated
Git repository. This could be the case also when extracting Git's source
code into an unrelated Git worktree.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-14 22:53:04 -08:00
Eric Sunshine
b22894049f version --build-options: also report host CPU
It can be helpful for bug reports to include information about the
environment in which the bug occurs. "git version --build-options" can
help to supplement this information. In addition to the size of 'long'
already reported by --build-options, also report the host's CPU type.
Example output:

   $ git version --build-options
   git version 2.9.3.windows.2.826.g06c0f2f
   cpu: x86_64
   sizeof-long: 4

New Makefile variable HOST_CPU supports cross-compiling.

Suggested-by: Adric Norris <landstander668@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-14 22:52:49 -08:00
Ævar Arnfjörð Bjarmason
20d2a30f8f Makefile: replace perl/Makefile.PL with simple make rules
Replace the perl/Makefile.PL and the fallback perl/Makefile used under
NO_PERL_MAKEMAKER=NoThanks with a much simpler implementation heavily
inspired by how the i18n infrastructure's build process works[1].

The reason for having the Makefile.PL in the first place is that it
was initially[2] building a perl C binding to interface with libgit,
this functionality, that was removed[3] before Git.pm ever made it to
the master branch.

We've since since started maintaining a fallback perl/Makefile, as
MakeMaker wouldn't work on some platforms[4]. That's just the tip of
the iceberg. We have the PM.stamp hack in the top-level Makefile[5] to
detect whether we need to regenerate the perl/perl.mak, which I fixed
just recently to deal with issues like the perl version changing from
under us[6].

There is absolutely no reason for why this needs to be so complex
anymore. All we're getting out of this elaborate Rube Goldberg machine
was copying perl/* to perl/blib/* as we do a string-replacement on
the *.pm files to hardcode @@LOCALEDIR@@ in the source, as well as
pod2man-ing Git.pm & friends.

So replace the whole thing with something that's pretty much a copy of
how we generate po/build/**.mo from po/*.po, just with a small sed(1)
command instead of msgfmt. As that's being done rename the files
from *.pm to *.pmc just to indicate that they're generated (see
"perldoc -f require").

While I'm at it, change the fallback for Error.pm from being something
where we'll ship our own Error.pm if one doesn't exist at build time
to one where we just use a Git::Error wrapper that'll always prefer
the system-wide Error.pm, only falling back to our own copy if it
really doesn't exist at runtime. It's now shipped as
Git::FromCPAN::Error, making it easy to add other modules to
Git::FromCPAN::* in the future if that's needed.

Functional changes:

 * This will not always install into perl's idea of its global
   "installsitelib". This only potentially matters for packagers that
   need to expose Git.pm for non-git use, and as explained in the
   INSTALL file there's a trivial workaround.

 * The scripts themselves will 'use lib' the target directory, but if
   INSTLIBDIR is set it overrides it. It doesn't have to be this way,
   it could be set in addition to INSTLIBDIR, but my reading of [7] is
   that this is the desired behavior.

 * We don't build man pages for all of the perl modules as we used to,
   only Git(3pm). As discussed on-list[8] that we were building
   installed manpages for purely internal APIs like Git::I18N or
   private-Error.pm was always a bug anyway, and all the Git::SVN::*
   ones say they're internal APIs.

   There are apparently external users of Git.pm, but I don't expect
   there to be any of the others.

   As a side-effect of these general changes the perl documentation
   now only installed by install-{doc,man}, not a mere "install" as
   before.

1. 5e9637c629 ("i18n: add infrastructure for translating Git with
   gettext", 2011-11-18)

2. b1edc53d06 ("Introduce Git.pm (v4)", 2006-06-24)

3. 18b0fc1ce1 ("Git.pm: Kill Git.xs for now", 2006-09-23)

4. f848718a69 ("Make perl/ build procedure ActiveState friendly.",
   2006-12-04)

5. ee9be06770 ("perl: detect new files in MakeMaker builds",
   2012-07-27)

6. c59c4939c2 ("perl: regenerate perl.mak if perl -V changes",
   2017-03-29)

7. 0386dd37b1 ("Makefile: add PERLLIB_EXTRA variable that adds to
   default perl path", 2013-11-15)

8. 87bmjjv1pu.fsf@evledraar.booking.com ("Re: [PATCH] Makefile:
   replace perl/Makefile.PL with simple make rules"

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-11 15:28:10 -08:00
Ævar Arnfjörð Bjarmason
bc2ed316e4 Makefile: under "make dist", include the sha1collisiondetection submodule
Include the sha1collisiondetection submodule when running "make
dist". Even though we've been shipping the sha1collisiondetection
submodule[1] and using it by default if it's checked out[2] anyone
downloading git as a tarball would just get an empty
sha1collisiondetection/ directory.

Doing this automatically is a feature that's missing from git-archive,
but in the meantime let's bundle this up into the tarball we
ship. This ensures that the DC_SHA1_SUBMODULE flag does what's
intended even in an unpacked tarball, and more importantly means we're
building the exact same code from the same paths from git.git and from
the tarball.

I am not including all the files in the submodule, only the ones git
actually needs (and the licenses), only including some files like this
would be a useful feature if git-archive ever adds the ability to
bundle up submodules.

1. commit 86cfd61e6b ("sha1dc: optionally use sha1collisiondetection
   as a submodule", 2017-07-01)
2. cac87dc01d ("sha1collisiondetection: automatically enable when
   submodule is populated", 2017-07-01)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-08 15:01:00 -08:00
Ævar Arnfjörð Bjarmason
f39e05f225 Makefile: don't error out under DC_SHA1_EXTERNAL if DC_SHA1_SUBMODULE=auto
Fix a logic error in the initial introduction of DC_SHA1_EXTERNAL. If
git.git has a sha1collisiondetection submodule checked out the logic
to set DC_SHA1_SUBMODULE=auto would interact badly with the check for
whether DC_SHA1_SUBMODULE was set.

It would error out, meaning that there's no way to build git with
DC_SHA1_EXTERNAL=YesPlease without deinit-ing the submodule.

Instead, adjust the logic to only fire if the variable is to something
else than "auto" which would mean it's a mistake on the part of
whoever's building git, not just the Makefile tripping over its own
logic.

1. 3964cbbb5c ("sha1dc: allow building with the external sha1dc
   library", 2017-08-15)
2. cac87dc01d ("sha1collisiondetection: automatically enable when
   submodule is populated", 2017-07-01)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-08 15:00:59 -08:00
Jonathan Tan
ddd3e31242 decorate: clean up and document API
Improve the names of the identifiers in decorate.h, document them, and
add an example of how to use these functions.

The example is compiled and run as part of the test suite.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-08 09:16:27 -08:00
Jeff King
3f824e91c8 t/Makefile: introduce TEST_SHELL_PATH
You may want to run the test suite with a different shell
than you use to build Git. For instance, you may build with
SHELL_PATH=/bin/sh (because it's faster, or it's what you
expect to exist on systems where the build will be used) but
want to run the test suite with bash (e.g., since that
allows using "-x" reliably across the whole test suite).
There's currently no good way to do this.

You might think that doing two separate make invocations,
like:

  make &&
  make -C t SHELL_PATH=/bin/bash

would work. And it _almost_ does. The second make will see
our bash SHELL_PATH, and we'll use that to run the
individual test scripts (or tell prove to use it to do so).
So far so good.

But this breaks down when "--tee" or "--verbose-log" is
used. Those options cause the test script to actually
re-exec itself using $SHELL_PATH. But wait, wouldn't our
second make invocation have set SHELL_PATH correctly in the
environment?

Yes, but test-lib.sh sources GIT-BUILD-OPTIONS, which we
built during the first "make". And that overrides the
environment, giving us the original SHELL_PATH again.

Let's introduce a new variable that lets you specify a
specific shell to be run for the test scripts. Note that we
have to touch both the main and t/ Makefiles, since we have
to record it in GIT-BUILD-OPTIONS in one, and use it in the
latter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-08 09:03:38 -08:00
Junio C Hamano
4c6dad0059 Merge branch 'bw/protocol-v1'
A new mechanism to upgrade the wire protocol in place is proposed
and demonstrated that it works with the older versions of Git
without harming them.

* bw/protocol-v1:
  Documentation: document Extra Parameters
  ssh: introduce a 'simple' ssh variant
  i5700: add interop test for protocol transition
  http: tell server that the client understands v1
  connect: tell server that the client understands v1
  connect: teach client to recognize v1 server response
  upload-pack, receive-pack: introduce protocol version 1
  daemon: recognize hidden request arguments
  protocol: introduce protocol extension mechanisms
  pkt-line: add packet_write function
  connect: in ref advertisement, shallows are last
2017-12-06 09:23:44 -08:00
Jonathan Tan
88e2f9ed8e introduce fetch-object: fetch one promisor object
Introduce fetch-object, providing the ability to fetch one object from a
promisor remote.

This uses fetch-pack. To do this, the transport mechanism has been
updated with 2 flags, "from-promisor" to indicate that the resulting
pack comes from a promisor remote (and thus should be annotated as such
by index-pack), and "no-dependents" to indicate that only the objects
themselves need to be fetched (but fetching additional objects is
nevertheless safe).

Whenever "no-dependents" is used, fetch-pack will refrain from using any
object flags, because it is most likely invoked as part of a dynamic
object fetch by another Git command (which may itself use object flags).
An alternative to this is to leave fetch-pack alone, and instead update
the allocation of flags so that fetch-pack's flags never overlap with
any others, but this will end up shrinking the number of flags available
to nearly every other Git command (that is, every Git command that
accesses objects), so the approach in this commit was used instead.

This will be tested in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-05 09:46:05 -08:00
Thomas Gummerer
7c85a87c54 checkout: factor out functions to new lib file
Factor the functions out, so they can be re-used from other places.  In
particular these functions will be re-used in builtin/worktree.c to make
git worktree add dwim more.

While there add some docs to the function.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-27 09:48:06 +09:00
Jeff Hostetler
25ec7bcac0 list-objects: filter objects in traverse_commit_list
Create traverse_commit_list_filtered() and add filtering
interface to allow certain objects to be omitted from the
traversal.

Update traverse_commit_list() to be a wrapper for the above
with a null filter to minimize the number of callers that
needed to be changed.

Object filtering will be used in a future commit by rev-list
and pack-objects for partial clone and fetch to omit unwanted
objects from the result.

traverse_bitmap_commit_list() does not work with filtering.
If a packfile bitmap is present, it will not be used.  It
should be possible to extend such support in the future (at
least to simple filters that do not require object pathnames),
but that is beyond the scope of this patch series.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-22 14:11:57 +09:00
Junio C Hamano
e05336bdda Merge branch 'bp/fsmonitor'
We learned to talk to watchman to speed up "git status" and other
operations that need to see which paths have been modified.

* bp/fsmonitor:
  fsmonitor: preserve utf8 filenames in fsmonitor-watchman log
  fsmonitor: read entirety of watchman output
  fsmonitor: MINGW support for watchman integration
  fsmonitor: add a performance test
  fsmonitor: add a sample integration script for Watchman
  fsmonitor: add test cases for fsmonitor extension
  split-index: disable the fsmonitor extension when running the split index test
  fsmonitor: add a test tool to dump the index extension
  update-index: add fsmonitor support to update-index
  ls-files: Add support in ls-files to display the fsmonitor valid bit
  fsmonitor: add documentation for the fsmonitor extension.
  fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  update-index: add a new --force-write-index option
  preload-index: add override to enable testing preload-index
  bswap: add 64 bit endianness helper get_be64
2017-11-21 14:07:50 +09:00
Junio C Hamano
40bc898103 Merge branch 'js/mingw-full-version-in-resources' into maint
MinGW updates.

* js/mingw-full-version-in-resources:
  mingw: include the full version information in the resources
2017-11-15 12:05:03 +09:00
Junio C Hamano
d3e32dc90c Merge branch 'js/mingw-full-version-in-resources'
MinGW updates.

* js/mingw-full-version-in-resources:
  mingw: include the full version information in the resources
2017-11-09 14:31:31 +09:00
Johannes Schindelin
39bb86b4e5 mingw: include the full version information in the resources
This fixes https://github.com/git-for-windows/git/issues/723

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-01 13:43:52 +09:00
Brandon Williams
373d70efb2 protocol: introduce protocol extension mechanisms
Create protocol.{c,h} and provide functions which future servers and
clients can use to determine which protocol to use or is being used.

Also introduce the 'GIT_PROTOCOL' environment variable which will be
used to communicate a colon separated list of keys with optional values
to a server.  Unknown keys and values must be tolerated.  This mechanism
is used to communicate which version of the wire protocol a client would
like to use with a server.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-17 10:51:29 +09:00
Junio C Hamano
54bd705a95 Merge branch 'jt/oidmap'
Introduce a new "oidmap" API and rewrite oidset to use it.

* jt/oidmap:
  oidmap: map with OID as key
2017-10-11 14:52:22 +09:00
Junio C Hamano
1a2e1a76ec Merge branch 'mh/mmap-packed-refs'
Operations that do not touch (majority of) packed refs have been
optimized by making accesses to packed-refs file lazy; we no longer
pre-parse everything, and an access to a single ref in the
packed-refs does not touch majority of irrelevant refs, either.

* mh/mmap-packed-refs: (21 commits)
  packed-backend.c: rename a bunch of things and update comments
  mmapped_ref_iterator: inline into `packed_ref_iterator`
  ref_cache: remove support for storing peeled values
  packed_ref_store: get rid of the `ref_cache` entirely
  ref_store: implement `refs_peel_ref()` generically
  packed_read_raw_ref(): read the reference from the mmapped buffer
  packed_ref_iterator_begin(): iterate using `mmapped_ref_iterator`
  read_packed_refs(): ensure that references are ordered when read
  packed_ref_cache: keep the `packed-refs` file mmapped if possible
  packed-backend.c: reorder some definitions
  mmapped_ref_iterator_advance(): no peeled value for broken refs
  mmapped_ref_iterator: add iterator over a packed-refs file
  packed_ref_cache: remember the file-wide peeling state
  read_packed_refs(): read references with minimal copying
  read_packed_refs(): make parsing of the header line more robust
  read_packed_refs(): only check for a header at the top of the file
  read_packed_refs(): use mmap to read the `packed-refs` file
  die_unterminated_line(), die_invalid_line(): new functions
  packed_ref_cache: add a backlink to the associated `packed_ref_store`
  prefix_ref_iterator: break when we leave the prefix
  ...
2017-10-03 15:42:50 +09:00
Ben Peart
14527b3002 fsmonitor: add a performance test
Add a test utility (test-drop-caches) that flushes all changes to disk
then drops file system cache on Windows, Linux, and OSX.

Add a perf test (p7519-fsmonitor.sh) for fsmonitor.

By default, the performance test will utilize the Watchman file system
monitor if it is installed.  If Watchman is not installed, it will use a
dummy integration script that does not report any new or modified files.
The dummy script has very little overhead which provides optimistic results.

The performance test will also use the untracked cache feature if it is
available as fsmonitor uses it to speed up scanning for untracked files.

There are 4 environment variables that can be used to alter the default
behavior of the performance test:

GIT_PERF_7519_UNTRACKED_CACHE: used to configure core.untrackedCache
GIT_PERF_7519_SPLIT_INDEX: used to configure core.splitIndex
GIT_PERF_7519_FSMONITOR: used to configure core.fsmonitor
GIT_PERF_7519_DROP_CACHE: if set, the OS caches are dropped between tests

The big win for using fsmonitor is the elimination of the need to scan the
working directory looking for changed and untracked files. If the file
information is all cached in RAM, the benefits are reduced.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:23:05 +09:00
Ben Peart
dd3551f491 fsmonitor: add a test tool to dump the index extension
Add a test utility (test-dump-fsmonitor) that will dump the fsmonitor
index extension.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:23:05 +09:00
Ben Peart
883e248b8a fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
When the index is read from disk, the fsmonitor index extension is used
to flag the last known potentially dirty index entries. The registered
core.fsmonitor command is called with the time the index was last
updated and returns the list of files changed since that time. This list
is used to flag any additional dirty cache entries and untracked cache
directories.

We can then use this valid state to speed up preload_index(),
ie_match_stat(), and refresh_cache_ent() as they do not need to lstat()
files to detect potential changes for those entries marked
CE_FSMONITOR_VALID.

In addition, if the untracked cache is turned on valid_cached_dir() can
skip checking directories for new or changed files as fsmonitor will
invalidate the cache only for those directories that have been
identified as having potential changes.

To keep the CE_FSMONITOR_VALID state accurate during git operations;
when git updates a cache entry to match the current state on disk,
it will now set the CE_FSMONITOR_VALID bit.

Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit
is cleared and the corresponding untracked cache directory is marked
invalid.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:23:01 +09:00
Jonathan Tan
9e6fabde82 oidmap: map with OID as key
This is similar to using the hashmap in hashmap.c, but with an
easier-to-use API. In particular, custom entry comparisons no longer
need to be written, and lookups can be done without constructing a
temporary entry structure.

This is implemented as a thin wrapper over the hashmap API. In
particular, this means that there is an additional 4-byte overhead due
to the fact that the first 4 bytes of the hash is redundantly stored.
For now, I'm taking the simpler approach, but if need be, we can
reimplement oidmap without affecting the callers significantly.

oidset has been updated to use oidmap.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:18:03 +09:00
Michael Haggerty
5b633610ec packed_ref_cache: keep the packed-refs file mmapped if possible
Keep a copy of the `packed-refs` file contents in memory for as long
as a `packed_ref_cache` object is in use:

* If the system allows it, keep the `packed-refs` file mmapped.

* If not (either because the system doesn't support `mmap()` at all,
  or because a file that is currently mmapped cannot be replaced via
  `rename()`), then make a copy of the file's contents in
  heap-allocated space, and keep that around instead.

We base the choice of behavior on a new build-time switch,
`MMAP_PREVENTS_DELETE`. By default, this switch is set for Windows
variants.

After this commit, `MMAP_NONE` and `MMAP_TEMPORARY` are still handled
identically. But the next commit will introduce a difference.

This whole change is still pointless, because we only read the
`packed-refs` file contents immediately after instantiating the
`packed_ref_cache`. But that will soon change.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:45 +09:00
Junio C Hamano
a36f631ad6 Merge branch 'bw/git-clang-format'
"make style" runs git-clang-format to help developers by pointing
out coding style issues.

* bw/git-clang-format:
  Makefile: add style build rule
  clang-format: outline the git project's coding style
2017-09-25 15:24:08 +09:00
Junio C Hamano
09595ab381 Merge branch 'jk/leak-checkers'
Many of our programs consider that it is OK to release dynamic
storage that is used throughout the life of the program by simply
exiting, but this makes it harder to leak detection tools to avoid
reporting false positives.  Plug many existing leaks and introduce
a mechanism for developers to mark that the region of memory
pointed by a pointer is not lost/leaking to help these tools.

* jk/leak-checkers:
  add UNLEAK annotation for reducing leak false positives
  set_git_dir: handle feeding gitdir to itself
  repository: free fields before overwriting them
  reset: free allocated tree buffers
  reset: make tree counting less confusing
  config: plug user_config leak
  update-index: fix cache entry leak in add_one_file()
  add: free leaked pathspec after add_files_to_cache()
  test-lib: set LSAN_OPTIONS to abort by default
  test-lib: --valgrind should not override --verbose-log
2017-09-19 10:47:55 +09:00
Junio C Hamano
cb6ec86d29 Merge branch 'ti/external-sha1dc'
Platforms that ship with a separate sha1 with collision detection
library can link to it instead of using the copy we ship as part of
our source tree.

* ti/external-sha1dc:
  sha1dc: allow building with the external sha1dc library
  sha1dc: build git plumbing code more explicitly
2017-09-19 10:47:50 +09:00
Junio C Hamano
8134746d1d Merge branch 'jn/vcs-svn-cleanup' into maint
Code clean-up.

* jn/vcs-svn-cleanup:
  vcs-svn: move remaining repo_tree functions to fast_export.h
  vcs-svn: remove repo_delete wrapper function
  vcs-svn: remove custom mode constants
  vcs-svn: remove more unused prototypes and declarations
2017-09-10 17:03:09 +09:00
Jeff King
0e5bba53af add UNLEAK annotation for reducing leak false positives
It's a common pattern in git commands to allocate some
memory that should last for the lifetime of the program and
then not bother to free it, relying on the OS to throw it
away.

This keeps the code simple, and it's fast (we don't waste
time traversing structures or calling free at the end of the
program). But it also triggers warnings from memory-leak
checkers like valgrind or LSAN. They know that the memory
was still allocated at program exit, but they don't know
_when_ the leaked memory stopped being useful. If it was
early in the program, then it's probably a real and
important leak. But if it was used right up until program
exit, it's not an interesting leak and we'd like to suppress
it so that we can see the real leaks.

This patch introduces an UNLEAK() macro that lets us do so.
To understand its design, let's first look at some of the
alternatives.

Unfortunately the suppression systems offered by
leak-checking tools don't quite do what we want. A
leak-checker basically knows two things:

  1. Which blocks were allocated via malloc, and the
     callstack during the allocation.

  2. Which blocks were left un-freed at the end of the
     program (and which are unreachable, but more on that
     later).

Their suppressions work by mentioning the function or
callstack of a particular allocation, and marking it as OK
to leak.  So imagine you have code like this:

  int cmd_foo(...)
  {
	/* this allocates some memory */
	char *p = some_function();
	printf("%s", p);
	return 0;
  }

You can say "ignore allocations from some_function(),
they're not leaks". But that's not right. That function may
be called elsewhere, too, and we would potentially want to
know about those leaks.

So you can say "ignore the callstack when main calls
some_function".  That works, but your annotations are
brittle. In this case it's only two functions, but you can
imagine that the actual allocation is much deeper. If any of
the intermediate code changes, you have to update the
suppression.

What we _really_ want to say is that "the value assigned to
p at the end of the function is not a real leak". But
leak-checkers can't understand that; they don't know about
"p" in the first place.

However, we can do something a little bit tricky if we make
some assumptions about how leak-checkers work. They
generally don't just report all un-freed blocks. That would
report even globals which are still accessible when the
leak-check is run.  Instead they take some set of memory
(like BSS) as a root and mark it as "reachable". Then they
scan the reachable blocks for anything that looks like a
pointer to a malloc'd block, and consider that block
reachable. And then they scan those blocks, and so on,
transitively marking anything reachable from a global as
"not leaked" (or at least leaked in a different category).

So we can mark the value of "p" as reachable by putting it
into a variable with program lifetime. One way to do that is
to just mark "p" as static. But that actually affects the
run-time behavior if the function is called twice (you
aren't likely to call main() twice, but some of our cmd_*()
functions are called from other commands).

Instead, we can trick the leak-checker by putting the value
into _any_ reachable bytes. This patch keeps a global
linked-list of bytes copied from "unleaked" variables. That
list is reachable even at program exit, which confers
recursive reachability on whatever values we unleak.

In other words, you can do:

  int cmd_foo(...)
  {
	char *p = some_function();
	printf("%s", p);
	UNLEAK(p);
	return 0;
  }

to annotate "p" and suppress the leak report.

But wait, couldn't we just say "free(p)"? In this toy
example, yes. But UNLEAK()'s byte-copying strategy has
several advantages over actually freeing the memory:

  1. It's recursive across structures. In many cases our "p"
     is not just a pointer, but a complex struct whose
     fields may have been allocated by a sub-function. And
     in some cases (e.g., dir_struct) we don't even have a
     function which knows how to free all of the struct
     members.

     By marking the struct itself as reachable, that confers
     reachability on any pointers it contains (including those
     found in embedded structs, or reachable by walking
     heap blocks recursively.

  2. It works on cases where we're not sure if the value is
     allocated or not. For example:

       char *p = argc > 1 ? argv[1] : some_function();

     It's safe to use UNLEAK(p) here, because it's not
     freeing any memory. In the case that we're pointing to
     argv here, the reachability checker will just ignore
     our bytes.

  3. Likewise, it works even if the variable has _already_
     been freed. We're just copying the pointer bytes. If
     the block has been freed, the leak-checker will skip
     over those bytes as uninteresting.

  4. Because it's not actually freeing memory, you can
     UNLEAK() before we are finished accessing the variable.
     This is helpful in cases like this:

       char *p = some_function();
       return another_function(p);

     Writing this with free() requires:

       int ret;
       char *p = some_function();
       ret = another_function(p);
       free(p);
       return ret;

     But with unleak we can just write:

       char *p = some_function();
       UNLEAK(p);
       return another_function(p);

This patch adds the UNLEAK() macro and enables it
automatically when Git is compiled with SANITIZE=leak.  In
normal builds it's a noop, so we pay no runtime cost.

It also adds some UNLEAK() annotations to show off how the
feature works. On top of other recent leak fixes, these are
enough to get t0000 and t0001 to pass when compiled with
LSAN.

Note the case in commit.c which actually converts a
strbuf_release() into an UNLEAK. This code was already
non-leaky, but the free didn't do anything useful, since
we're exiting. Converting it to an annotation means that
non-leak-checking builds pay no runtime cost. The cost is
minimal enough that it's probably not worth going on a
crusade to convert these kinds of frees to UNLEAKS. I did it
here for consistency with the "sb" leak (though it would
have been equally correct to go the other way, and turn them
both into strbuf_release() calls).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-08 15:43:17 +09:00
Junio C Hamano
eabdcd4ab4 Merge branch 'jt/packmigrate'
Code movement to make it easier to hack later.

* jt/packmigrate: (23 commits)
  pack: move for_each_packed_object()
  pack: move has_pack_index()
  pack: move has_sha1_pack()
  pack: move find_pack_entry() and make it global
  pack: move find_sha1_pack()
  pack: move find_pack_entry_one(), is_pack_valid()
  pack: move check_pack_index_ptr(), nth_packed_object_offset()
  pack: move nth_packed_object_{sha1,oid}
  pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry()
  pack: move unpack_object_header()
  pack: move get_size_from_delta()
  pack: move unpack_object_header_buffer()
  pack: move {,re}prepare_packed_git and approximate_object_count
  pack: move install_packed_git()
  pack: move add_packed_git()
  pack: move unuse_pack()
  pack: move use_pack()
  pack: move pack-closing functions
  pack: move release_pack_memory()
  pack: move open_pack_index(), parse_pack_index()
  ...
2017-08-26 22:55:09 -07:00
Junio C Hamano
030faf2fa5 Merge branch 'kw/write-index-reduce-alloc'
We used to spend more than necessary cycles allocating and freeing
piece of memory while writing each index entry out.  This has been
optimized.

* kw/write-index-reduce-alloc:
  read-cache: avoid allocating every ondisk entry when writing
  read-cache: fix memory leak in do_write_index
  perf: add test for writing the index
2017-08-26 22:55:08 -07:00
Junio C Hamano
18c88f9af6 Merge branch 'jn/vcs-svn-cleanup'
Code clean-up.

* jn/vcs-svn-cleanup:
  vcs-svn: move remaining repo_tree functions to fast_export.h
  vcs-svn: remove repo_delete wrapper function
  vcs-svn: remove custom mode constants
  vcs-svn: remove more unused prototypes and declarations
2017-08-26 22:55:06 -07:00
Jonathan Tan
4f39cd821d pack: move pack name-related functions
Currently, sha1_file.c and cache.h contain many functions, both related
to and unrelated to packfiles. This makes both files very large and
causes an unclear separation of concerns.

Create a new file, packfile.c, to hold all packfile-related functions
currently in sha1_file.c. It has a corresponding header packfile.h.

In this commit, the pack name-related functions are moved. Subsequent
commits will move the other functions.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23 15:12:06 -07:00
Jonathan Nieder
b8f43b120b vcs-svn: move remaining repo_tree functions to fast_export.h
These used to be for manipulating the in-memory repo_tree structure,
but nowadays they are convenience wrappers to handle a few git-vs-svn
mismatches:

 1. Git does not track empty directories but Subversion does.  When
    looking up a path in git that Subversion thinks exists and finding
    nothing, we can safely assume that the path represents a
    directory.  This is needed when a later Subversion revision
    modifies that directory.

 2. Subversion allows deleting a file by copying.  In Git fast-import
    we have to handle that more explicitly as a deletion.

These are details of the tool's interaction with git fast-import.
Move them to fast_export.c, where other such details are handled.

This way the function names do not start with a repo_ prefix that
would clash with the repository object introduced in
v2.14.0-rc0~38^2~16 (repository: introduce the repository object,
2017-06-22) or an svn_ prefix that would clash with libsvn (in case
someone wants to link this code with libsvn some day).

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23 10:41:26 -07:00
Junio C Hamano
44c2339e55 Merge branch 'mh/packed-ref-store'
The "ref-store" code reorganization continues.

* mh/packed-ref-store: (32 commits)
  files-backend: cheapen refname_available check when locking refs
  packed_ref_store: handle a packed-refs file that is a symlink
  read_packed_refs(): die if `packed-refs` contains bogus data
  t3210: add some tests of bogus packed-refs file contents
  repack_without_refs(): don't lock or unlock the packed refs
  commit_packed_refs(): remove call to `packed_refs_unlock()`
  clear_packed_ref_cache(): don't protest if the lock is held
  packed_refs_unlock(), packed_refs_is_locked(): new functions
  packed_refs_lock(): report errors via a `struct strbuf *err`
  packed_refs_lock(): function renamed from lock_packed_refs()
  commit_packed_refs(): use a staging file separate from the lockfile
  commit_packed_refs(): report errors rather than dying
  packed_ref_store: make class into a subclass of `ref_store`
  packed-backend: new module for handling packed references
  packed_read_raw_ref(): new function, replacing `resolve_packed_ref()`
  packed_ref_store: support iteration
  packed_peel_ref(): new function, extracted from `files_peel_ref()`
  repack_without_refs(): take a `packed_ref_store *` parameter
  get_packed_ref(): take a `packed_ref_store *` parameter
  rollback_packed_refs(): take a `packed_ref_store *` parameter
  ...
2017-08-22 10:29:16 -07:00
Kevin Willford
3921a0b3c3 perf: add test for writing the index
A performance test for writing the index to be able to
determine if changes to allocating ondisk structure help.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-21 15:56:53 -07:00
Takashi Iwai
3964cbbb5c sha1dc: allow building with the external sha1dc library
Some distros provide SHA1 collision-detect code as a shared library.
It's the same code as we have in git tree (but may be with a different
init default for hash), and git can link with it as well; at least, it
may make maintenance easier, according to our security guys.

This patch allows user to build git linking with the external sha1dc
library instead of the built-in code.  User needs to define
DC_SHA1_EXTERNAL explicitly.  As default without it, the built-in
sha1dc code is used like before.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-16 14:44:25 -07:00
Takashi Iwai
36f048c5e4 sha1dc: build git plumbing code more explicitly
The plumbing code between sha1dc and git is defined in
sha1dc_git.[ch], but these aren't compiled / included directly but
only via the indirect inclusion from sha1dc code.  This is slightly
confusing when you try to trace the build flow.

This patch brings the following changes for simplification:

  - Make sha1dc_git.c stand-alone and build from Makefile

  - sha1dc_git.h is the common header to include further sha1.h
    depending on the build condition

  - Move comments for plumbing codes from the header to definitions

This is also meant as a preliminary work for further plumbing with
external sha1dc shlib.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-16 14:43:59 -07:00
Brandon Williams
2118805b92 Makefile: add style build rule
Add the 'style' build rule which will run git-clang-format on the diff
between HEAD and the current worktree.  The result is a diff of
suggested changes.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-14 15:26:22 -07:00
Junio C Hamano
a491307448 Merge branch 'jc/po-pritime-fix'
We started using "%" PRItime, imitating "%" PRIuMAX and friends, as
a way to format the internal timestamp value, but this does not
play well with gettext(1) i18n framework, and causes "make pot"
that is run by the l10n coordinator to create a broken po/git.pot
file.  This is a possible workaround for that problem.

* jc/po-pritime-fix:
  Makefile: help gettext tools to cope with our custom PRItime format
2017-07-21 14:57:37 -07:00
Junio C Hamano
e4efb39555 Merge branch 'jk/build-with-asan'
A recent update made it easier to use "-fsanitize=" option while
compiling but supported only one sanitize option.  Allow more than
one to be combined, joined with a comma, like "make SANITIZE=foo,bar".

* jk/build-with-asan:
  Makefile: allow combining UBSan with other sanitizers
2017-07-20 16:29:59 -07:00
Junio C Hamano
fc0fd5b23b Makefile: help gettext tools to cope with our custom PRItime format
We started using our own timestamp_t type and PRItime format
specifier to go along with it, so that we can later change the
underlying type and output format more easily, but this does not
play well with gettext tools.

Because gettext tools need to keep the *.po file portable across
platforms, they have to special-case the format specifiers like
PRIuMAX that are known types in inttypes.h, instead of letting CPP
handle strings like

    "%" PRIuMAX " seconds ago"

as an ordinary string concatenation.  They fundamentally cannot do
the same for our own custom type/format.

Given that po/git.pot needs to be generated only once every release
and by only one person, i.e. the l10n coordinator, let's update the
Makefile rule to generate po/git.pot so that gettext tools are run
on a munged set of sources in which all mentions of PRItime are
replaced with PRIuMAX, which is what we happen to use right now.

This way, developers do not have to care that PRItime does not play
well with gettext, and translators do not have to care that we use
our own PRItime.

The credit for the idea to munge the source files goes to Dscho.
Possible bugs are mine.

Helped-by: Jiang Xin <worldhello.net@gmail.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-20 12:21:18 -07:00
René Scharfe
425ca6710b Makefile: allow combining UBSan with other sanitizers
Multiple sanitizers can be specified as a comma-separated list.  Set
the flag NO_UNALIGNED_LOADS even if UndefinedBehaviorSanitizer is not
the only sanitizer to build with.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17 14:50:27 -07:00
Junio C Hamano
757e9874be Merge branch 'jk/build-with-asan'
The build procedure has been improved to allow building and testing
Git with address sanitizer more easily.

* jk/build-with-asan:
  Makefile: disable unaligned loads with UBSan
  Makefile: turn off -fomit-frame-pointer with sanitizers
  Makefile: add helper for compiling with -fsanitize
  test-lib: turn on ASan abort_on_error by default
  test-lib: set ASAN_OPTIONS variable before we run git
2017-07-13 16:14:54 -07:00
Junio C Hamano
2db87328ef Merge branch 'ab/sha1dc'
The "collission-detecting" implementation of SHA-1 hash we borrowed
from is replaced by directly binding the upstream project as our
submodule.  Glitches on minority platforms are still being worked out.

* ab/sha1dc:
  sha1collisiondetection: automatically enable when submodule is populated
  sha1dc: optionally use sha1collisiondetection as a submodule
2017-07-10 13:42:51 -07:00
Jeff King
566cf0b3bd Makefile: disable unaligned loads with UBSan
The undefined behavior sanitizer complains about unaligned
loads, even if they're OK for a particular platform in
practice. It's possible that they _are_ a problem, of
course, but since it's a known tradeoff the UBSan errors are
just noise.

Let's quiet it automatically by building with
NO_UNALIGNED_LOADS when SANITIZE=undefined is in use.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-10 10:02:31 -07:00
Jeff King
ddbc8a6d3e Makefile: turn off -fomit-frame-pointer with sanitizers
The ASan manual recommends disabling this optimization, as
it can make the backtraces produced by the tool harder to
follow (and since this is a test-debug build, we don't care
about squeezing out every last drop of performance).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-10 10:02:30 -07:00
Jeff King
56b5db30d0 Makefile: add helper for compiling with -fsanitize
You can already build and test with ASan by doing:

  make CFLAGS=-fsanitize=address test

but there are a few slight annoyances:

  1. It's a little long to type.

  2. It override your CFLAGS completely. You'd probably
     still want -O2, for instance.

  3. It's a good idea to also turn off "recovery", which
     lets the program keep running after a problem is
     detected (with the intention of finding as many bugs as
     possible in a given run). Since Git's test suite should
     generally run without triggering any problems, it's
     better to abort immediately and fail the test when we
     do find an issue.

With this patch, all of that happens automatically when you
run:

  make SANITIZE=address test

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-10 10:02:29 -07:00
Junio C Hamano
85ce4a6828 Merge branch 'bw/repo-object'
Introduce a "repository" object to eventually make it easier to
work in multiple repositories (the primary focus is to work with
the superproject and its submodules) in a single process.

* bw/repo-object:
  ls-files: use repository object
  repository: enable initialization of submodules
  submodule: convert is_submodule_initialized to work on a repository
  submodule: add repo_read_gitmodules
  submodule-config: store the_submodule_cache in the_repository
  repository: add index_state to struct repo
  config: read config from a repository object
  path: add repo_worktree_path and strbuf_repo_worktree_path
  path: add repo_git_path and strbuf_repo_git_path
  path: worktree_git_path() should not use file relocation
  path: convert do_git_path to take a 'struct repository'
  path: convert strbuf_git_common_path to take a 'struct repository'
  path: always pass in commondir to update_common_dir
  path: create path.h
  environment: store worktree in the_repository
  environment: place key repository state in the_repository
  repository: introduce the repository object
  environment: remove namespace_len variable
  setup: add comment indicating a hack
  setup: don't perform lazy initialization of repository state
2017-07-05 13:32:56 -07:00
Junio C Hamano
cac87dc01d sha1collisiondetection: automatically enable when submodule is populated
If a user wants to experiment with the version of collision
detecting sha1 from the submodule, the user needed to not just
populate the submodule but also needed to turn the knob.

A Makefile trick is easy enough to do so, so let's do this.  When
somebody with a copy of the submodule populated wants not to use it,
that can be done by overriding it in config.mak or from the command
line.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-03 10:09:37 -07:00
Ævar Arnfjörð Bjarmason
86cfd61e6b sha1dc: optionally use sha1collisiondetection as a submodule
Add an option to use the sha1collisiondetection library from the
submodule in sha1collisiondetection/ instead of in the copy in the
sha1dc/ directory.

This allows us to try out the submodule in sha1collisiondetection
without breaking the build for anyone who's not expecting them as we
work out any kinks.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-03 10:09:34 -07:00
Brandon Williams
359efeffc1 repository: introduce the repository object
Introduce the repository object 'struct repository' which can be used to
hold all state pertaining to a git repository.

Some of the benefits of object-ifying a repository are:

  1. Make the code base more readable and easier to reason about.

  2. Allow for working on multiple repositories, specifically
     submodules, within the same process.  Currently the process for
     working on a submodule involves setting up an argv_array of options
     for a particular command and then launching a child process to
     execute the command in the context of the submodule.  This is
     clunky and can require lots of little hacks in order to ensure
     correctness.  Ideally it would be nice to simply pass a repository
     and an options struct to a command.

  3. Eliminating reliance on global state will make it easier to
     enable the use of threading to improve performance.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 18:24:34 -07:00
Michael Haggerty
67be7c5a59 packed-backend: new module for handling packed references
Now that the interface between `files_ref_store` and
`packed_ref_store` is relatively narrow, move the latter into a new
module, "refs/packed-backend.h" and "refs/packed-backend.c". It still
doesn't quite implement the `ref_store` interface, but it will soon.

This commit moves code around and adjusts its visibility, but doesn't
change anything.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 13:27:32 -07:00
Junio C Hamano
df7fd961a9 Merge branch 'nd/fopen-errors'
Hotfix for a topic that is already in 'master'.

* nd/fopen-errors:
  configure.ac: loosen FREAD_READS_DIRECTORIES test program
2017-06-22 14:15:20 -07:00
Junio C Hamano
ae7e4d4fed Merge branch 'ab/pcre-v2'
Update "perl-compatible regular expression" support to enable JIT
and also allow linking with the newer PCRE v2 library.

* ab/pcre-v2:
  grep: add support for PCRE v2
  grep: un-break building with PCRE >= 8.32 without --enable-jit
  grep: un-break building with PCRE < 8.20
  grep: un-break building with PCRE < 8.32
  grep: add support for the PCRE v1 JIT API
  log: add -P as a synonym for --perl-regexp
  grep: skip pthreads overhead when using one thread
  grep: don't redundantly compile throwaway patterns under threading
2017-06-19 12:38:43 -07:00
Jeff King
3adf9fdecf configure.ac: loosen FREAD_READS_DIRECTORIES test program
We added an FREAD_READS_DIRECTORIES Makefile knob long ago
in cba22528f (Add compat/fopen.c which returns NULL on
attempt to open directory, 2008-02-08) to handle systems
where reading from a directory returned garbage. This works
by catching the problem at the fopen() stage and returning
NULL.

More recently, we found that there is a class of systems
(including Linux) where fopen() succeeds but fread() fails.
Since the solution is the same (having fopen return NULL),
they use the same Makefile knob as of e2d90fd1c
(config.mak.uname: set FREAD_READS_DIRECTORIES for Linux and
FreeBSD, 2017-05-03).

This works fine except for one thing: the autoconf test in
configure.ac to set FREAD_READS_DIRECTORIES actually checks
whether fread succeeds. Which means that on Linux systems,
the knob isn't set (and we even override the config.mak.uname
default). t1308 catches the failure.

We can fix this by tweaking the autoconf test to cover both
cases. In theory we might care about the distinction between
the traditional "fread reads directories" case and the new
"fopen opens directories". But since our solution catches
the problem at the fopen stage either way, we don't actually
need to know the difference. The "fopen" case is a superset.

This does mean the FREAD_READS_DIRECTORIES name is slightly
misleading. Probably FOPEN_OPENS_DIRECTORIES would be more
accurate. But it would be disruptive to simply change the
name (people's existing build configs would fail), and it's
not worth the complexity of handling both. Let's just add a
comment in the knob description.

Reported-by: Øyvind A. Holm <sunny@sunbase.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15 14:14:33 -07:00
Junio C Hamano
583c6a2295 Merge branch 'js/blame-lib'
The internal logic used in "git blame" has been libified to make it
easier to use by cgit.

* js/blame-lib: (29 commits)
  blame: move entry prepend to libgit
  blame: move scoreboard setup to libgit
  blame: move scoreboard-related methods to libgit
  blame: move fake-commit-related methods to libgit
  blame: move origin-related methods to libgit
  blame: move core structures to header
  blame: create entry prepend function
  blame: create scoreboard setup function
  blame: create scoreboard init function
  blame: rework methods that determine 'final' commit
  blame: wrap blame_sort and compare_blame_final
  blame: move progress updates to a scoreboard callback
  blame: make sanity_check use a callback in scoreboard
  blame: move no_whole_file_rename flag to scoreboard
  blame: move xdl_opts flags to scoreboard
  blame: move show_root flag to scoreboard
  blame: move reverse flag to scoreboard
  blame: move contents_from to scoreboard
  blame: move copy/move thresholds to scoreboard
  blame: move stat counters to scoreboard
  ...
2017-06-05 09:18:12 +09:00
Junio C Hamano
2281b8a362 Merge branch 'ab/sha1dc-maint'
The "collision detecting" SHA-1 implementation shipped with 2.13
was quite broken on some big-endian platforms and/or platforms that
do not like unaligned fetches.  Update to the upstream code which
has already fixed these issues.

* ab/sha1dc-maint:
  sha1dc: update from upstream
2017-06-04 09:55:41 +09:00
Junio C Hamano
36dcb57337 Merge branch 'ab/grep-preparatory-cleanup'
The internal implementation of "git grep" has seen some clean-up.

* ab/grep-preparatory-cleanup: (31 commits)
  grep: assert that threading is enabled when calling grep_{lock,unlock}
  grep: given --threads with NO_PTHREADS=YesPlease, warn
  pack-objects: fix buggy warning about threads
  pack-objects & index-pack: add test for --threads warning
  test-lib: add a PTHREADS prerequisite
  grep: move is_fixed() earlier to avoid forward declaration
  grep: change internal *pcre* variable & function names to be *pcre1*
  grep: change the internal PCRE macro names to be PCRE1
  grep: factor test for \0 in grep patterns into a function
  grep: remove redundant regflags assignments
  grep: catch a missing enum in switch statement
  perf: add a comparison test of log --grep regex engines with -F
  perf: add a comparison test of log --grep regex engines
  perf: add a comparison test of grep regex engines with -F
  perf: add a comparison test of grep regex engines
  perf: emit progress output when unpacking & building
  perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do
  grep: add tests to fix blind spots with \0 patterns
  grep: prepare for testing binary regexes containing rx metacharacters
  grep: add a test helper function for less verbose -f \0 tests
  ...
2017-06-02 15:06:06 +09:00
Ævar Arnfjörð Bjarmason
94da9193a6 grep: add support for PCRE v2
Add support for v2 of the PCRE API. This is a new major version of
PCRE that came out in early 2015[1].

The regular expression syntax is the same, but while the API is
similar, pretty much every function is either renamed or takes
different arguments. Thus using it via entirely new functions makes
sense, as opposed to trying to e.g. have one compile_pcre_pattern()
that would call either PCRE v1 or v2 functions.

Git can now be compiled with either USE_LIBPCRE1=YesPlease or
USE_LIBPCRE2=YesPlease, with USE_LIBPCRE=YesPlease currently being a
synonym for the former. Providing both is a compile-time error.

With earlier patches to enable JIT for PCRE v1 the performance of the
release versions of both libraries is almost exactly the same, with
PCRE v2 being around 1% slower.

However after I reported this to the pcre-dev mailing list[2] I got a
lot of help with the API use from Zoltán Herczeg, he subsequently
optimized some of the JIT functionality in v2 of the library.

Running the p7820-grep-engines.sh performance test against the latest
Subversion trunk of both, with both them and git compiled as -O3, and
the test run against linux.git, gives the following results. Just the
/perl/ tests shown:

    $ GIT_PERF_REPEAT_COUNT=30 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_MAKE_COMMAND='grep -q LIBPCRE2 Makefile && make -j8 USE_LIBPCRE2=YesPlease CC=~/perl5/installed/bin/gcc NO_R_TO_GCC_LINKER=YesPlease CFLAGS=-O3 LIBPCREDIR=/home/avar/g/pcre2/inst LDFLAGS=-Wl,-rpath,/home/avar/g/pcre2/inst/lib || make -j8 USE_LIBPCRE=YesPlease CC=~/perl5/installed/bin/gcc NO_R_TO_GCC_LINKER=YesPlease CFLAGS=-O3 LIBPCREDIR=/home/avar/g/pcre/inst LDFLAGS=-Wl,-rpath,/home/avar/g/pcre/inst/lib' ./run HEAD~5 HEAD~ HEAD p7820-grep-engines.sh
    [...]
    Test                                            HEAD~5            HEAD~                    HEAD
    -----------------------------------------------------------------------------------------------------------------
    7820.3: perl grep 'how.to'                      0.31(1.10+0.48)   0.21(0.35+0.56) -32.3%   0.21(0.34+0.55) -32.3%
    7820.7: perl grep '^how to'                     0.56(2.70+0.40)   0.24(0.64+0.52) -57.1%   0.20(0.28+0.60) -64.3%
    7820.11: perl grep '[how] to'                   0.56(2.66+0.38)   0.29(0.95+0.45) -48.2%   0.23(0.45+0.54) -58.9%
    7820.15: perl grep '(e.t[^ ]*|v.ry) rare'       1.02(5.77+0.42)   0.31(1.02+0.54) -69.6%   0.23(0.50+0.54) -77.5%
    7820.19: perl grep 'm(ú|u)lt.b(æ|y)te'          0.38(1.57+0.42)   0.27(0.85+0.46) -28.9%   0.21(0.33+0.57) -44.7%

See commit ("perf: add a comparison test of grep regex engines",
2017-04-19) for details on the machine the above test run was executed
on.

Here HEAD~2 is git with PCRE v1 without JIT, HEAD~ is PCRE v1 with
JIT, and HEAD is PCRE v2 (also with JIT). See previous commits of mine
mentioning p7820-grep-engines.sh for more details on the test setup.

For ease of readability, a different run just of HEAD~ (PCRE v1 with
JIT v.s. PCRE v2), again with just the /perl/ tests shown:

    [...]
    Test                                            HEAD~             HEAD
    ----------------------------------------------------------------------------------------
    7820.3: perl grep 'how.to'                      0.21(0.42+0.52)   0.21(0.31+0.58) +0.0%
    7820.7: perl grep '^how to'                     0.25(0.65+0.50)   0.20(0.31+0.57) -20.0%
    7820.11: perl grep '[how] to'                   0.30(0.90+0.50)   0.23(0.46+0.53) -23.3%
    7820.15: perl grep '(e.t[^ ]*|v.ry) rare'       0.30(1.19+0.38)   0.23(0.51+0.51) -23.3%
    7820.19: perl grep 'm(ú|u)lt.b(æ|y)te'          0.27(0.84+0.48)   0.21(0.34+0.57) -22.2%

I.e. the two are either neck-to-neck, but PCRE v2 usually pulls ahead,
when it does it's around 20% faster.

A brief note on thread safety: As noted in pcre2api(3) & pcre2jit(3)
the compiled pattern can be shared between threads, but not some of
the JIT context, however the grep threading support does all pattern &
JIT compilation in separate threads, so this code doesn't need to
concern itself with thread safety.

See commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09) for the
initial addition of PCRE v1. This change follows some of the same
patterns it did (and which were discussed on list at the time),
e.g. mocking up types with typedef instead of ifdef-ing them out when
USE_LIBPCRE2 isn't defined. This adds some trivial memory use to the
program, but makes the code look nicer.

1. https://lists.exim.org/lurker/message/20150105.162835.0666407a.en.html
2. https://lists.exim.org/lurker/thread/20170419.172322.833ee099.en.html

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-02 08:29:05 +09:00
Ævar Arnfjörð Bjarmason
fb95e2e38d grep: un-break building with PCRE >= 8.32 without --enable-jit
Amend my change earlier in this series ("grep: add support for the
PCRE v1 JIT API", 2017-04-11) to un-break the build on PCRE v1
versions later than 8.31 compiled without --enable-jit.

As explained in that change and a later compatibility change in this
series ("grep: un-break building with PCRE < 8.32", 2017-05-10) the
pcre_jit_exec() function is a faster path to execute the JIT.

Unfortunately there's no compatibility stub for that function compiled
into the library if pcre_config(PCRE_CONFIG_JIT, &ret) would return 0,
and no macro that can be used to check for it, so the only portable
option to support builds without --enable-jit is via a new
NO_LIBPCRE1_JIT=UnfortunatelyYes Makefile option[1].

Another option would be to make the JIT opt-in via
USE_LIBPCRE1_JIT=YesPlease, after all it's not a default option of
PCRE v1.

I think it makes more sense to make it opt-out since even though it's
not a default option, most packagers of PCRE seem to turn it on by
default, with the notable exception of the MinGW package.

Make the MinGW platform work by default by changing the build defaults
to turn on NO_LIBPCRE1_JIT=UnfortunatelyYes. It is the only platform
that turns on USE_LIBPCRE=YesPlease by default, see commit
df5218b4c3 ("config.mak.uname: support MSys2", 2016-01-13) for that
change.

1. "How do I support pcre1 JIT on all
   versions?"  (https://lists.exim.org/lurker/thread/20170601.103148.10253788.en.html)

2. https://github.com/Alexpux/MINGW-packages/blob/master/mingw-w64-pcre/PKGBUILD
   (referenced from "Re: PCRE v2 compile error, was Re: What's cooking
   in git.git (May 2017, #01; Mon, 1)";
   <alpine.DEB.2.20.1705021756530.3480@virtualbox>)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-02 08:29:05 +09:00
Junio C Hamano
ae7785de0e Merge branch 'bp/sub-process-convert-filter'
Code from "conversion using external process" codepath has been
extracted to a separate sub-process.[ch] module.

* bp/sub-process-convert-filter:
  convert: update subprocess_read_status() to not die on EOF
  sub-process: move sub-process functions into separate files
  convert: rename reusable sub-process functions
  convert: update generic functions to only use generic data structures
  convert: separate generic structures and variables from the filter specific ones
  convert: split start_multi_file_filter() into two separate functions
  pkt-line: annotate packet_writel with LAST_ARG_MUST_BE_NULL
  convert: move packet_write_line() into pkt-line as packet_writel()
  pkt-line: add packet_read_line_gently()
  pkt-line: fix packet_read_line() to handle len < 0 errors
  convert: remove erroneous tests for errno == EPIPE
2017-05-30 11:16:42 +09:00
Ævar Arnfjörð Bjarmason
68c7d2761d test-lib: add a PTHREADS prerequisite
Add a PTHREADS prerequisite which is false when git is compiled with
NO_PTHREADS=YesPlease.

There's lots of custom code that runs when threading isn't available,
but before this prerequisite there was no way to test it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26 12:52:37 +09:00
Ævar Arnfjörð Bjarmason
3485bea157 grep: change the internal PCRE macro names to be PCRE1
Change the internal USE_LIBPCRE define, & build options flag to use a
naming convention ending in PCRE1, without changing the long-standing
USE_LIBPCRE Makefile flag which enables this code.

This is for preparation for libpcre2 support where having things like
USE_LIBPCRE and USE_LIBPCRE2 in any more places than we absolutely
need to for backwards compatibility with old Makefile arguments would
be confusing.

In some ways it would be better to change everything that now uses
USE_LIBPCRE to use USE_LIBPCRE1, and to make specifying
USE_LIBPCRE (or --with-pcre) an error. This would impose a one-time
burden on packagers of git to s/USE_LIBPCRE/USE_LIBPCRE1/ in their
build scripts.

However I'd like to leave the door open to making
USE_LIBPCRE=YesPlease eventually mean USE_LIBPCRE2=YesPlease,
i.e. once PCRE v2 is ubiquitous enough that it makes sense to make it
the default.

This code and the USE_LIBPCRE Makefile argument was added in commit
63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09). At the time there was
no indication that the PCRE project would release an entirely new &
incompatible API around 3 years later.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26 12:52:37 +09:00
Jeff Smith
f5dd754c36 blame: move origin-related methods to libgit
Signed-off-by: Jeff Smith <whydoubt@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-25 13:08:22 +09:00
Ævar Arnfjörð Bjarmason
a0103914c2 sha1dc: update from upstream
Update sha1dc from the latest version by the upstream
maintainer[1].

This version includes a commit of mine which allows for replacing the
local modifications done to the upstream files in git.git with macro
definitions to monkeypatch it in place.

It also brings in a change[2] upstream made for the breakage 2.13.0
introduced on SPARC and other platforms that forbid unaligned
access[3].

This means that the code customizations done since the initial import
in commit 28dc98e343 ("sha1dc: add collision-detecting sha1
implementation", 2017-03-16) can be done purely via Makefile
definitions and by including the content of our own sha1dc_git.[ch] in
sha1dc/sha1.c via a macro.

1. cc465543b3
2. 33a694a9ee
3. "Git 2.13.0 segfaults on Solaris SPARC due to DC_SHA1=YesPlease
   being on by default"
   (https://public-inbox.org/git/CACBZZX6nmKK8af0-UpjCKWV4R+hV-uk2xWXVA5U+_UQ3VXU03g@mail.gmail.com/)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-22 10:20:46 +09:00
Ævar Arnfjörð Bjarmason
88b6197d0b perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do
Add a git GIT_PERF_MAKE_COMMAND variable to compliment the existing
GIT_PERF_MAKE_OPTS facility. This allows specifying an arbitrary shell
command to execute instead of 'make'.

This is useful e.g. in cases where the name, semantics or defaults of
a Makefile flag have changed over time. It can even be used to change
the contents of the tree, useful for monkeypatching ancient versions
of git to get them to build.

This opens Pandora's box in some ways, it's now possible to
"jailbreak" the perf environment and e.g. modify the source tree via
this arbitrary instead of just issuing a custom "make" command, such a
command has to be re-entrant in the sense that subsequent perf runs
will re-use the possibly modified tree.

It would be pointless to try to mitigate or work around that caveat in
a tool purely aimed at Git developers, so this change makes no attempt
to do so.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21 08:25:38 +09:00
Ævar Arnfjörð Bjarmason
072473e659 Makefile & configure: reword inaccurate comment about PCRE
Reword an outdated & inaccurate comment which suggests that only
git-grep can use PCRE.

This comment was added back when PCRE support was initially added in
commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09), and was true
at the time.

It hasn't been telling the full truth since git-log learned to use
PCRE with --grep in commit 727b6fc3ed ("log --grep: accept
--basic-regexp and --perl-regexp", 2012-10-03), and more importantly
is likely to get more inaccurate over time as more use is made of PCRE
in other areas.

Reword it to be more future-proof, and to more clearly explain that
this enables user-initiated runtime behavior.

Copy/pasting this so much in configure.ac is lame, these Makefile-like
flags aren't even used by autoconf, just the corresponding
--with[out]-* options. But copy/pasting the comments that make sense
for the Makefile to configure.ac where they make less sense is the
pattern everything else follows in that file. I'm not going to war
against that as part of this change, just following the existing
pattern.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21 08:25:37 +09:00
Ben Peart
99605d62e8 sub-process: move sub-process functions into separate files
Move the sub-proces functions into sub-process.h/c.  Add documentation
for the new module in Documentation/technical/api-sub-process.txt

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-15 13:01:57 +09:00
Junio C Hamano
77b34eaa07 Merge branch 'mh/separate-ref-cache'
The internals of the refs API around the cached refs has been
streamlined.

* mh/separate-ref-cache:
  do_for_each_entry_in_dir(): delete function
  files_pack_refs(): use reference iteration
  commit_packed_refs(): use reference iteration
  cache_ref_iterator_begin(): make function smarter
  get_loose_ref_cache(): new function
  get_loose_ref_dir(): function renamed from get_loose_refs()
  do_for_each_entry_in_dir(): eliminate `offset` argument
  refs: handle "refs/bisect/" in `loose_fill_ref_dir()`
  ref-cache: use a callback function to fill the cache
  refs: record the ref_store in ref_cache, not ref_dir
  ref-cache: introduce a new type, ref_cache
  refs: split `ref_cache` code into separate files
  ref-cache: rename `remove_entry()` to `remove_entry_from_dir()`
  ref-cache: rename `find_ref()` to `find_ref_entry()`
  ref-cache: rename `add_ref()` to `add_ref_entry()`
  refs_verify_refname_available(): use function in more places
  refs_verify_refname_available(): implement once for all backends
  refs_ref_iterator_begin(): new function
  refs_read_raw_ref(): new function
  get_ref_dir(): don't call read_loose_refs() for "refs/bisect"
2017-04-26 15:39:13 +09:00
Junio C Hamano
6cbc478d83 Merge branch 'jh/add-index-entry-optim'
"git checkout" that handles a lot of paths has been optimized by
reducing the number of unnecessary checks of paths in the
has_dir_name() function.

* jh/add-index-entry-optim:
  read-cache: speed up has_dir_name (part 2)
  read-cache: speed up has_dir_name (part 1)
  read-cache: speed up add_index_entry during checkout
  p0006-read-tree-checkout: perf test to time read-tree
  read-cache: add strcmp_offset function
2017-04-26 15:39:07 +09:00
Junio C Hamano
8377f34540 Merge branch 'jh/memihash-opt'
Hotfix for a topic that is already in 'master'.

* jh/memihash-opt:
  p0004: make perf test executable
  t3008: skip lazy-init test on a single-core box
  test-online-cpus: helper to return cpu count
  name-hash: fix buffer overrun
2017-04-19 21:37:25 -07:00
Junio C Hamano
5ab8f2261f Merge branch 'nd/files-backend-git-dir'
The "submodule" specific field in the ref_store structure is
replaced with a more generic "gitdir" that can later be used also
when dealing with ref_store that represents the set of refs visible
from the other worktrees.

* nd/files-backend-git-dir: (28 commits)
  refs.h: add a note about sorting order of for_each_ref_*
  t1406: new tests for submodule ref store
  t1405: some basic tests on main ref store
  t/helper: add test-ref-store to test ref-store functions
  refs: delete pack_refs() in favor of refs_pack_refs()
  files-backend: avoid ref api targeting main ref store
  refs: new transaction related ref-store api
  refs: add new ref-store api
  refs: rename get_ref_store() to get_submodule_ref_store() and make it public
  files-backend: replace submodule_allowed check in files_downcast()
  refs: move submodule code out of files-backend.c
  path.c: move some code out of strbuf_git_path_submodule()
  refs.c: make get_main_ref_store() public and use it
  refs.c: kill register_ref_store(), add register_submodule_ref_store()
  refs.c: flatten get_ref_store() a bit
  refs: rename lookup_ref_store() to lookup_submodule_ref_store()
  refs.c: introduce get_main_ref_store()
  files-backend: remove the use of git_path()
  files-backend: add and use files_ref_path()
  files-backend: add and use files_reflog_path()
  ...
2017-04-19 21:37:19 -07:00
Junio C Hamano
3817d631de Merge branch 'ab/regen-perl-mak-with-different-perl'
Update the build dependency so that an update to /usr/bin/perl
etc. result in recomputation of perl.mak file.

* ab/regen-perl-mak-with-different-perl:
  perl: regenerate perl.mak if perl -V changes
2017-04-16 23:29:33 -07:00
Michael Haggerty
958f964691 refs: split ref_cache code into separate files
The `ref_cache` code is currently too tightly coupled to
`files-backend`, making the code harder to understand and making it
awkward for new code to use `ref_cache` (as we indeed have planned).
Start loosening that coupling by splitting `ref_cache` into a separate
module.

This commit moves code, adds declarations, and changes the visibility
of some functions, but doesn't change any code.

The modules are still too tightly coupled, but the situation will be
improved in subsequent commits.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-16 21:32:45 -07:00
Jeff Hostetler
a6db3fbb6e read-cache: add strcmp_offset function
Add strcmp_offset() function to also return the offset of the
first change.

Add unit test and helper to verify.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-15 02:21:12 -07:00
Nguyễn Thái Ngọc Duy
80f2a6097c t/helper: add test-ref-store to test ref-store functions
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-14 03:53:25 -07:00
Jeff Hostetler
e3482ccf27 test-online-cpus: helper to return cpu count
Created helper executable to print the value of online_cpus()
allowing multi-threaded tests to be skipped when appropriate.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-12 23:17:19 -07:00
Junio C Hamano
876eb616d3 Merge branch 'jk/make-coccicheck-detect-errors'
Build fix.

* jk/make-coccicheck-detect-errors:
  Makefile: detect errors in running spatch
2017-03-30 14:07:19 -07:00
Ævar Arnfjörð Bjarmason
c59c4939c2 perl: regenerate perl.mak if perl -V changes
Change the perl/perl.mak build process so that the file is regenerated
if the output of "perl -V" changes.

Before this change updating e.g. /usr/bin/perl to a new major version
would cause the next "make" command to fail, since perl.mak has
hardcoded paths to perl library paths retrieved from its first run.

Now the logic added in commit ee9be06770 ("perl: detect new files in
MakeMaker builds", 2012-07-27) is extended to regenerate
perl/perl.mak if there's any change to "perl -V".

This will in some cases redundantly trigger perl/perl.mak to be
re-made, e.g. if @INC is modified in ways the build process doesn't
care about through sitecustomize.pl, but the common case is that we
just do the right thing and re-generate perl/perl.mak when needed.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-29 09:48:13 -07:00
Jeff King
f5c2bc2b96 Makefile: detect errors in running spatch
The "make coccicheck" target runs spatch against each source
file. But it does so in a for loop, so "make" never sees the
exit code of spatch. Worse, it redirects stderr to a log
file, so the user has no indication of any failure. And then
to top it all off, because we touched the patch file's
mtime, make will refuse to repeat the command because it
think the target is up-to-date.

So for example:

  $ make coccicheck SPATCH=does-not-exist
      SPATCH contrib/coccinelle/free.cocci
      SPATCH contrib/coccinelle/qsort.cocci
      SPATCH contrib/coccinelle/xstrdup_or_null.cocci
      SPATCH contrib/coccinelle/swap.cocci
      SPATCH contrib/coccinelle/strbuf.cocci
      SPATCH contrib/coccinelle/object_id.cocci
      SPATCH contrib/coccinelle/array.cocci
  $ make coccicheck SPATCH=does-not-exist
  make: Nothing to be done for 'coccicheck'.

With this patch, you get:

  $ make coccicheck SPATCH=does-not-exist
       SPATCH contrib/coccinelle/free.cocci
  /bin/sh: 4: does-not-exist: not found
  Makefile:2338: recipe for target 'contrib/coccinelle/free.cocci.patch' failed
  make: *** [contrib/coccinelle/free.cocci.patch] Error 1

It also dumps the log on failure, so any errors from spatch
itself (like syntax errors in our .cocci files) will be seen
by the user.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-29 09:07:12 -07:00
Junio C Hamano
0330344e0f Merge branch 'jh/memihash-opt'
The name-hash used for detecting paths that are different only in
cases (which matter on case insensitive filesystems) has been
optimized to take advantage of multi-threading when it makes sense.

* jh/memihash-opt:
  name-hash: add test-lazy-init-name-hash to .gitignore
  name-hash: add perf test for lazy_init_name_hash
  name-hash: add test-lazy-init-name-hash
  name-hash: perf improvement for lazy_init_name_hash
  hashmap: document memihash_cont, hashmap_disallow_rehash api
  hashmap: add disallow_rehash setting
  hashmap: allow memihash computation to be continued
  name-hash: specify initial size for istate.dir_hash table
2017-03-28 14:06:00 -07:00
Junio C Hamano
48b3693d3c Merge branch 'jk/sha1dc'
The "detect attempt to create collisions" variant of SHA-1
implementation by Marc Stevens (CWI) and Dan Shumow (Microsoft)
has been integrated and made the default.

* jk/sha1dc:
  Makefile: make DC_SHA1 the default
  t0013: add a basic sha1 collision detection test
  Makefile: add DC_SHA1 knob
  sha1dc: disable safe_hash feature
  sha1dc: adjust header includes for git
  sha1dc: add collision-detecting sha1 implementation
2017-03-24 13:07:38 -07:00
Jeff Hostetler
ea19489532 name-hash: add test-lazy-init-name-hash
Add t/helper/test-lazy-init-name-hash.c test code
to demonstrate performance times for lazy_init_name_hash()
using the original single-threaded and the new multi-threaded
code paths.

Includes a --dump option to dump the created hashmaps to
stdout.  You can use this to run both code paths and
confirm that they generate the same hashmaps.

Includes a --analyze option to analyze performance of both
code paths over a range of index sizes to help you find a
lower bound for the LAZY_THREAD_COST in name-hash.c.
For example, passing "-a 4000" will set "istate.cache_nr"
to 4000 and then try the multi-threaded code -- probably
giving 2 threads with 2000 entries each.  It will then
run both the single-threaded (1x4000) and the multi-threaded
(2x2000) and compare the times.  It will then repeat the
test with 8000, 12000, and etc. so that you can see the
cross over.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-24 11:00:03 -07:00
Junio C Hamano
81944e9b54 Merge branch 'bc/sha1-header-selection-with-cpp-macros'
Our source code has used the SHA1_HEADER cpp macro after "#include"
in the C code to switch among the SHA-1 implementations. Instead,
list the exact header file names and switch among implementations
using "#ifdef BLK_SHA1/#include "block-sha1/sha1.h"/.../#endif";
this helps some IDE tools.

* bc/sha1-header-selection-with-cpp-macros:
  hash.h: move SHA-1 implementation selection into a header file
2017-03-17 13:50:27 -07:00
Junio C Hamano
0bb80ab090 Merge branch 'jk/interop-test'
Picking two versions of Git and running tests to make sure the
older one and the newer one interoperate happily has now become
possible.

* jk/interop-test:
  t/interop: add test of old clients against modern git-daemon
  t: add an interoperability test harness
2017-03-17 13:50:24 -07:00
Junio C Hamano
e6b07da278 Makefile: make DC_SHA1 the default
We used to use the SHA1 implementation from the OpenSSL library by
default.  As we are trying to be careful against collision attacks
after the recent "shattered" announcement, switch the default to
encourage people to use DC_SHA1 implementation instead.  Those who
want to use the implementation from OpenSSL can explicitly ask for
it by OPENSSL_SHA1=YesPlease when running "make".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-17 10:40:25 -07:00
Jeff King
f5f5e7f06c t0013: add a basic sha1 collision detection test
We don't actually have a Git-object collision, so the best
we can do is to run one of the shattered PDFs through
test-sha1. This should trigger the collision check and die.

In a sense this isn't really checking anything that the
upstream sha1collisiondetection project doesn't cover
already. But it at least makes sure that our build correctly
uses the library.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-17 10:40:25 -07:00
Jeff King
8325e43b82 Makefile: add DC_SHA1 knob
This knob lets you use the sha1dc implementation from:

      https://github.com/cr-marcstevens/sha1collisiondetection

which can detect certain types of collision attacks (even
when we only see half of the colliding pair). So it
mitigates any attack which consists of getting the "good"
half of a collision into a trusted repository, and then
later replacing it with the "bad" half. The "good" half is
rejected by the victim's version of Git (and even if they
run an old version of Git, any sha1dc-enabled git will
complain loudly if it ever has to interact with the object).

The big downside is that it's slower than either the openssl
or block-sha1 implementations.

Here are some timings based off of linux.git:

  - compute sha1 over whole packfile
      sha1dc: 3.580s
    blk-sha1: 2.046s (-43%)
     openssl: 1.335s (-62%)

  - rev-list --all --objects
      sha1dc: 33.512s
    blk-sha1: 33.514s (+0.0%)
     openssl: 33.650s (+0.4%)

  - git log --no-merges -10000 -p
      sha1dc: 8.124s
    blk-sha1: 7.986s (-1.6%)
     openssl: 8.203s (+0.9%)

  - index-pack --verify
      sha1dc: 4m19s
    blk-sha1: 2m57s (-32%)
     openssl: 2m19s (-42%)

So overall the sha1 computation with collision detection is
about 1.75x slower than block-sha1, and 2.7x slower than
sha1. But of course most operations do more than just sha1.
Normal object access isn't really slowed at all (both the
+/- changes there are well within the run-to-run noise); any
changes are drowned out by the other work Git is doing.

The most-affected operation is `index-pack --verify`, which
is essentially just computing the sha1 on every object. This
is similar to the `index-pack` invocation that the receiver
of a push or fetch would perform. So clearly there's some
extra CPU load here.

There will also be some latency for the user, though keep in
mind that such an operation will generally be network bound
(this is about a 1.2GB packfile). Some of that extra CPU is
"free" in the sense that we use it while the pack is
streaming in anyway. But most of it comes during the
delta-resolution phase, after the whole pack has been
received. So we can imagine that for this (quite large)
push, the user might have to wait an extra 100 seconds over
openssl (which is what we use now). If we assume they can
push to us at 20Mbit/s, that's 480s for a 1.2GB pack, which
is only 20% slower.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-17 10:40:25 -07:00
brian m. carlson
f18f816cb1 hash.h: move SHA-1 implementation selection into a header file
Many developers use functionality in their editors that allows for quick
syntax checks, including warning about questionable constructs.  This
functionality allows rapid development with fewer errors.  However, such
functionality generally does not allow the specification of
project-specific defines or command-line options.

Since the SHA1_HEADER include is not defined in such a case,
developers see spurious errors when using these tools.  Furthermore,
there are known implementations of "cc" whose '#include' is unhappy
with this construct.

Instead of using SHA1_HEADER, create a hash.h header and use #if
and #elif to select the desired header.  Have the Makefile pass an
appropriate option to help the header select the right implementation to
use.

[jc: make BLK_SHA1 the fallback default as discussed on list,
e.g. <20170314201424.vccij5z2ortq4a4o@sigill.intra.peff.net>; also
remove SHA1_HEADER and SHA1_HEADER_SQ that are no longer used].

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-15 11:00:09 -07:00
Jeff King
3d8936153d t: add an interoperability test harness
The current test suite is good at letting you test a
particular version of Git. But it's not very good at letting
you test _two_ versions and seeing how they interact (e.g.,
one cloning from the other).

This commit adds a test harness that will build two
arbitrary versions of git and make it easy to call them from
inside your tests. See the README and the example script for
details.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-10 14:30:25 -08:00
Junio C Hamano
fb907176de Merge branch 'rj/remove-unused-mktemp'
Code cleanup.

* rj/remove-unused-mktemp:
  wrapper.c: remove unused gitmkstemps() function
  wrapper.c: remove unused git_mkstemp() function
2017-03-10 13:24:24 -08:00
Ramsay Jones
b2d593a779 wrapper.c: remove unused gitmkstemps() function
The last call to the mkstemps() function was removed in commit 659488326
("wrapper.c: delete dead function git_mkstemps()", 22-04-2016). In order
to support platforms without mkstemps(), this functionality was provided,
along with a Makefile build variable (NO_MKSTEMPS), by the gitmkstemps()
function. Remove the dead code, along with the defunct build machinery.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-28 11:54:21 -08:00
Junio C Hamano
098ed50e8a Merge branch 'js/rebase-helper'
"git rebase -i" starts using the recently updated "sequencer" code.

* js/rebase-helper:
  rebase -i: use the rebase--helper builtin
  rebase--helper: add a builtin helper for interactive rebases
2017-02-27 13:57:14 -08:00
Johannes Schindelin
4557f1add2 rebase--helper: add a builtin helper for interactive rebases
Git's interactive rebase is still implemented as a shell script, despite
its complexity. This implies that it suffers from the portability point
of view, from lack of expressibility, and of course also from
performance. The latter issue is particularly serious on Windows, where
we pay a hefty price for relying so much on POSIX.

Unfortunately, being such a huge shell script also means that we missed
the train when it would have been relatively easy to port it to C, and
instead piled feature upon feature onto that poor script that originally
never intended to be more than a slightly pimped cherry-pick in a loop.

To open the road toward better performance (in addition to all the other
benefits of C over shell scripts), let's just start *somewhere*.

The approach taken here is to add a builtin helper that at first intends
to take care of the parts of the interactive rebase that are most
affected by the performance penalties mentioned above.

In particular, after we spent all those efforts on preparing the sequencer
to process rebase -i's git-rebase-todo scripts, we implement the `git
rebase -i --continue` functionality as a new builtin, git-rebase--helper.

Once that is in place, we can work gradually on tackling the rest of the
technical debt.

Note that the rebase--helper needs to learn about the transient
--ff/--no-ff options of git-rebase, as the corresponding flag is not
persisted to, and re-read from, the state directory.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-09 14:55:26 -08:00
Jeff King
29c2bd5fa8 add oidset API
This is similar to many of our uses of sha1-array, but it
overcomes one limitation of a sha1-array: when you are
de-duplicating a large input with relatively few unique
entries, sha1-array uses 20 bytes per non-unique entry.
Whereas this set will use memory linear in the number of
unique entries (albeit a few more than 20 bytes due to
hashmap overhead).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Junio C Hamano
a77fe4a976 Merge branch 'bc/use-asciidoctor-opt'
Asciidoctor, an alternative reimplementation of AsciiDoc, still
needs some changes to work with documents meant to be formatted
with AsciiDoc.  "make USE_ASCIIDOCTOR=YesPlease" to use it out of
the box to document our pages is getting closer to reality.

* bc/use-asciidoctor-opt:
  Documentation: implement linkgit macro for Asciidoctor
  Makefile: add a knob to enable the use of Asciidoctor
  Documentation: move dblatex arguments into variable
  Documentation: add XSLT to fix DocBook for Texinfo
  Documentation: sort sources for gitman.texi
  Documentation: remove unneeded argument in cat-texi.perl
  Documentation: modernize cat-texi.perl
  Documentation: fix warning in cat-texi.perl
2017-02-02 13:36:57 -08:00
Junio C Hamano
9dec2c652f Merge branch 'js/retire-relink'
Cruft removal.

* js/retire-relink:
  relink: really remove the command
  relink: retire the command
2017-02-02 13:36:54 -08:00
Junio C Hamano
b7786bb4b0 Merge branch 'js/difftool-builtin'
Rewrite a scripted porcelain "git difftool" in C.

* js/difftool-builtin:
  difftool: hack around -Wzero-length-format warning
  difftool: retire the scripted version
  difftool: implement the functionality in the builtin
  difftool: add a skeleton for the upcoming builtin
2017-01-31 13:15:00 -08:00
Junio C Hamano
6ad8b8e98f Merge branch 'rs/qsort-s'
A few codepaths had to rely on a global variable when sorting
elements of an array because sort(3) API does not allow extra data
to be passed to the comparison function.  Use qsort_s() when
natively available, and a fallback implementation of it when not,
to eliminate the need, which is a prerequisite for making the
codepath reentrant.

* rs/qsort-s:
  ref-filter: use QSORT_S in ref_array_sort()
  string-list: use QSORT_S in string_list_sort()
  perf: add basic sort performance test
  add QSORT_S
  compat: add qsort_s()
2017-01-31 13:15:00 -08:00
Johannes Schindelin
ed21e30fef relink: retire the command
Back in the olden days, when all objects were loose and rubber boots were
made out of wood, it made sense to try to share (immutable) objects
between repositories.

Ever since the arrival of pack files, it is but an anachronism.

Let's move the script to the contrib/examples/ directory and no longer
offer it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-25 14:42:37 -08:00
René Scharfe
04ee8b875b compat: add qsort_s()
The function qsort_s() was introduced with C11 Annex K; it provides the
ability to pass a context pointer to the comparison function, supports
the convention of using a NULL pointer for an empty array and performs a
few safety checks.

Add an implementation based on compat/qsort.c for platforms that lack a
native standards-compliant qsort_s() (i.e. basically everyone).  It
doesn't perform the full range of possible checks: It uses size_t
instead of rsize_t and doesn't check nmemb and size against RSIZE_MAX
because we probably don't have the restricted size type defined.  For
the same reason it returns int instead of errno_t.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-23 11:02:34 -08:00
brian m. carlson
ec3366eb52 Makefile: add a knob to enable the use of Asciidoctor
While Git has traditionally built its documentation using AsciiDoc, some
people wish to use Asciidoctor for speed or other reasons.  Add a
Makefile knob, USE_ASCIIDOCTOR, that sets various options in order to
produce acceptable output.  For HTML output, XHTML5 was chosen, since
the AsciiDoc options also produce XHTML, albeit XHTML 1.1.

Asciidoctor does not have built-in support for the linkgit macro, but it
is available using the Asciidoctor Extensions Lab.  Add a macro to
enable the use of this extension if it is available.  Without it, the
linkgit macros are emitted into the output.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-23 10:56:57 -08:00
Johannes Schindelin
019678d6b1 difftool: retire the scripted version
It served its purpose, but now we have a builtin difftool. Time for the
Perl script to enjoy Florida.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-19 13:23:43 -08:00
Johannes Schindelin
be8a90e59c difftool: add a skeleton for the upcoming builtin
This adds a builtin difftool that still falls back to the legacy Perl
version, which has been renamed to `legacy-difftool`.

The idea is that the new, experimental, builtin difftool immediately hands
off to the legacy difftool for now, unless the config variable
difftool.useBuiltin is set to true.

This feature flag will be used in the upcoming Git for Windows v2.11.0
release, to allow early testers to opt-in to use the builtin difftool and
flesh out any bugs.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-17 13:32:47 -08:00
Steven Penny
aa38ad2b24 Makefile: put LIBS after LDFLAGS for imap-send
This matches up with the targets git-%, git-http-fetch, git-http-push
and git-remote-testsvn. It must be done this way in Cygwin else lcrypto
cannot find lgdi32 and lws2_32.

Signed-off-by: Steven Penny <svnpenn@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-09 06:31:52 -08:00
Steven Penny
7c44b33f8b Makefile: POSIX windres
When environment variable POSIXLY_CORRECT is set, the
"input -o output" syntax is not supported.

  http://cygwin.com/ml/cygwin/2017-01/msg00036.html

Use "-i input -o output" syntax instead.

Signed-off-by: Steven Penny <svnpenn@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-09 01:56:22 -08:00
Junio C Hamano
1d73f8e86d Merge branch 'va/i18n-perl-scripts'
Porcelain scripts written in Perl are getting internationalized.

* va/i18n-perl-scripts:
  i18n: difftool: mark warnings for translation
  i18n: send-email: mark composing message for translation
  i18n: send-email: mark string with interpolation for translation
  i18n: send-email: mark warnings and errors for translation
  i18n: send-email: mark strings for translation
  i18n: add--interactive: mark status words for translation
  i18n: add--interactive: remove %patch_modes entries
  i18n: add--interactive: mark edit_hunk_manually message for translation
  i18n: add--interactive: i18n of help_patch_cmd
  i18n: add--interactive: mark patch prompt for translation
  i18n: add--interactive: mark plural strings
  i18n: clean.c: match string with git-add--interactive.perl
  i18n: add--interactive: mark strings with interpolation for translation
  i18n: add--interactive: mark simple here-documents for translation
  i18n: add--interactive: mark strings for translation
  Git.pm: add subroutines for commenting lines
2016-12-27 00:11:40 -08:00
Junio C Hamano
09b4fdb5f3 Merge branch 'jk/make-tags-find-sources-tweak'
Update the procedure to generate "tags" for developer support.

* jk/make-tags-find-sources-tweak:
  Makefile: exclude contrib from FIND_SOURCE_FILES
  Makefile: match shell scripts in FIND_SOURCE_FILES
  Makefile: exclude test cruft from FIND_SOURCE_FILES
  Makefile: reformat FIND_SOURCE_FILES
2016-12-19 14:45:37 -08:00
Vasco Almeida
0539d5e6d5 i18n: add--interactive: mark patch prompt for translation
Mark prompt message assembled in place for translation, unfolding each
use case for each entry in the %patch_modes hash table.

Previously, this script relied on whether $patch_mode was set to run the
command patch_update_cmd() or show status and loop the main loop. Now,
it uses $cmd to indicate we must run patch_update_cmd() and $patch_mode
is used to tell which flavor of the %patch_modes are we on.  This is
introduced in order to be able to mark and unfold the message prompt
knowing in which context we are.

The tracking of context was done previously by point %patch_mode_flavour
hash table to the correct entry of %patch_modes, focusing only on value
of %patch_modes. Now, we are also interested in the key ('staged',
'stash', 'checkout_head', ...).

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-14 11:00:05 -08:00
Vasco Almeida
c4a85c3b8e i18n: add--interactive: mark plural strings
Mark plural strings for translation.  Unfold each action case in one
entire sentence.

Pass new keyword for xgettext to extract.

Update test to include new subroutine __n() for plural strings handling.

Update documentation to include a description of the new __n()
subroutine.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-14 11:00:05 -08:00
Jeff King
046e4c1c09 Makefile: exclude contrib from FIND_SOURCE_FILES
When you're working on the git project, you're unlikely to
care about random bits in contrib/ (e.g., you would not want
to jump to the copy of xmalloc in the wincred credential
helper). Nobody has really complained because there are
relatively few C files in contrib.

Now that we're matching shell scripts, too, we get quite a
few more hits, especially in the obsolete contrib/examples
directory. Looking for usage() should turn up the one in
git-sh-setup, not in some long-dead version of git-clone.

Let's just exclude all of contrib. Any specific projects
there which are big enough to want tags can generate them
separately.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-14 09:54:49 -08:00
Jeff King
8fa2043293 Makefile: match shell scripts in FIND_SOURCE_FILES
We feed FIND_SOURCE_FILES to ctags to help developers
navigate to particular functions, but we only feed C source
code. The same feature can be helpful when working with
shell scripts (especially the test suite). Modern versions
of ctags know how to parse shell scripts; we just need to
feed the filenames to it.

This patch specifically avoids including the individual test
scripts themselves. Those are unlikely to be of interest,
and there are a lot of them to process. It does pick up
test-lib.sh and test-lib-functions.sh.

Note that our negative pathspec already excludes the
individual scripts for the ls-files case, but we need to
loosen the `find` rule to match it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-14 09:54:49 -08:00
Jeff King
e6fc85b11f Makefile: exclude test cruft from FIND_SOURCE_FILES
The test directory may contain three types of files that
match our patterns:

  1. Helper programs in t/helper.

  2. Sample data files (e.g., t/t4051/hello.c).

  3. Untracked cruft in trash directories and t/perf/build.

We want to match (1), but not the other two, as they just
clutter up the list.

For the ls-files method, we can drop (2) with a negative
pathspec. We do not have to care about (3), since ls-files
will not list untracked files.

For `find`, we can match both cases with `-prune` patterns.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-14 09:54:49 -08:00
Jeff King
e951ebca91 Makefile: reformat FIND_SOURCE_FILES
As we add to this in future commits, the formatting is going
to make it harder and harder to read. Let's write it more as
we would in a shell script, putting each logical block on
its own line.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-14 09:54:49 -08:00
Jeff King
1f7c926132 xdiff: drop XDL_FAST_HASH
The xdiff code hashes every line of both sides of a diff,
and then compares those hashes to find duplicates. The
overall performance depends both on how fast we can compute
the hashes, but also on how many hash collisions we see.

The idea of XDL_FAST_HASH is to speed up the hash
computation. But the generated hashes have worse collision
behavior. This means that in some cases it speeds diffs up
(running "git log -p" on git.git improves by ~8% with it),
but in others it can slow things down. One pathological case
saw over a 100x slowdown[1].

There may be a better hash function that covers both
properties, but in the meantime we are better off with the
original hash. It's slightly slower in the common case, but
it has fewer surprising pathological cases.

[1] http://public-inbox.org/git/20141222041944.GA441@peff.net/

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-06 13:27:11 -08:00
Junio C Hamano
729fb9ad34 Merge branch 'ls/macos-update' into maint
Portability update and workaround for builds on recent Mac OS X.

* ls/macos-update:
  travis-ci: disable GIT_TEST_HTTPD for macOS
  Makefile: set NO_OPENSSL on macOS by default
2016-11-29 13:27:56 -08:00
Junio C Hamano
332fd5655a Merge branch 'ls/macos-update'
Portability update and workaround for builds on recent Mac OS X.

* ls/macos-update:
  travis-ci: disable GIT_TEST_HTTPD for macOS
  Makefile: set NO_OPENSSL on macOS by default
2016-11-11 13:56:30 -08:00