
Git performance tests
=====================

This directory holds performance testing scripts for git tools.  The
first part of this document describes the various ways in which you
can run them.

When fixing the tools or adding enhancements, you are strongly
encouraged to add tests in this directory to cover what you are
trying to fix or enhance.  The later part of this short document
describes how your test scripts should be organized.


Running Tests
-------------

The easiest way to run tests is to say "make".  This runs all
the tests on the current git repository.

    === Running 2 tests in this tree ===
    [...]
    Test                                     this tree
    ---------------------------------------------------------
    0001.1: rev-list --all                   0.54(0.51+0.02)
    0001.2: rev-list --all --objects         6.14(5.99+0.11)
    7810.1: grep worktree, cheap regex       0.16(0.16+0.35)
    7810.2: grep worktree, expensive regex   7.90(29.75+0.37)
    7810.3: grep --cached, cheap regex       3.07(3.02+0.25)
    7810.4: grep --cached, expensive regex   9.39(30.57+0.24)
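
This README does not spell out what the three numbers in each result
cell mean.  Assuming they are elapsed time followed by user and system
CPU time in parentheses (which would fit rows where the second number
exceeds the first for a multi-threaded command), a cell can be split
into its parts as sketched below.  split_timing is a hypothetical
helper for illustration, not part of t/perf.

```shell
# Hypothetical helper: split a cell such as "0.54(0.51+0.02)" into
# three space-separated fields (assumed: elapsed, user, system).
split_timing () {
	printf '%s\n' "$1" |
	sed -e 's/^\([0-9.]*\)(\([0-9.]*\)+\([0-9.]*\))$/\1 \2 \3/'
}

split_timing '7.90(29.75+0.37)'
# prints "7.90 29.75 0.37"
```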

You can compare multiple repositories and even git revisions with the
'run' script:

    $ ./run . origin/next /path/to/git-tree p0001-rev-list.sh

where . stands for the current git tree.  The full invocation is

    ./run [<revision|directory>...] [--] [<test-script>...]

A '.' argument is implied if you do not pass any other
revisions/directories.

You can also manually test this or another git build tree, and then
call the aggregation script to summarize the results:

    $ ./p0001-rev-list.sh
    [...]
    $ GIT_BUILD_DIR=/path/to/other/git ./p0001-rev-list.sh
    [...]
    $ ./aggregate.perl . /path/to/other/git ./p0001-rev-list.sh

aggregate.perl has the same invocation as 'run'; it just does not run
anything beforehand.

You can set the following variables (also in your config.mak):

    GIT_PERF_REPEAT_COUNT
	Number of times a test should be repeated for best-of-N
	measurements.  Defaults to 3.

    GIT_PERF_MAKE_OPTS
	Options to use when automatically building a git tree for
	performance testing. E.g., -j6 would be useful. Passed
	directly to make as "make $GIT_PERF_MAKE_OPTS".

    GIT_PERF_MAKE_COMMAND
	An arbitrary command that'll be run in place of the make
	command.  If set, the GIT_PERF_MAKE_OPTS variable is
	ignored.  Useful in cases where source tree changes might
	require issuing a different make command to different
	revisions.

	This can be (ab)used to monkeypatch or otherwise change the
	tree about to be built.  Note that the build directory can be
	re-used for subsequent runs, so the make command might get
	executed multiple times on the same tree.  Don't count on
	any of that, though: it is an implementation detail that
	might change in the future.

    GIT_PERF_REPO
    GIT_PERF_LARGE_REPO
	Repositories to copy for the performance tests.  The normal
	repo should be at least git.git size.  The large repo should
	probably be about linux.git size for optimal results.
	Both default to the git.git you are running from.
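
"Best-of-N" for GIT_PERF_REPEAT_COUNT means that, of the N repetitions,
only the fastest one is reported.  A minimal sketch of that selection
step follows.  The sample timings are made up, and best_of_n is a
hypothetical helper, not the code perf-lib.sh uses internally.

```shell
# Hypothetical helper: print the smallest of the timings passed as
# arguments.  LC_ALL=C keeps "." as the decimal separator for sort.
best_of_n () {
	printf '%s\n' "$@" | LC_ALL=C sort -n | head -n 1
}

# With the default repeat count of 3 there are three samples per test.
best_of_n 0.57 0.54 0.61
# prints "0.54"
```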

You can also pass the options taken by ordinary git tests; the most
useful one is:

--root=<directory>::
	Create "trash" directories used to store all temporary data during
	testing under <directory>, instead of the t/ directory.
	Using this option with a RAM-based filesystem (such as tmpfs)
	can massively speed up the test suite.


Naming Tests
------------

The performance test files are named as:

	pNNNN-commandname-details.sh

where N is a decimal digit.  The same conventions for choosing NNNN as
for normal tests apply.
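
As a quick check of the naming pattern, the sketch below tests whether
a file name has the pNNNN-<something>.sh shape.  is_perf_test_name is
a hypothetical helper for illustration; it checks only the shape of
the name, not whether NNNN was chosen according to the conventions.

```shell
# Hypothetical helper: succeed if the name is "p", four decimal
# digits, a dash, at least one more character, and ".sh".
is_perf_test_name () {
	case "$1" in
	p[0-9][0-9][0-9][0-9]-?*.sh)
		return 0 ;;
	*)
		return 1 ;;
	esac
}

is_perf_test_name p0001-rev-list.sh && echo "looks like a perf test"
is_perf_test_name t0001-init.sh || echo "not a perf test name"
```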


Writing Tests
-------------

The perf script starts much like a normal test script, except it
sources perf-lib.sh:

	#!/bin/sh
	#
	# Copyright (c) 2005 Junio C Hamano
	#

	test_description='xxx performance test'
	. ./perf-lib.sh

After that you will want to use some of the following:

	test_perf_default_repo  # sets up a "normal" repository
	test_perf_large_repo    # sets up a "large" repository

	test_perf_default_repo sub  # ditto, in a subdir "sub"

	test_checkout_worktree  # if you need the worktree too

At least one of the first two is required!

You can use test_expect_success as usual. In both test_expect_success
and in test_perf, running "git" points to the version that is being
perf-tested. The $MODERN_GIT variable points to the git wrapper for the
currently checked-out version (i.e., the one that matches the t/perf
scripts you are running).  This is useful if your setup uses commands
that only work with newer versions of git than what you might want to
test (but obviously your new commands must still create a state that can
be used by the older version of git you are testing).

For actual performance tests, use

	test_perf 'descriptive string' '
		command1 &&
		command2
	'
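
Putting the pieces above together, the sketch below writes a minimal
perf script to a file.  The file name p9999-example.sh and the test
description are made up for illustration.  test_done is the usual
test-lib.sh terminator; this README does not mention it, so treat its
use here as an assumption carried over from normal test scripts.

```shell
# Write a minimal perf script; run this from t/perf.  The quoted
# heredoc delimiter keeps the script body from being expanded here.
cat >p9999-example.sh <<\EOF
#!/bin/sh

test_description='example performance test'
. ./perf-lib.sh

test_perf_default_repo

test_perf 'rev-list --all' '
	git rev-list --all >/dev/null
'

test_done
EOF
chmod +x p9999-example.sh

# Syntax check only; actually running it requires perf-lib.sh.
sh -n p9999-example.sh && echo "syntax ok"
```

The resulting file can then be passed to 'run' like any other test
script.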

test_perf spawns a subshell, for lack of better options.  This means
that

* you _must_ export all variables that you need in the subshell

* you _must_ flag all variables that you want to persist from the
  subshell with 'test_export':

	test_perf 'descriptive string' '
		foo=$(git rev-parse HEAD) &&
		test_export foo
	'

  The so-exported variables are automatically marked for export in the
  shell executing the perf test.  For your convenience, test_export is
  the same as export in the main shell.

  This feature relies on a bit of magic using 'set' and 'source'.
  While we have tried to make sure that it can cope with embedded
  whitespace and other special characters, it will not work with
  multi-line data.
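
The two rules above can be illustrated with plain sh, without
perf-lib.sh.  This sketch assumes the test body runs in a separate
shell process, which is what the export requirement suggests.  The
temporary file is a stand-in technique for carrying a value back to
the parent; it is not the 'set' and 'source' mechanism that
test_export uses.  All variable names are hypothetical.

```shell
# 1. A child shell sees only exported variables.
perf_demo_plain=hidden
perf_demo_exported=visible
export perf_demo_exported
seen=$(sh -c 'echo "${perf_demo_plain:-unset} ${perf_demo_exported:-unset}"')
echo "$seen"
# prints "unset visible"

# 2. A value computed in the child is lost when the child exits,
#    unless it is written somewhere the parent can read it back.
state_file=$(mktemp)
sh -c 'foo="value with  spaces"; printf "%s" "$foo" >"$1"' sh "$state_file"
foo=$(cat "$state_file")
rm -f "$state_file"
echo "$foo"
# prints "value with  spaces"
```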