git-commit-vandalism/t/perf
Nguyễn Thái Ngọc Duy b8a2486f15 index-pack: support multithreaded delta resolving
This puts delta resolving on each base on a separate thread, one base
cache per thread. Per-thread data is grouped in struct thread_local.
When running with nr_threads == 1, no pthreads calls are made. The
system essentially runs in non-thread mode.

An experiment on a Xeon 24 core machine with git.git shows that
performance does not increase proportional to the number of cores. So
by default, we use maximum 3 cores. Some numbers with --threads from 1
to 16:

1..4
real    0m8.003s  0m5.307s  0m4.321s  0m3.830s
user    0m7.720s  0m8.009s  0m8.133s  0m8.305s
sys     0m0.224s  0m0.372s  0m0.360s  0m0.360s

5..8
real    0m3.727s  0m3.604s  0m3.332s  0m3.369s
user    0m9.361s  0m9.817s  0m9.525s  0m9.769s
sys     0m0.584s  0m0.624s  0m0.540s  0m0.560s

9..12
real    0m3.036s  0m3.139s  0m3.177s  0m2.961s
user    0m8.977s  0m10.205s 0m9.737s  0m10.073s
sys     0m0.596s  0m0.680s  0m0.684s  0m0.680s

13..16
real    0m2.985s  0m2.894s  0m2.975s  0m2.971s
user    0m9.825s  0m10.573s 0m10.833s 0m11.361s
sys     0m0.788s  0m0.732s  0m0.904s  0m1.016s

On an Intel dual core and linux-2.6.git

1..4
real    2m37.789s 2m7.963s  2m0.920s  1m58.213s
user    2m28.415s 2m52.325s 2m50.176s 2m41.187s
sys     0m7.808s  0m11.181s 0m11.224s 0m10.731s

Thanks Ramsay Jones for troubleshooting and support on MinGW platform.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-07 15:48:15 -07:00
..
.gitignore
aggregate.perl
Makefile
min_time.perl
p0000-perf-lib-sanity.sh
p0001-rev-list.sh
p5302-pack-index.sh index-pack: support multithreaded delta resolving 2012-05-07 15:48:15 -07:00
p7810-grep.sh
perf-lib.sh
README
run

Git performance tests
=====================

This directory holds performance testing scripts for git tools.  The
first part of this document describes the various ways in which you
can run them.

When fixing the tools or adding enhancements, you are strongly
encouraged to add tests in this directory to cover what you are
trying to fix or enhance.  The later part of this short document
describes how your test scripts should be organized.


Running Tests
-------------

The easiest way to run tests is to say "make".  This runs all
the tests on the current git repository.

    === Running 2 tests in this tree ===
    [...]
    Test                                     this tree
    ---------------------------------------------------------
    0001.1: rev-list --all                   0.54(0.51+0.02)
    0001.2: rev-list --all --objects         6.14(5.99+0.11)
    7810.1: grep worktree, cheap regex       0.16(0.16+0.35)
    7810.2: grep worktree, expensive regex   7.90(29.75+0.37)
    7810.3: grep --cached, cheap regex       3.07(3.02+0.25)
    7810.4: grep --cached, expensive regex   9.39(30.57+0.24)

You can compare multiple repositories and even git revisions with the
'run' script:

    $ ./run . origin/next /path/to/git-tree p0001-rev-list.sh

where . stands for the current git tree.  The full invocation is

    ./run [<revision|directory>...] [--] [<test-script>...]

A '.' argument is implied if you do not pass any other
revisions/directories.

You can also manually test this or another git build tree, and then
call the aggregation script to summarize the results:

    $ ./p0001-rev-list.sh
    [...]
    $ GIT_BUILD_DIR=/path/to/other/git ./p0001-rev-list.sh
    [...]
    $ ./aggregate.perl . /path/to/other/git ./p0001-rev-list.sh

aggregate.perl has the same invocation as 'run', it just does not run
anything beforehand.

You can set the following variables (also in your config.mak):

    GIT_PERF_REPEAT_COUNT
	Number of times a test should be repeated for best-of-N
	measurements.  Defaults to 5.

    GIT_PERF_MAKE_OPTS
	Options to use when automatically building a git tree for
	performance testing.  E.g., -j6 would be useful.

    GIT_PERF_REPO
    GIT_PERF_LARGE_REPO
	Repositories to copy for the performance tests.  The normal
	repo should be at least git.git size.  The large repo should
	probably be about linux-2.6.git size for optimal results.
	Both default to the git.git you are running from.

You can also pass the options taken by ordinary git tests; the most
useful one is:

--root=<directory>::
	Create "trash" directories used to store all temporary data during
	testing under <directory>, instead of the t/ directory.
	Using this option with a RAM-based filesystem (such as tmpfs)
	can massively speed up the test suite.


Naming Tests
------------

The performance test files are named as:

	pNNNN-commandname-details.sh

where N is a decimal digit.  The same conventions for choosing NNNN as
for normal tests apply.


Writing Tests
-------------

The perf script starts much like a normal test script, except it
sources perf-lib.sh:

	#!/bin/sh
	#
	# Copyright (c) 2005 Junio C Hamano
	#

	test_description='xxx performance test'
	. ./perf-lib.sh

After that you will want to use some of the following:

	test_perf_default_repo  # sets up a "normal" repository
	test_perf_large_repo    # sets up a "large" repository

	test_perf_default_repo sub  # ditto, in a subdir "sub"

        test_checkout_worktree  # if you need the worktree too

At least one of the first two is required!

You can use test_expect_success as usual.  For actual performance
tests, use

	test_perf 'descriptive string' '
		command1 &&
		command2
	'

test_perf spawns a subshell, for lack of better options.  This means
that

* you _must_ export all variables that you need in the subshell

* you _must_ flag all variables that you want to persist from the
  subshell with 'test_export':

	test_perf 'descriptive string' '
		foo=$(git rev-parse HEAD) &&
		test_export foo
	'

  The so-exported variables are automatically marked for export in the
  shell executing the perf test.  For your convenience, test_export is
  the same as export in the main shell.

  This feature relies on a bit of magic using 'set' and 'source'.
  While we have tried to make sure that it can cope with embedded
  whitespace and other special characters, it will not work with
  multi-line data.