git-commit-vandalism/gitweb
Junio C Hamano 25f745fbec Merge branch 'jn/gitweb-highlite-sanitise' into maint
* jn/gitweb-highlite-sanitise:
  gitweb: Strip non-printable characters from syntax highlighter output
2011-10-26 16:13:31 -07:00
..
static Merge branch 'maint-1.7.6' into maint 2011-10-26 16:13:27 -07:00
gitweb.perl gitweb: Strip non-printable characters from syntax highlighter output 2011-09-16 09:22:47 -07:00
INSTALL gitweb: Introduce common system-wide settings for convenience 2011-07-24 16:22:21 -07:00
Makefile gitweb: Introduce common system-wide settings for convenience 2011-07-24 16:22:21 -07:00
README gitweb: Introduce common system-wide settings for convenience 2011-07-24 16:22:21 -07:00

GIT web Interface
=================

The one working on:
  http://git.kernel.org/

From the git version 1.4.0 gitweb is bundled with git.


Runtime gitweb configuration
----------------------------

Gitweb obtains configuration data from the following sources in the
following order:

1. built-in values (some set during build stage),
2. common system-wide configuration file (`GITWEB_CONFIG_COMMON`,
   defaults to '/etc/gitweb-common.conf'),
3. either per-instance configuration file (`GITWEB_CONFIG`, defaults to
   'gitweb_config.perl' in the same directory as the installed gitweb),
   or if it does not exists then system-wide configuration file
   (`GITWEB_CONFIG_SYSTEM`, defaults to '/etc/gitweb.conf').

Values obtained in later configuration files override values obtained earlier
in above sequence.

You can read defaults in system-wide GITWEB_CONFIG_SYSTEM from GITWEB_CONFIG
by adding

  read_config_file($GITWEB_CONFIG_SYSTEM);

at very beginning of per-instance GITWEB_CONFIG file.  In this case
settings in said per-instance file will override settings from
system-wide configuration file.  Note that read_config_file checks
itself that the $GITWEB_CONFIG_SYSTEM file exists.

The most notable thing that is not configurable at compile time are the
optional features, stored in the '%features' variable.

Ultimate description on how to reconfigure the default features setting
in your `GITWEB_CONFIG` or per-project in `project.git/config` can be found
as comments inside 'gitweb.cgi'.

See also the "Gitweb config file" (with an example of config file), and
the "Gitweb repositories" sections in INSTALL file for gitweb.


The gitweb config file is a fragment of perl code. You can set variables
using "our $variable = value"; text from "#" character until the end
of a line is ignored. See perlsyn(1) man page for details.

Below is the list of variables which you might want to set in gitweb config.
See the top of 'gitweb.cgi' for the full list of variables and their
descriptions.

Gitweb config file variables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can set, among others, the following variables in gitweb config files
(with the exception of $projectroot and $projects_list this list does
not include variables usually directly set during build):
 * $GIT
   Core git executable to use.  By default set to "$GIT_BINDIR/git", which
   in turn is by default set to "$(bindir)/git".  If you use git from binary
   package, set this to "/usr/bin/git".  This can just be "git" if your
   webserver has a sensible PATH.  If you have multiple git versions
   installed it can be used to choose which one to use.
 * $version
   Gitweb version, set automatically when creating gitweb.cgi from
   gitweb.perl. You might want to modify it if you are running modified
   gitweb.
 * $projectroot
   Absolute filesystem path which will be prepended to project path;
   the path to repository is $projectroot/$project.  Set to
   $GITWEB_PROJECTROOT during installation.  This variable have to be
   set correctly for gitweb to find repositories.
 * $projects_list
   Source of projects list, either directory to scan, or text file
   with list of repositories (in the "<URI-encoded repository path> SP
   <URI-encoded repository owner>" line format; actually there can be
   any sequence of whitespace in place of space (SP)).  Set to
   $GITWEB_LIST during installation.  If empty, $projectroot is used
   to scan for repositories.
 * $my_url, $my_uri
   Full URL and absolute URL of gitweb script;
   in earlier versions of gitweb you might have need to set those
   variables, now there should be no need to do it.  See
   $per_request_config if you need to set them still.
 * $base_url
   Base URL for relative URLs in pages generated by gitweb,
   (e.g. $logo, $favicon, @stylesheets if they are relative URLs),
   needed and used only for URLs with nonempty PATH_INFO via
   <base href="$base_url">.  Usually gitweb sets its value correctly,
   and there is no need to set this variable, e.g. to $my_uri or "/".
   See $per_request_config if you need to set it anyway.
 * $home_link
   Target of the home link on top of all pages (the first part of view
   "breadcrumbs").  By default set to absolute URI of a page ($my_uri).
 * @stylesheets
   List of URIs of stylesheets (relative to base URI of a page). You
   might specify more than one stylesheet, for example use gitweb.css
   as base, with site specific modifications in separate stylesheet
   to make it easier to upgrade gitweb. You can add 'site' stylesheet
   for example by using
      push @stylesheets, "gitweb-site.css";
   in the gitweb config file.
 * $logo_url, $logo_label
   URI and label (title) of GIT logo link (or your site logo, if you choose
   to use different logo image). By default they point to git homepage;
   in the past they pointed to git documentation at www.kernel.org.
 * $projects_list_description_width
   The width (in characters) of the projects list "Description" column.
   Longer descriptions will be cut (trying to cut at word boundary);
   full description is available as 'title' attribute (usually shown on
   mouseover).  By default set to 25, which might be too small if you
   use long project descriptions.
 * $projects_list_group_categories
   Enables the grouping of projects by category on the project list page.
   The category of a project is determined by the $GIT_DIR/category
   file or the 'gitweb.category' variable in its repository configuration.
   Disabled by default.
 * $project_list_default_category
   Default category for projects for which none is specified.  If set
   to the empty string, such projects will remain uncategorized and
   listed at the top, above categorized projects.
 * @git_base_url_list
   List of git base URLs used for URL to where fetch project from, shown
   in project summary page.  Full URL is "$git_base_url/$project".
   You can setup multiple base URLs (for example one for  git:// protocol
   access, and one for http:// "dumb" protocol access).  Note that per
   repository configuration in 'cloneurl' file, or as values of gitweb.url
   project config.
 * $default_blob_plain_mimetype
   Default mimetype for blob_plain (raw) view, if mimetype checking
   doesn't result in some other type; by default 'text/plain'.
 * $default_text_plain_charset
   Default charset for text files. If not set, web server configuration
   would be used.
 * $mimetypes_file
   File to use for (filename extension based) guessing of MIME types before
   trying /etc/mime.types. Path, if relative, is taken currently as
   relative to the current git repository.
 * $fallback_encoding
   Gitweb assumes this charset if line contains non-UTF-8 characters.
   Fallback decoding is used without error checking, so it can be even
   'utf-8'. Value must be valid encoding; see Encoding::Supported(3pm) man
   page for a list.   By default 'latin1', aka. 'iso-8859-1'.
 * @diff_opts
   Rename detection options for git-diff and git-diff-tree. By default
   ('-M'); set it to ('-C') or ('-C', '-C') to also detect copies, or
   set it to () if you don't want to have renames detection.
 * $prevent_xss
   If true, some gitweb features are disabled to prevent content in
   repositories from launching cross-site scripting (XSS) attacks.  Set this
   to true if you don't trust the content of your repositories. The default
   is false.
 * $maxload
   Used to set the maximum load that we will still respond to gitweb queries.
   If server load exceed this value then return "503 Service Unavailable" error.
   Server load is taken to be 0 if gitweb cannot determine its value.  Set it to
   undefined value to turn it off.  The default is 300.
 * $highlight_bin
   Path to the highlight executable to use (must be the one from
   http://www.andre-simon.de due to assumptions about parameters and output).
   Useful if highlight is not installed on your webserver's PATH.
   [Default: highlight]
 * $per_request_config
   If set to code reference, it would be run once per each request.  You can
   set parts of configuration that change per session, e.g. by setting it to
     sub { $ENV{GL_USER} = $cgi->remote_user || "gitweb"; }
   Otherwise it is treated as boolean value: if true gitweb would process
   config file once per request, if false it would process config file only
   once.  Note: $my_url, $my_uri, and $base_url are overwritten with
   their default values before every request, so if you want to change
   them, be sure to set this variable to true or a code reference effecting
   the desired changes.  The default is true.

Projects list file format
~~~~~~~~~~~~~~~~~~~~~~~~~

Instead of having gitweb find repositories by scanning filesystem starting
from $projectroot (or $projects_list, if it points to directory), you can
provide list of projects by setting $projects_list to a text file with list
of projects (and some additional info).  This file uses the following
format:

One record (for project / repository) per line, whitespace separated fields;
does not support (at least for now) lines continuation (newline escaping).
Leading and trailing whitespace are ignored, any run of whitespace can be
used as field separator (rules for Perl's "split(' ', $line)").  Keyed by
the first field, which is project name, i.e. path to repository GIT_DIR
relative to $projectroot.  Fields use modified URI encoding, defined in
RFC 3986, section 2.1 (Percent-Encoding), or rather "Query string encoding"
(see http://en.wikipedia.org/wiki/Query_string#URL_encoding), the difference
being that SP (' ') can be encoded as '+' (and therefore '+' has to be also
percent-encoded).  Reserved characters are: '%' (used for encoding), '+'
(can be used to encode SPACE), all whitespace characters as defined in Perl,
including SP, TAB and LF, (used to separate fields in a record).

Currently list of fields is
 * <repository path>  - path to repository GIT_DIR, relative to $projectroot
 * <repository owner> - displayed as repository owner, preferably full name,
                        or email, or both

You can additionally use $projects_list file to limit which repositories
are visible, and together with $strict_export to limit access to
repositories (see "Gitweb repositories" section in gitweb/INSTALL).


Per-repository gitweb configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can also configure individual repositories shown in gitweb by creating
file in the GIT_DIR of git repository, or by setting some repo configuration
variable (in GIT_DIR/config).

You can use the following files in repository:
 * README.html
   A .html file (HTML fragment) which is included on the gitweb project
   summary page inside <div> block element. You can use it for longer
   description of a project, to provide links (for example to project's
   homepage), etc. This is recognized only if XSS prevention is off
   ($prevent_xss is false); a way to include a readme safely when XSS
   prevention is on may be worked out in the future.
 * description (or gitweb.description)
   Short (shortened by default to 25 characters in the projects list page)
   single line description of a project (of a repository). Plain text file;
   HTML will be escaped. By default set to
     Unnamed repository; edit this file to name it for gitweb.
   from the template during repository creation. You can use the
   gitweb.description repo configuration variable, but the file takes
   precedence.
 * category (or gitweb.category)
   Singe line category of a project, used to group projects if
   $projects_list_group_categories is enabled. By default (file and
   configuration variable absent), uncategorized projects are put in
   the $project_list_default_category category. You can use the
   gitweb.category repo configuration variable, but the file takes
   precedence.
 * cloneurl (or multiple-valued gitweb.url)
   File with repository URL (used for clone and fetch), one per line.
   Displayed in the project summary page. You can use multiple-valued
   gitweb.url repository configuration variable for that, but the file
   takes precedence.
 * gitweb.owner
   You can use the gitweb.owner repository configuration variable to set
   repository's owner. It is displayed in the project list and summary
   page. If it's not set, filesystem directory's owner is used
   (via GECOS field / real name field from getpwiud(3)).
 * various gitweb.* config variables (in config)
   Read description of %feature hash for detailed list, and some
   descriptions.


Webserver configuration
-----------------------

If you want to have one URL for both gitweb and your http://
repositories, you can configure apache like this:

<VirtualHost *:80>
    ServerName		git.example.org
    DocumentRoot	/pub/git
    SetEnv			GITWEB_CONFIG	/etc/gitweb.conf

    # turning on mod rewrite
    RewriteEngine on

    # make the front page an internal rewrite to the gitweb script
    RewriteRule ^/$  /cgi-bin/gitweb.cgi

    # make access for "dumb clients" work
    RewriteRule ^/(.*\.git/(?!/?(HEAD|info|objects|refs)).*)?$ /cgi-bin/gitweb.cgi%{REQUEST_URI}  [L,PT]
</VirtualHost>

The above configuration expects your public repositories to live under
/pub/git and will serve them as http://git.domain.org/dir-under-pub-git,
both as cloneable GIT URL and as browseable gitweb interface.
If you then start your git-daemon with --base-path=/pub/git --export-all
then you can even use the git:// URL with exactly the same path.

Setting the environment variable GITWEB_CONFIG will tell gitweb to use
the named file (i.e. in this example /etc/gitweb.conf) as a
configuration for gitweb.  Perl variables defined in here will
override the defaults given at the head of the gitweb.perl (or
gitweb.cgi).  Look at the comments in that file for information on
which variables and what they mean.

If you use the rewrite rules from the example you'll likely also need
something like the following in your gitweb.conf (or gitweb_config.perl) file:

  @stylesheets = ("/some/absolute/path/gitweb.css");
  $my_uri = "/";
  $home_link = "/";


Webserver configuration with multiple projects' root
----------------------------------------------------

If you want to use gitweb with several project roots you can edit your apache
virtual host and gitweb.conf configuration files like this :

virtual host configuration :

<VirtualHost *:80>
    ServerName			git.example.org
    DocumentRoot		/pub/git
    SetEnv				GITWEB_CONFIG	/etc/gitweb.conf

    # turning on mod rewrite
    RewriteEngine on

    # make the front page an internal rewrite to the gitweb script
    RewriteRule ^/$ 		/cgi-bin/gitweb.cgi [QSA,L,PT]

    # look for a public_git folder in unix users' home
    # http://git.example.org/~<user>/
    RewriteRule ^/\~([^\/]+)(/|/gitweb.cgi)?$	/cgi-bin/gitweb.cgi [QSA,E=GITWEB_PROJECTROOT:/home/$1/public_git/,L,PT]

    # http://git.example.org/+<user>/
    #RewriteRule ^/\+([^\/]+)(/|/gitweb.cgi)?$	/cgi-bin/gitweb.cgi [QSA,E=GITWEB_PROJECTROOT:/home/$1/public_git/,L,PT]

    # http://git.example.org/user/<user>/
    #RewriteRule ^/user/([^\/]+)/(gitweb.cgi)?$	/cgi-bin/gitweb.cgi [QSA,E=GITWEB_PROJECTROOT:/home/$1/public_git/,L,PT]

    # defined list of project roots
    RewriteRule ^/scm(/|/gitweb.cgi)?$		/cgi-bin/gitweb.cgi [QSA,E=GITWEB_PROJECTROOT:/pub/scm/,L,PT]
    RewriteRule ^/var(/|/gitweb.cgi)?$		/cgi-bin/gitweb.cgi [QSA,E=GITWEB_PROJECTROOT:/var/git/,L,PT]

    # make access for "dumb clients" work
    RewriteRule ^/(.*\.git/(?!/?(HEAD|info|objects|refs)).*)?$ /cgi-bin/gitweb.cgi%{REQUEST_URI}  [L,PT]
</VirtualHost>

gitweb.conf configuration :

$projectroot = $ENV{'GITWEB_PROJECTROOT'} || "/pub/git";

These configurations enable two things. First, each unix user (<user>) of the
server will be able to browse through gitweb git repositories found in
~/public_git/ with the following url : http://git.example.org/~<user>/

If you do not want this feature on your server just remove the second rewrite rule.

If you already use mod_userdir in your virtual host or you don't want to use
the '~' as first character just comment or remove the second rewrite rule and
uncomment one of the following according to what you want.

Second, repositories found in /pub/scm/ and /var/git/ will be accesible
through http://git.example.org/scm/ and http://git.example.org/var/.
You can add as many project roots as you want by adding rewrite rules like the
third and the fourth.


PATH_INFO usage
-----------------------
If you enable PATH_INFO usage in gitweb by putting

   $feature{'pathinfo'}{'default'} = [1];

in your gitweb.conf, it is possible to set up your server so that it
consumes and produces URLs in the form

http://git.example.com/project.git/shortlog/sometag

by using a configuration such as the following, that assumes that
/var/www/gitweb is the DocumentRoot of your webserver, and that it
contains the gitweb.cgi script and complementary static files
(stylesheet, favicon):

<VirtualHost *:80>
	ServerAlias git.example.com

	DocumentRoot /var/www/gitweb

	<Directory /var/www/gitweb>
		Options ExecCGI
		AddHandler cgi-script cgi

		DirectoryIndex gitweb.cgi

		RewriteEngine On
		RewriteCond %{REQUEST_FILENAME} !-f
		RewriteCond %{REQUEST_FILENAME} !-d
		RewriteRule ^.* /gitweb.cgi/$0 [L,PT]
	</Directory>
</VirtualHost>

The rewrite rule guarantees that existing static files will be properly
served, whereas any other URL will be passed to gitweb as PATH_INFO
parameter.

Notice that in this case you don't need special settings for
@stylesheets, $my_uri and $home_link, but you lose "dumb client" access
to your project .git dirs. A possible workaround for the latter is the
following: in your project root dir (e.g. /pub/git) have the projects
named without a .git extension (e.g. /pub/git/project instead of
/pub/git/project.git) and configure Apache as follows:

<VirtualHost *:80>
	ServerAlias git.example.com

	DocumentRoot /var/www/gitweb

	AliasMatch ^(/.*?)(\.git)(/.*)?$ /pub/git$1$3
	<Directory /var/www/gitweb>
		Options ExecCGI
		AddHandler cgi-script cgi

		DirectoryIndex gitweb.cgi

		RewriteEngine On
		RewriteCond %{REQUEST_FILENAME} !-f
		RewriteCond %{REQUEST_FILENAME} !-d
		RewriteRule ^.* /gitweb.cgi/$0 [L,PT]
	</Directory>
</VirtualHost>

The additional AliasMatch makes it so that

http://git.example.com/project.git

will give raw access to the project's git dir (so that the project can
be cloned), while

http://git.example.com/project

will provide human-friendly gitweb access.

This solution is not 100% bulletproof, in the sense that if some project
has a named ref (branch, tag) starting with 'git/', then paths such as

http://git.example.com/project/command/abranch..git/abranch

will fail with a 404 error.



Originally written by:
  Kay Sievers <kay.sievers@vrfy.org>

Any comment/question/concern to:
  Git mailing list <git@vger.kernel.org>