It is needed only to escape attributes of handcrafted HTML elements,
and not those generated using CGI.pm subroutines / methods for HTML
generation.
While at it, add esc_url and esc_html where needed, and prefer to use
CGI.pm HTML generating methods than handcrafted HTML code. Most of
those are probably unnecessary (could be exploited only by person with
write access to gitweb config, or at least access to the repository).
This fixes CVE-2010-3906
Reported-by: Emanuele Gentili <e.gentili@tigersecurity.it>
Helped-by: John 'Warthog9' Hawley <warthog9@kernel.org>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-1.6.3:
Fix overridable written with an extra 'e'
Documentation: git-archive: mark --format as optional in summary
Round-down years in "years+months" relative date view
* maint-1.6.2:
Fix overridable written with an extra 'e'
Documentation: git-archive: mark --format as optional in summary
Round-down years in "years+months" relative date view
Conflicts:
Documentation/git-archive.txt
Call to_utf8 when parsing author and committer names, otherwise they will appear
with bad encoding if they written by using chop_and_escape_str.
Signed-off-by: Zoltán Füzesi <zfuzesi@eaglet.hu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-scm.com is now the "official" Git project page, having taken over
from git.or.cz, so update the default link accordingly. This saves a
redirect when people hit git.or.cz.
Signed-off-by: Wincent Colaiuta <win@wincent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* gb/gitweb-avatar:
gitweb: add empty alt text to avatar img
gitweb: picon avatar provider
gitweb: gravatar url cache
gitweb: (gr)avatar support
gitweb: use git_print_authorship_rows in 'tag' view too
gitweb: uniform author info for commit and commitdiff
gitweb: refactor author name insertion
The empty alt text optimizes screen estate in text-only browsers.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simple implementation of picon that only relies on the indiana.edu
database.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Views which contain many occurrences of the same email address (e.g.
shortlog view) benefit from not having to recalculate the MD5 of the
email address every time.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce avatar support: the feature adds the appropriate img tag next
to author and committer in commit(diff), history, shortlog, log and tag
views. Multiple avatar providers are possible, but only gravatar is
implemented at the moment.
Gravatar support depends on Digest::MD5, which is a core package since
Perl 5.8. If gravatars are activated but Digest::MD5 cannot be found,
the feature will be automatically disabled.
No avatar provider is selected by default, except in the t9500 test.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse_tag must be adapted to use the hash keys expected by
git_print_authorship_rows. This is not a problem since git_tag is the
only user of this sub.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch from 'log'-like layout
A U Thor <email@example.com> [date time]
to 'commit'-like layout
author A U Thor <email@example.com>
date time
committer C O Mitter <other.email@example.com>
committer date time
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Collect all author display code in appropriate functions, making it
easier to extend these functions on the CGI side.
We also move some of the presentation code from hard-coded HTML to CSS,
for easier customization.
A side effect of the refactoring is that now localtime is always
displayed with the 'at night' warning.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When combining "dumb client" and human-friendly access by using the
'.git' extension to switch between the two, make sure the AliasMatch
covers the entire request. Without a full match, a request for
http://git.example.com/project/shortlog/branch..gitsomething
would result in a 404 because the server would try to access the
the project 'project/shortlog/branch.'
The solution is still not bulletproof, so document the possible failing
case.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jn/gitweb-cleanup:
gitweb: Remove unused $hash_base parameter from normalize_link_target
gitweb: Simplify snapshot format detection logic in evaluate_path_info
gitweb: Use capturing parentheses only when you intend to capture
gitweb: Replace wrongly added tabs with spaces
gitweb: Use block form of map/grep in a few cases more
gitweb: Always use three argument form of open
gitweb: Always use three argument form of open
gitweb: Do not use bareword filehandles
Replace control characters with question mark '?' (like in
chop_and_esc_str).
A little background: some web browsers turn on strict (and
unforgiving) XML validating mode for XHTML documents served using
application/xhtml+xml content type. This means among others that
control characters are forbidden to appear in gitweb output.
CGI.pm does by default slight escaping (using simple_escape subroutine
from CGI::Util) of all _attribute_ values (depending on the value of
autoEscape, by default on). This escaping, at least in CGI.pm version
3.10 (most current version at CPAN is 3.43), is minimal: only '"',
'&', '<' and '>' are escaped using named HTML entity references
(", &, < and > respectively). But simple_escape does
not do escaping of control characters such as ^X which are invalid in
XHTML (in strict mode).
If by some accident commit message do contain some control character
in first 50 characters (more or less) of first line of commit message,
and this line is longer than 50 characters (so gitweb shortens it for
display), then gitweb would put this control character in title
attribute (and CGI.pm would not remove them). The tag _contents_ is
safe because it is escaped using esc_html() explicitly, and it
replaces control characters by their printable representation.
While at it: chop_and_escape_str doesn't need capturing group.
Noticed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
...since it was decided for normalize_link_target to only mangle
pathname, and do not try to check if target is present in $hash_base
tree, for performance reasons.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This issue was caught by perlcritic in harsh severity level noticing
that catch variable was used outside conditional thanks to the
Perl::Critic::Policy::RegularExpressions::ProhibitCaptureWithoutTest
policy. See "Perl Best Practices", chapter 12. Regular Expressions,
section 12.15. Captured Values:
Pattern matches that fail never assign anything to $1, $2, etc.,
nor do they leave those variables undefined. After an unsuccessful
pattern match, the numeric capture variables remain exactly as they
were before the match was attempted.
New version is in my opinion much easier to understand; previous
version worked correctly due to the fact that we returned from loop
on first found match.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Acked-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Non-capturing groups are useful because they have better runtime
performance and do not copy strings to the magic global capture
variables.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In two places there was hard tab character instead of space.
Fix this.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use block form of 'grep' i.e. 'grep {BLOCK} LIST' rather than
'grep(EXPR, LIST)' in filter_snapshot_fmts subroutine. This makes
code more readable, as expression is rather long, and statement above
there is 'map' with very similar expression also in the block form.
Remove unnecessary and misleading parentheses around block form 'map'
arguments in quote_command subroutine.
The inner "map" in format_snapshot_links was left alone, as it is not
clear whether adding parentheses or changing it into block form would
improve readibility and clarity of this code.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
From 94638fb6edf3ea693228c680a6a30271ccd77522 Mon Sep 17 00:00:00 2001
From: Jakub Narebski <jnareb@gmail.com>
Date: Mon, 11 May 2009 03:25:55 +0200
Subject: [PATCH] gitweb: Localize magic variable $/
Instead of undefining and then restoring magic variable $/ (input
record separator) for 'slurp mode', localize it.
While at it, state explicitely that "local $/;" makes it undefined, by
using explicit "local $/ = undef;".
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In most cases (except insert_file() subroutine) we used old two argument
form of 'open' to open files for reading. This can cause subtle bugs when
$projectroot or $projects_list file starts with mode characters ('>', '<',
'+<', '|', etc.) or with leading whitespace; and also when $projects_list
file or $mimetypes_file or ctags files end with trailing whitespace or '|'.
Additionally it is also more clear to explicitly state that we open those
files for reading.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitweb: Do not use bareword filehandles
The script was using bareword filehandles. This is considered a bad
practice so they have been changed to indirect filehandles.
Changes touch git_get_project_ctags and mimetype_guess_file;
while at it rename local variable from $mime to $mimetype (in
mimetype_guess_file) to better reflect its value (its contents).
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use of function prototypes is considered bad practice in Perl. The
ones used here didn't accomplish anything anyhow, so they've been
removed.
>From perlsub(1):
[...] the intent of this feature [prototypes] is primarily to let
you define subroutines that work like built-in functions [...]
you can generate new syntax with it [...]
We don't want to have subroutines behaving exactly like built-in
functions, we don't want to define new syntax / syntactic sugar, so
prototypes in gitweb are not needed... and they can have unintended
consequences.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix the detection of the requested snapshot format, which failed for
PATH_INFO URLs since the references to the hashes which describe the
supported snapshot formats weren't dereferenced appropriately.
Signed-off-by: Holger Weiß <holger@zedat.fu-berlin.de>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current implementation only hyperlinks the first hash on
a given line of the commit message. It seems sensible to
highlight all of them if there are multiple, and it seems
plausible that there would be multiple even with a tidy line
length limit, because they can be abbreviated as short as 8
characters.
Benchmark:
I wanted to make sure that using the 'e' switch to the Perl regex
wasn't going to kill performance, since this is called once per commit
message line displayed.
In all three A/B scenarios I tried, the A and B yielded the same
results within 2%, where A is the version of code before this patch
and B is the version after.
1: View a commit message containing the last 1000 commit hashes
2: View a commit message containing 1000 lines of 40 dots to avoid
hyperlinking at the same message length
3: View a short merge commit message with a few lines of text and
no hashes
All were run in CGI mode on my sub-production hardware on a recent
clone of git.git. Numbers are the average of 10 reqests per second
with the first request discarded, since I expect this change to affect
primarily CPU usage. Measured with ApacheBench.
Note that the web page rendered was the same; while the new code
supports multiple hashes per line, there was at most one per line.
The primary purpose of scenarios 2 and 3 were to verify that the
addition of 1000 commit messages had an impact on how much of the time
was spent rendering commit messages. They were all within 2% of 0.80
requests per second (much faster).
So I think the patch has no noticeable effect on performance.
Signed-off-by: Marcel M. Cary <marcel@oak.homeunix.org>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a feature like "blame" is permitted to be overridden in the
repository configuration but it is not actually set in the repository,
a warning is emitted due to the undefined value of the repository
configuration, even though it's a perfectly normal condition.
Emitting warning is grounds for test failure in the gitweb test
script.
This error was caused by rewrite of git_get_project_config from using
"git config [<type>] <name>" for each individual configuration
variable checked to parsing "git config --list --null" output in
commit b201927 (gitweb: Read repo config using 'git config -z -l').
Earlier version of git_get_project_config was returning empty string
if variable do not exist in config; newer version is meant to return
undef in this case, therefore change in feature_bool was needed.
Additionally config_to_* subroutines were meant to be invoked only if
configuration variable exists; therefore we added early return to
git_get_project_config: it now returns no value if variable does not
exists in config. Otherwise config_to_* subroutines (config_to_bool
in paryicular) wouldn't be able to distinguish between the case where
variable does not exist and the case where variable doesn't have value
(the "[section] noval" case, which evaluates to true for boolean).
While at it fix bug in config_to_bool, where checking if $val is
defined (if config variable has value) was done _after_ stripping
leading and trailing whitespace, which lead to 'Use of uninitialized
value' warning.
Add test case for features overridable but not overriden in repo
config, and case for no value boolean configuration variable.
Signed-off-by: Marcel M. Cary <marcel@oak.homeunix.org>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
CGI::url() has some issues when rebuilding the script URL if the script
is a DirectoryIndex.
One of these issue is the inability to strip PATH_INFO, which is why
we had to do it ourselves.
Another issue is that the resulting URL cannot be used for the <base>
tag: it works if we're the DirectoryIndex at the root level, but not
otherwise.
We fix this by building the proper base URL ourselves, and improve the
comment about the need to strip PATH_INFO manually while we're at it.
Additionally t/t9500-gitweb-standalone-no-errors.sh had to be modified
to set SCRIPT_NAME variable (CGI standard states that it MUST be set,
and now gitweb uses it if PATH_INFO is not empty, as is the case for
some of tests in t9500).
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a gitweb configuration variable $prevent_xss that disables features
to prevent content in repositories from launching cross-site scripting
(XSS) attacks in the gitweb domain. Currently, this option makes gitweb
ignore README.html (a better solution may be worked out in the future)
and serve a blob_plain file of an untrusted type with
"Content-Disposition: attachment", which tells the browser not to show
the file at its original URL.
The XSS prevention is currently off by default.
Signed-off-by: Matt McCutchen <matt@mattmccutchen.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make SHA-1 regexp to be turned into hyperlink (the SHA-1 committag)
to match word boundary at the beginning and the end. This way we
reduce number of false matches, for example we now don't match
0x74a5cd01 which is hex decimal (for example memory address),
but is not SHA-1.
Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One had to configure gitweb for it to find static files (stylesheets,
images) when using path_info URLs. Now that it is not necessary
thanks to adding BASE element to HTML head if needed, update README to
reflect this fact.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document some possible Apache configurations when the path_info feature
is enabled in gitweb.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Gitweb links to a number of static files such as CSS stylesheets,
favicon or the git logo. When, such as with the default Makefile, the
paths to these files are relative (i.e. doesn't start with a "/"), the
files become inaccessible in any view other tha project list and summary
page if gitweb is invoked with a non-empty PATH_INFO.
Fix this by adding a <base> element pointing to the script's own URL,
which ensure that all relative paths will be resolved correctly.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Offering Last-modified header for feeds is only half the work, even if
we bail out early on HEAD requests. We should also check that same date
against If-modified-since, and bail out early with 304 Not Modified if
that's the case.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The last-modified time header added by RSS to increase cache hits from
readers should be set to the date the repository was last modified. The
author time in this respect is not a good guess because the last commit
might come from a oldish patch.
Use the committer time for the last-modified header to ensure a more
correct guess of the last time the repository was modified.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The RSS 2.0 specifications defines not one but _two_ dates for its
channel element! Woohoo! Luckily, it seems that consensus seems to be
that if both are present they should be equal, except for some very
obscure and discouraged cases. Since lastBuildDate would make more sense
for us and pubDate seems to be the most commonly used, we defined both
and make them equal.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The RSS 2.0 specification allows an optional managingEditor tag for the
channel, containing the "email address for person responsible for editorial
content", which is basically the project owner.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add <generator> tag to RSS and Atom feed. Versioning info (gitweb/git
core versions, separated by a literal slash) is stored in the
appropriate attribute for the Atom feed, and in the tag content for the
RSS feed.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Define the channel image for the rss feed when the logo or favicon are
defined, preferring the former to the latter. As suggested in the RSS
2.0 specifications, the image's title and link as set to the same as the
channel's.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>