Commit Graph

828 Commits

Author SHA1 Message Date
Jeff King
0f0ecf68b3 gitweb: escape html in rss title
The title of an RSS feed is generated from many components,
including the filename provided as a query parameter, but we
failed to quote it.  Besides showing the wrong output, this
is a vector for XSS attacks.

Signed-off-by: Jeff King <peff@peff.net>
2012-11-12 16:34:53 -05:00
Richard Hubbell
048b399192 gitweb.perl: fix %highlight_ext mappings
When commit 592ea41 refactored the list of extensions for
syntax highlighting, it failed to take into account perl's
operator precedence within lists. As a result, we end up
creating a dictionary of one-to-one elements when the intent
was to map mutliple related types to one main type (e.g.,
bash, ksh, zsh, and sh should all map to sh since they share
similar syntax, but we ended up just mapping "bash" to
"bash" and so forth).

This patch adds parentheses to make the mapping as the
original change intended. It also reorganizes the list to
keep mapped extensions together.

Signed-off-by: Richard Hubbell <richard_hubbe11@lavabit.com>
Signed-off-by: Jeff King <peff@peff.net>
2012-11-08 12:59:23 -05:00
Dylan Alex Simon
debf29dc29 gitweb.cgi: fix "comitter_tz" typo in feed
gitweb's feeds sometimes contained committer timestamps in the wrong timezone
due to a misspelling.

Signed-off-by: Dylan Simon <dylan@dylex.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-12 08:43:12 -07:00
Junio C Hamano
e3f26752b5 Merge branch 'maint-1.7.11' into maint
* maint-1.7.11:
  Almost 1.7.11.6
  gitweb: URL-decode $my_url/$my_uri when stripping PATH_INFO
  rebase -i: use full onto sha1 in reflog
  sh-setup: protect from exported IFS
  receive-pack: do not leak output from auto-gc to standard output
  t/t5400: demonstrate breakage caused by informational message from prune
  setup: clarify error messages for file/revisions ambiguity
  send-email: improve RFC2047 quote parsing
  fsck: detect null sha1 in tree entries
  do not write null sha1s to on-disk index
  diff: do not use null sha1 as a sentinel value
2012-09-10 15:31:06 -07:00
Junio C Hamano
3d4003bdc7 Merge branch 'js/gitweb-path-info-unquote' into maint-1.7.11
"gitweb" when used with PATH_INFO failed to notice directories with
SP (and other characters that need URL-style quoting) in them.

* js/gitweb-path-info-unquote:
  gitweb: URL-decode $my_url/$my_uri when stripping PATH_INFO
2012-09-10 15:23:46 -07:00
Jay Soffian
cacfc09ba8 gitweb: URL-decode $my_url/$my_uri when stripping PATH_INFO
When gitweb is used as a DirectoryIndex, it attempts to strip
PATH_INFO on its own, as $cgi->url() fails to do so.

However, it fails to account for the fact that PATH_INFO has
already been URL-decoded by the web server, but the value
returned by $cgi->url() has not been. This causes the stripping
to fail whenever the URL contains encoded characters.

To see this in action, setup gitweb as a DirectoryIndex and
then use it on a repository with a directory containing a
space in the name. Navigate to tree view, examine the gitweb
generated html and you'll see a link such as:

  <a href="/test.git/tree/HEAD:/directory with spaces">directory with spaces</a>

When clicked on, the browser will URL-encode this link, giving
a $cgi->url() of the form:

   /test.git/tree/HEAD:/directory%20with%20spaces

While PATH_INFO is:

   /test.git/tree/HEAD:/directory with spaces

Fix this by calling unescape() on both $my_url and $my_uri before
stripping PATH_INFO from them.

Signed-off-by: Jay Soffian <jaysoffian@gmail.com>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-15 11:47:43 -07:00
Junio C Hamano
6da9ded763 Merge branch 'nk/maint-gitweb-log-by-lines'
Teach gitweb to pay attention to various forms of credits that are
similar to "Signed-off-by:" lines.

* nk/maint-gitweb-log-by-lines:
  gitweb: Add support to Link: tag
  gitweb: Handle other types of tag in git_print_log
  gitweb: Cleanup git_print_log()
2012-07-23 20:55:07 -07:00
Namhyung Kim
66c857e1ae gitweb: Add support to Link: tag
The tip tree is the one of major subsystem tree in the
Linux kernel project. On the tip tree, the Link: (or
similar Buglink:) tag is used for tracking the original
discussion or context. Since it's ususally in the S-o-b
area, it'd be better using same style with others.

Also as it tends to contain a message-id sent from git
send-email, a part of the line would set a wrong hyperlink
like [1]. Fix it by not using format_log_line_html().

[1] git.kernel.org/?p=linux/kernel/git/tip/tip.git;a=commit;h=08942f6d5d992e9486b07653fd87ea8182a22fa0

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-05 15:29:12 -07:00
Namhyung Kim
3d1110aa72 gitweb: Handle other types of tag in git_print_log
There are many types of tags used in S-o-b area [1].
Update the regex to handle them properly. It requires
the tag should be started by a capital letter and ended
by '-by: ' or '-By: '. The only exception is 'Cc: '.

[1] http://lwn.net/Articles/503829/

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-05 15:23:35 -07:00
Namhyung Kim
5a45c0cafe gitweb: Cleanup git_print_log()
When we see a signed-off-by line (and its friends), we set $signoff
to true, but then we process the next line after we are done without
giving control to the rest of the loop.  And when the line we saw is
not a signed-off-by line, we reset $signoff to false before running
the remainder of the loop.

Hence, the check for $signoff that attempts to remove an extra empty
line between two signed-off-by line was not doing anything useful.

Rename $empty to a more explicit name $skip_blank_line to tell us to
skip a blank line when we see one, set it after we see and emit a
blank line (to avoid showing more than one empty lines in a raw) or
after we handle a signed-off-by line (to avoid empty lines after
such a line), to fix this bug, and get rid of the $signoff variable
that is not useful.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-05 15:22:45 -07:00
Junio C Hamano
55375e9473 Merge branch 'kk/gitweb-omit-expensive'
"gitweb" learned to optionally omit output of fields that are expensive
to generate.

By Kacper Kornet
* kk/gitweb-omit-expensive:
  gitweb: Option to not display information about owner
  gitweb: Option to omit column with time of the last change
2012-04-29 17:52:00 -07:00
Junio C Hamano
070d5271e4 Merge branch 'kk/maint-gitweb-missing-owner'
By Kacper Kornet
* kk/maint-gitweb-missing-owner:
  gitweb: Don't set owner if got empty value from projects.list
2012-04-29 17:51:56 -07:00
Kacper Kornet
0ebe7827b6 gitweb: Option to not display information about owner
In some setups the repository owner is not a well defined concept
and administrator can prefer it to be not shown. This commit add
and an option that enable to reach this effect.

Signed-off-by: Kacper Kornet <draenog@pld-linux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-26 11:24:40 -07:00
Kacper Kornet
5710be46d8 gitweb: Option to omit column with time of the last change
Generating information about last change for a large number of git
repositories can be very time consuming. This commit add an option to
omit 'Last Change' column when presenting the list of repositories.

Signed-off-by: Kacper Kornet <draenog@pld-linux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-25 16:42:34 -07:00
Kacper Kornet
75e0dffef0 gitweb: Don't set owner if got empty value from projects.list
Prevent setting owner to an empty value if it is not specified in
projects.list file. Otherwise it stops retrieving information about the
owner from other files.

Signed-off-by: Kacper Kornet <draenog@pld-linux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-25 16:41:37 -07:00
Junio C Hamano
c6cfa5bbbb Merge branch 'mk/gitweb-diff-hl'
"gitweb" learns to highlight the patch it outputs even more.

By Michał Kiedrowicz (7) and Jakub Narębski (1)
* mk/gitweb-diff-hl:
  gitweb: Refinement highlightning in combined diffs
  gitweb: Highlight interesting parts of diff
  gitweb: Push formatting diff lines to print_diff_chunk()
  gitweb: Use print_diff_chunk() for both side-by-side and inline diffs
  gitweb: Extract print_sidebyside_diff_lines()
  gitweb: Pass esc_html_hl_regions() options to esc_html()
  gitweb: esc_html_hl_regions(): Don't create empty <span> elements
  gitweb: Use descriptive names in esc_html_hl_regions()
2012-04-24 14:41:01 -07:00
Junio C Hamano
0f3ddd4a3a Merge branch 'wk/gitweb-snapshot-use-if-modified-since'
Makes 'snapshot' request to "gitweb" honor If-Modified-Since: header,
based on the commit date.

By W. Trevor King
* wk/gitweb-snapshot-use-if-modified-since:
  gitweb: add If-Modified-Since handling to git_snapshot().
  gitweb: refactor If-Modified-Since handling
  gitweb: add `status` headers to git_feed() responses.
2012-04-16 12:42:48 -07:00
Michał Kiedrowicz
51ef7a6e80 gitweb: Refinement highlightning in combined diffs
The highlightning of combined diffs is currently disabled.  This is
because output from a combined diff is much harder to highlight because
it is not obvious which removed and added lines should be compared.

Current code requires that the number of added lines is equal to the
number of removed lines and only skips first +/- character, treating
second +/- as a line content, Thus, it is not possible to simply use
existing algorithm unchanged for combined diffs.

Let's start with a simple case: only highlight changes that come from
one parent, i.e. when every removed line has a corresponding added line
for the same parent.  This way the highlightning cannot get wrong. For
example, following diffs would be highlighted:

	- removed line for first parent
	+ added line for first parent
	  context line
	 -removed line for second parent
	 +added line for second parent

or

	- removed line for first parent
	 -removed line for second parent
	+ added line for first parent
	 +added line for second parent

but following output will not:

	- removed line for first parent
	 -removed line for second parent
	 +added line for second parent
	++added line for both parents

In other words, we require that pattern of '-'-es in pre-image matches
pattern of '+'-es in post-image.

Further changes may introduce more intelligent approach that better
handles combined diffs.

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-11 14:26:02 -07:00
Michał Kiedrowicz
5fb6ddf67a gitweb: Highlight interesting parts of diff
Reading diff output is sometimes very hard, even if it's colored,
especially if lines differ only in few characters.  This is often true
when a commit fixes a typo or renames some variables or functions.

This commit teaches gitweb to highlight characters that are different
between old and new line with a light green/red background.  This should
work in the similar manner as in Trac or GitHub.

The algorithm that compares lines is based on contrib/diff-highlight.
Basically, it works by determining common prefix/suffix of corresponding
lines and highlightning only the middle part of lines.  For more
information, see contrib/diff-highlight/README.

Combined diffs are not supported but a following commit will change it.

Since we need to pass esc_html()'ed or esc_html_hl_regions()'ed lines to
format_diff_lines(), so it was taught to accept preformatted lines
passed as a reference.

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-11 14:26:02 -07:00
Michał Kiedrowicz
f4a8102650 gitweb: Push formatting diff lines to print_diff_chunk()
Now lines are formatted closer to place where we actually use HTML
formatted output.

This means that we put raw lines in the @chunk accumulator, rather than
formatted lines.  Because we still need to know class (type) of line
when accumulating data to post-process and print, process_diff_line()
subroutine was retired and replaced by diff_line_class() used in
git_patchset_body() and new restructured format_diff_line() used in
print_diff_chunk().

As a side effect, we have to pass \%from and \%to down to callstack.

This is a preparation patch for diff refinement highlightning. It's not
meant to change gitweb output.

[jn: wrote commit message]

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-11 14:26:02 -07:00
Michał Kiedrowicz
44185f93f4 gitweb: Use print_diff_chunk() for both side-by-side and inline diffs
This renames print_sidebyside_diff_chunk() to print_diff_chunk() and
makes use of it for both side-by-side and inline diffs.  Now diff lines
are always accumulated before they are printed.  This opens the
possibility to preprocess diff output before it's printed, which is
needed for diff refinement highlightning (implemented in incoming
patches).

If print_diff_chunk() was left as is, the new function
print_inline_diff_lines() could reorder diff lines.  It first prints all
context lines, then all removed lines and finally all added lines.  If
the diff output consisted of mixed added and removed lines, gitweb would
reorder these lines.  This is true for combined diff output, for
example:

	 - removed line for first parent
	 + added line for first parent
	  -removed line for second parent
	 ++added line for both parents

would be rendered as:

	- removed line for first parent
	 -removed line for second parent
	+ added line for first parent
	++added line for both parents

To prevent gitweb from reordering lines, print_diff_chunk() calls
print_diff_lines() as soon as it detects that both added and removed
lines are present and there was a class change, and at the end of chunk.

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-11 14:26:01 -07:00
Michał Kiedrowicz
d21102c9ff gitweb: Extract print_sidebyside_diff_lines()
Currently, print_sidebyside_diff_chunk() does two things: it
accumulates diff lines and prints them.  Accumulation may be used to
perform additional operations on diff lines, so it makes sense to split
these two things.  Thus, whole code that formats and prints diff lines
in the 'side-by-side' manner is moved out of print_sidebyside_diff_chunk()
to a separate subroutine and two conditions that control printing
diff liens are merged.

Thanks to that, we can easily (in later patches) replace call to that
subroutine with a call to more generic print_diff_lines() that will
control whether 'inline' or 'side-by-side' diff should be printed.

As a side effect, context lines are printed just before printing added
and removed lines, and at the end of chunk (previously, they were
printed immediately on the class change).  However, this doesn't change
gitweb output.

The outcome of this patch is that print_sidebyside_diff_chunk() is now
much shorter and easier to read.

While at it, drop the '# assume that it is change' comment.  According
to Jakub Narębski:

	What I meant here when I was writing it that they are lines that
	changed between two versions, like '!' in original (not unified)
	context format.

	We can omit this comment.

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-11 14:26:01 -07:00
Jakub Narębski
9768a9d884 gitweb: Pass esc_html_hl_regions() options to esc_html()
With this change, esc_html_hl_regions() accepts options and passes them
down to esc_html().  This may be needed if a caller wants to pass
-nbsp=>1 to esc_html().

The idea and implementation example of this change was described in
337da8d2 (gitweb: Introduce esc_html_match_hl and esc_html_hl_regions,
2012-02-27).  While other suggestions may be more useful in some cases,
there is no need to implement them at the moment.  The
esc_html_hl_regions() interface may be changed later if it's needed.

[mk: extracted from larger patch and wrote commit message]

Signed-off-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-11 14:26:01 -07:00
Michał Kiedrowicz
cbbea3dfc1 gitweb: esc_html_hl_regions(): Don't create empty <span> elements
If $end is equal to or less than $begin, esc_html_hl_regions()
generates an empty <span> element.  It normally shouldn't be visible in
the web browser, but it doesn't look good when looking at page source.
It also minimally increases generated page size for no special reason.

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-11 14:26:01 -07:00
Michał Kiedrowicz
ce61fb968f gitweb: Use descriptive names in esc_html_hl_regions()
The $s->[0] and $s->[1] variables look a bit cryptic.  Let's rename them
to $begin and $end so that it's clear what they do.

Suggested-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-11 14:26:01 -07:00
Sebastian Pipping
cc999a3a08 gitweb: Fix unintended "--no-merges" for regular Atom feed
The print_feed_meta() subroutine generates links for feeds with and
without merges, in RSS and Atom formats.  However because %href_params
was not properly reset, it generated links with "--no-merges" for all
except the very first link.

Before:
<link rel="alternate" title="[..] - Atom feed" href="/?p=.git;a=atom;opt=--no-merges" type="application/atom+xml" />
<link rel="alternate" title="[..] - Atom feed (no merges)" href="/?p=.git;a=atom;opt=--no-merges" type="application/atom+xml" />

After:
<link rel="alternate" title="[..] - Atom feed" href="/?p=.git;a=atom" type="application/atom+xml" />
<link rel="alternate" title="[..] - Atom feed (no merges)" href="/?p=.git;a=atom;opt=--no-merges" type="application/atom+xml" />

Signed-off-by: Sebastian Pipping <sebastian@pipping.org>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-11 09:51:00 -07:00
W. Trevor King
8745db63ca gitweb: add If-Modified-Since handling to git_snapshot().
Because snapshots can be large, you can save some bandwidth by
supporting caching via If-Modified-Since.  This patch adds support for
the i-m-s request to git_snapshot() if the request is a commit.
Requests for snapshots of trees, which lack well defined timestamps,
are still handled as they were before.

Signed-off-by: W Trevor King <wking@drexel.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-30 09:09:59 -07:00
W. Trevor King
b7d565ea4c gitweb: refactor If-Modified-Since handling
The current gitweb only generates Last-Modified and handles
If-Modified-Since headers for the git_feed action.  This patch breaks
the Last-Modified and If-Modified-Since handling code out from
git_feed into a new function exit_if_unmodified_since.  This makes the
code easy to reuse for other actions.

Only gitweb actions which can easily calculate a modification time
should use exit_if_unmodified_since, as the goal is to balance local
processing time vs. upload bandwidth.

Signed-off-by: W Trevor King <wking@drexel.edu>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-30 09:08:52 -07:00
W. Trevor King
e1c3643ff7 gitweb: add status headers to git_feed() responses.
The git_feed() method was not setting a `Status` header unless it was
responding to an If-Modified-Since request with `304 Not Modified`.
Now, when it is serving successful responses, it sets status to `200
OK`.

Signed-off-by: W Trevor King <wking@drexel.edu>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-30 09:07:26 -07:00
Junio C Hamano
0f360763c0 Merge branch 'maint-1.7.8' into maint
* maint-1.7.8:
  t/Makefile: Use $(sort ...) explicitly where needed
  gitweb: Fix actionless dispatch for non-existent objects
  i18n of multi-line advice messages
2012-03-20 15:53:30 -07:00
Junio C Hamano
f629c233e6 Merge branch 'jn/maint-do-not-match-with-unsanitized-searchtext' into maint
"gitweb" did use quotemeta() to prepare search string when asked to
do a fixed-string project search, but did not use it by mistake and
used the user-supplied string instead.

By Jakub Narebski
* jn/maint-do-not-match-with-unsanitized-searchtext:
  gitweb: Fix fixed string (non-regexp) project search
2012-03-12 15:45:58 -07:00
Junio C Hamano
aa145bf6f1 Merge branch 'jn/maint-do-not-match-with-unsanitized-searchtext'
By Jakub Narebski
* jn/maint-do-not-match-with-unsanitized-searchtext:
  gitweb: Fix fixed string (non-regexp) project search

Conflicts:
	gitweb/gitweb.perl
2012-03-08 13:04:49 -08:00
Jakub Narebski
e65ceb61cd gitweb: Fix fixed string (non-regexp) project search
Use $search_regexp, where regex metacharacters are quoted, for
searching projects list, rather than $searchtext, which contains
original search term.

Reported-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-06 14:48:24 -08:00
Junio C Hamano
6759f95d63 Merge branch 'jn/gitweb-hilite-regions'
* jn/gitweb-hilite-regions:
  gitweb: Highlight matched part of shortened project description
  gitweb: Highlight matched part of project description when searching projects
  gitweb: Highlight matched part of project name when searching projects
  gitweb: Introduce esc_html_match_hl and esc_html_hl_regions
2012-03-04 23:35:18 -08:00
Junio C Hamano
3ecd0c8b4d Merge branch 'jn/maint-gitweb-invalid-regexp' into maint
* jn/maint-gitweb-invalid-regexp:
  gitweb: Handle invalid regexp in regexp search
2012-03-04 22:17:47 -08:00
Jakub Narebski
b22939a286 gitweb: Fix passing parameters to git_project_search_form
The git_project_search_form() subroutine, introduced in a1e1b2d
(gitweb: improve usability of projects search form, 2012-01-31) didn't
get its arguments from caller correctly.  Gitweb worked correctly
thanks to sticky-ness of form fields in CGI.pm... but it make UTF-8
fix for project search not working.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-02 15:55:09 -08:00
Junio C Hamano
6a3a3db73f Merge branch 'jn/maint-gitweb-invalid-regexp'
* jn/maint-gitweb-invalid-regexp:
  gitweb: Handle invalid regexp in regexp search
2012-03-01 14:44:38 -08:00
Jakub Narebski
36612e4daf gitweb: Handle invalid regexp in regexp search
When using regexp search ('sr' parameter / $search_use_regexp variable
is true), check first that regexp is valid.

Without this patch we would get an error from Perl during search (if
searching is performed by gitweb), or highlighting matches substring
(if applicable), if user provided invalid regexp... which means broken
HTML, with error page (including HTTP headers) generated after gitweb
already produced some output.

Add test that illustrates such error: for example for regexp "*\.git"
we would get the following error:

  Quantifier follows nothing in regex; marked by <-- HERE in m/* <-- HERE \.git/
  at /var/www/cgi-bin/gitweb.cgi line 3084.

Reported-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-28 11:45:31 -08:00
Junio C Hamano
e22c522730 Merge branch 'jn/gitweb-unborn-head' into maint
* jn/gitweb-unborn-head:
  gitweb: Fix "heads" view when there is no current branch
2012-02-27 15:33:26 -08:00
Junio C Hamano
507fba2b98 Merge branch 'jn/gitweb-search-optim'
* jn/gitweb-search-optim:
  gitweb: Faster project search
  gitweb: Option for filling only specified info in fill_project_list_info
  gitweb: Refactor checking if part of project info need filling
2012-02-26 23:05:56 -08:00
Jakub Narebski
e607b79fb1 gitweb: Highlight matched part of shortened project description
Previous commit make gitweb use esc_html_match_hl() to mark match in
the _whole_ description of a project when searching projects.

This commit makes gitweb highlight match in _shortened_ description,
based on match in whole description, using esc_html_match_hl_chopped()
subroutine.

If match is in removed (chopped) part, even partially, then trailing
"... " is highlighted.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-26 22:02:58 -08:00
Jakub Narebski
5fb3cf2317 gitweb: Highlight matched part of project description when searching projects
Use esc_html_match_hl() from earlier commit to mark match in the
_whole_ description when searching projects.

Currently, with this commit, when searching projects there is always
shown full description of a project, and not a shortened one (like for
ordinary projects list view), even if the match is on project name and
not project description.  Because we always show full description of a
project, and not possibly shortened name, there is no need for having
full description on mouseover via title attribute.

Showing full description when there is match on it is useful to avoid
situation where match is in shortened, invisible part.  On the other
hand that makes project search different than projects list view; also
there can be problems with overly-long project descriptions.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-26 22:02:57 -08:00
Jakub Narebski
07a40062ae gitweb: Highlight matched part of project name when searching projects
Use esc_html_match_hl() introduced in previous commit to escape HTML
and mark match, using span element with 'match' class.  Currently only
the 'path' part (i.e. the project name) is highlighted; match might be
on the project description.  Highlighting match in description is left
for next commit.

The code makes use of the fact that defined $search_regexp means that
there was search going on.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-26 22:02:56 -08:00
Jakub Narebski
337da8d2b0 gitweb: Introduce esc_html_match_hl and esc_html_hl_regions
The esc_html_match_hl() subroutine added in this commit will be used
to highlight *all* matches of given regexp, using 'match' class.
Ultimately it is to be used in all match highlighting, starting
with project search, which does not have it yet.

It uses the esc_html_hl_regions() subroutine, which is meant to
highlight in a given string a list of regions (given as a list of
[ beg, end ] pairs of positions in string), using HTML <span> element
with given class.  It could probably be used in other places that
do highlighting of part of ready line, like highlighting of changes
in a diff (diff refinement highlighting).

Implementation and enhancement notes:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* Currently esc_html_hl_regions() subroutine doesn't accept any
  parameters, like esc_html() does.  We might want for example to
  pass  nbsp=>1  to it.

  It can easily be done with the following code:

    my %opts = grep { ref($_) ne "ARRAY" } @sel;
    @sel     = grep { ref($_) eq "ARRAY" } @sel;

  This allow adding parameters after or before regions, e.g.:

    esc_html_hl_regions("foo bar", "mark", [ 0, 3 ], -nbsp => 1);

* esc_html_hl_regions() escapes like esc_html(); if we wanted to
  highlight with esc_path(), we could pass subroutine reference
  to now named esc_gen_hl_regions().

    esc_html_hl_regions("foo bar", "mark", \&esc_path, [ 0, 3 ]);

  Note that this way we can handle -nbsp=>1 case automatically,
  e.g.

    esc_html_hl_regions("foo bar", "mark",
                        sub { esc_html(@_, -nbsp=>1) },
                        [ 0, 3 ]);

* Alternate solution for highlighting region of a string would be to
  use the idea that strings are to be HTML-escaped, and references to
  scalars are HTML (like in the idea for generic committags).

  This would require modifying gitweb code or esc_html to get list of
  fragments, e.g.:

    esc_html(\'<span class="mark">', 'foo', \'</span>', ' bar',
             { -nbsp => 1 });

  or

    esc_html([\'<span class="mark">', 'foo', \'</span>', ' bar'],
             -nbsp=>1);

  esc_html_match_hl() could be then simple wrapper around "match
  formatter", e.g.

    esc_html([ render_match_hl($str, $regexp) ], -nbsp=>1);

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-26 22:02:54 -08:00
Jakub Narebski
07b257f940 gitweb: Faster project search
Before searching by some field the information we search for must be
filled in, but we do not have to fill other fields that are not
involved in the search.

To be able to request filling only specified fields,
fill_project_list_info() was enhanced in previous commit to take
additional parameters which specify part of projects info to fill.
This way we can limit doing expensive calculations (like running
git-for-each-ref to get 'age' / "Last changed" info) to doing those
only for projects which we will show as search results.

This commit actually uses this interface, changing gitweb code from
the following behavior

  fill all project info on all projects
  search projects

to behaving like this pseudocode

  fill search fields on all projects
  search projects
  fill all project info on search results

With this commit the number of git commands used to generate search
results is 2*<matched projects> + 1, and depends on number of matched
projects rather than number of all projects (all repositories).

Note: this is 'git for-each-ref' to find last activity, and 'git config'
for each project, and 'git --version' once.

Example performance improvements, for search that selects 2
repositories out of 12 in total:

* Before (warm cache):
  "This page took 0.867151 seconds  and 27 git commands to generate."

* After (warm cache):
  "This page took 0.673643 seconds  and 5 git commands to generate."

Now imagine that they are 5 repositories out of 5000, and cold or
trashed cache case.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-23 12:53:03 -08:00
Jakub Narebski
2e3291ae1d gitweb: Option for filling only specified info in fill_project_list_info
Enhance fill_project_list_info() subroutine to accept optional
parameters that specify which fields in project information needs to
be filled.  If none are specified then fill_project_list_info()
behaves as it used to, and ensure that all project info is filled.

This is in preparation of future lazy filling of project info in
project search and pagination of sorted list of projects.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-23 12:52:16 -08:00
Jakub Narebski
14b289bdf7 gitweb: Refactor checking if part of project info need filling
Extract the check if given keys (given parts) of project info needs to
be filled into project_info_needs_filling() subroutine.  It is for now
a thin wrapper around "!exists $project_info->{$key}".

Note that !defined was replaced by more correct !exists.

While at it uniquify treating of all project info, adding checks for
'age' field before running git_get_last_activity(), and also checking
for all keys filled in code protected by conditional, and not only
one.

The code now looks like this

  foreach my $project (@$project_list) {
  	if (given keys need to be filled) {
  		fill given keys
  	}
  	...
  }

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-23 12:49:37 -08:00
Junio C Hamano
5609586f65 Merge branch 'jn/gitweb-unborn-head'
* jn/gitweb-unborn-head:
  gitweb: Fix "heads" view when there is no current branch
2012-02-21 15:25:53 -08:00
Junio C Hamano
2c8fb23ac7 Merge branch 'maint'
* maint:
  Update draft release notes to 1.7.9.2
  gitweb: Fix 'grep' search for multiple matches in file
2012-02-20 00:14:17 -08:00
Jakub Narebski
fc8fcd27e6 gitweb: Fix 'grep' search for multiple matches in file
Commit ff7f218 (gitweb: Fix file links in "grep" search, 2012-01-05),
added $file_href variable, to reduce duplication and have the fix
applied in single place.

Unfortunately it made variable defined inside the loop, not taking into
account the fact that $file_href was set only if file changed.
Therefore for files with multiple matches $file_href was undefined for
second and subsequent matches.

Fix this bug by moving $file_href declaration outside loop.

Adds tests for almost all forms of sarch in gitweb, which were missing
from testuite.  Note that it only tests if there are no warnings, and
it doesn't check that gitweb finds what it should find.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-19 22:21:35 -08:00
Jakub Narebski
fd49e56af6 gitweb: Fix "heads" view when there is no current branch
In a repository whose HEAD points to an unborn branch with no commits,
"heads" view and "summary" view (which shows what is shown in "heads"
view) compared the object names of commits at the tip of branches with the
output from "git rev-parse HEAD", which caused comparison of a string with
undef and resulted in a warning in the server log.

This can happen if non-bare repository (with default 'master' branch)
is updated not via committing but by other means like push to it, or
Gerrit.  It can happen also just after running "git checkout --orphan
<new branch>" but before creating any new commit on this branch.

Rewrite the comparison so that it also works when $head points at nothing;
in such a case, no branch can be "the current branch", add a test for it.
While at it, rename local variable $head to $head_at, as it points to
current commit rather than current branch name (HEAD contents).

The code still incorrectly shows all branches that point at the same
commit as what HEAD points as "the current branch", even when HEAD is
detached. Fixing this bug is outside the scope of this patch.

Reported-by: Rajesh Boyapati
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-17 08:25:30 -08:00
Junio C Hamano
a49060324a Merge branch 'bl/gitweb-project-filter'
* bl/gitweb-project-filter:
  gitweb: Harden and improve $project_filter page title
2012-02-14 12:57:17 -08:00
Junio C Hamano
d1168b9033 Merge branch 'jn/gitweb-search-utf-8'
* jn/gitweb-search-utf-8:
  gitweb: Allow UTF-8 encoded CGI query parameters and path_info

Conflicts:
	gitweb/gitweb.perl
2012-02-12 22:43:24 -08:00
Jakub Narebski
f4212089c2 gitweb: Harden and improve $project_filter page title
Commit 19d2d23 (gitweb: add project_filter to limit project list
to a subdirectory, 2012-01-30) added also support for displaying
$project_filter, if present, in page title.

Unfortunately it forgot to treat $project_filter as path, and escape
it using esc_path(), like it is done for $filename.

Also, it was not obvious that "$site_name - $project_filter" is about
project filtering: use "$site_name - projects in '$project_filter'".

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-12 22:11:31 -08:00
Junio C Hamano
715d130460 Merge branch 'bl/gitweb-project-filter'
* bl/gitweb-project-filter:
  gitweb: Make project search respect project_filter
  gitweb: improve usability of projects search form
  gitweb: place links to parent directories in page header
  gitweb: show active project_filter in project_list page header
  gitweb: limit links to alternate forms of project_list to active project_filter
  gitweb: add project_filter to limit project list to a subdirectory
  gitweb: prepare git_get_projects_list for use outside 'forks'.
  gitweb: move hard coded .git suffix out of git_get_projects_list
2012-02-07 12:57:05 -08:00
Jakub Narebski
84d9e2d50c gitweb: Allow UTF-8 encoded CGI query parameters and path_info
Gitweb forgot to turn query parameters into UTF-8. This results in a bug
that one cannot search for a string with characters outside US-ASCII.  For
example searching for "Michał Kiedrowicz" (containing letter 'ł' - LATIN
SMALL LETTER L WITH STROKE, with Unicode codepoint U+0142, represented
with 0xc5 0x82 bytes in UTF-8 and percent-encoded as %C5%82) result in the
following incorrect data in search field

	MichaÅ\202 Kiedrowicz

This is caused by CGI by default treating '0xc5 0x82' bytes as two
characters in Perl legacy encoding latin-1 (iso-8859-1), because 's'
query parameter is not processed explicitly as UTF-8 encoded string.

The solution used here follows "Using Unicode in a Perl CGI script"
article on http://www.lemoda.net/cgi/perl-unicode/index.html:

	use CGI;
	use Encode 'decode_utf8;
	my $value = params('input');
	$value = decode_utf8($value);

Decoding UTF-8 is done when filling %input_params hash and $path_info
variable; the former requires to move from explicit $cgi->param(<label>)
to $input_params{<name>} in a few places, which is a good idea anyway.

Also add -override=>1 parameter to $cgi->textfield() invocation in search
form.  Otherwise CGI would use values from query string if it is present,
filling value from $cgi->param... without decode_utf8().  As we are using
value of appropriate parameter anyway, -override=>1 doesn't change the
situation but makes gitweb fill search field correctly.

We could simply use the '-utf8' pragma (via "use CGI '-utf8';") to solve
this, but according to CGI.pm documentation, it may cause problems with
POST requests containing binary files, and it requires CGI 3.31 (I think),
released with perl v5.8.9.

Reported-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Signed-off-by: Jakub Narębski <jnareb@gmail.com>
Tested-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-03 13:03:08 -08:00
Jakub Narebski
abc0c9d2d7 gitweb: Make project search respect project_filter
Make gitweb search within filtered projects (i.e. projects shown), and
change "List all projects" to "List all projects in '$project_filter/'"
if project_filter is used.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-01 12:52:50 -08:00
Jakub Narebski
a1e1b2d77b gitweb: improve usability of projects search form
Refactor generating project search form into git_project_search_form().

Make text field wider and add on mouse over explanation (via "title"
attribute), add an option to use regular expressions, and replace
'Search:' label with [Search] button.

Also add "List all projects" link to make it easier to go back from search
result to list of all projects (note that an empty search term is
disallowed).

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-01 12:52:50 -08:00
Bernhard R. Link
4426ba2919 gitweb: place links to parent directories in page header
Change html page headers to not only link the project root and the
currently selected project but also the directories in between using
project_filter. (Allowing to jump to a list of all projects within
that intermediate directory directly and making the project_filter
feature visible to users).

Signed-off-by: Bernhard R. Link <brlink@debian.org>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-01 12:52:50 -08:00
Bernhard R. Link
40efa22309 gitweb: show active project_filter in project_list page header
In the page header of a project_list view with a project_filter
given show breadcrumbs in the page headers showing which directory
it is currently limited to and also containing links to the parent
directories.

Signed-off-by: Bernhard R. Link <brlink@debian.org>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-01 12:52:49 -08:00
Bernhard R. Link
56efd9d252 gitweb: limit links to alternate forms of project_list to active project_filter
If project_list action is given a project_filter argument, pass that to
TXT and OPML formats.

This way [OPML] and [TXT] links provide the same list of projects as
the projects_list page they are linked from.

Signed-off-by: Bernhard R. Link <brlink@debian.org>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-01 12:52:49 -08:00
Bernhard R. Link
19d2d23998 gitweb: add project_filter to limit project list to a subdirectory
This commit changes the project listing views (project_list,
project_index and opml) to limit the output to only projects in a
subdirectory if the new optional parameter ?pf=directory name is
used.

The implementation of the filter reuses the implementation used for
the 'forks' action (i.e. listing all projects within that directory
from the projects list file (GITWEB_LIST) or only projects in the
given subdirectory of the project root directory without a projects
list file).

Reusing $project instead of adding a new parameter would have been
nicer from a UI point-of-view (including PATH_INFO support) but
would complicate the $project validating code that is currently
being used to ensure nothing is exported that should not be viewable.

Signed-off-by: Bernhard R. Link <brlink@debian.org>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-01 12:52:49 -08:00
Bernhard R. Link
348a6589e0 gitweb: prepare git_get_projects_list for use outside 'forks'.
Use of the filter option of git_get_projects_list is currently limited
to forks. It currently assumes the project belonging to the filter
directory was already validated to be visible in the project list.

To make it more generic add an optional argument to denote visibility
verification is still needed.

If there is a projects list file (GITWEB_LIST) only projects from
this list are returned anyway, so no more checks needed.

If there is no projects list file and the caller requests strict
checking (GITWEB_STRICT_EXPORT), do not jump directly to the
given directory but instead do a normal search and filter the
results instead.

The only effect of GITWEB_STRICT_EXPORT without GITWEB_LIST is to make
sure no project can be viewed without also be found starting from
project root. git_get_projects_list without this patch does not enforce
this but all callers only call it with a filter already checked this
way. With this parameter a caller can request this check if the filter
cannot be checked this way.

Signed-off-by: Bernhard R. Link <brlink@debian.org>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-01 12:52:49 -08:00
Bernhard R. Link
4c7cd17714 gitweb: move hard coded .git suffix out of git_get_projects_list
Use of the filter option of git_get_projects_list is currently
limited to forks. It hard codes removal of ".git" suffixes from
the filter.

To make it more generic move the .git suffix removal to the callers.

Signed-off-by: Bernhard R. Link <brlink@debian.org>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-01 12:52:33 -08:00
Junio C Hamano
77cdf0f802 Merge branch 'jn/gitweb-unspecified-action'
* jn/gitweb-unspecified-action:
  gitweb: Fix actionless dispatch for non-existent objects
2012-01-29 13:18:50 -08:00
Junio C Hamano
b63103e908 Merge branch 'jn/maint-gitweb-grep-fix'
* jn/maint-gitweb-grep-fix:
  gitweb: Harden "grep" search against filenames with ':'
  gitweb: Fix file links in "grep" search
2012-01-16 16:45:56 -08:00
Junio C Hamano
242ff87975 Merge branch 'mm/maint-gitweb-project-maxdepth'
* mm/maint-gitweb-project-maxdepth:
  gitweb: accept trailing "/" in $project_list
2012-01-09 15:58:30 -08:00
Jakub Narebski
18ab83e856 gitweb: Fix actionless dispatch for non-existent objects
When gitweb URL does not provide action explicitly, e.g.

  http://git.example.org/repo.git/branch

dispatch() tries to guess action (view to be used) based on remaining
parameters.  Among others it is based on the type of requested object,
which gave problems when asking for non-existent branch or file (for
example misspelt name).

Now undefined $action from dispatch() should not result in problems.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-09 13:30:59 -08:00
Jakub Narebski
8e09fd1a1e gitweb: Harden "grep" search against filenames with ':'
Run "git grep" in "grep" search with '-z' option, to be able to parse
response also for files with filename containing ':' character.  The
':' character is otherwise (without '-z') used to separate filename
from line number and from matched line.

Note that this does not protect files with filename containing
embedded newline.  This would be hard but doable for text files, and
harder or even currently impossible with binary files: git does not
quote filename in

  "Binary file <foo> matches"

message, but new `--break` and/or `--header` options to git-grep could
help here.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-05 13:29:51 -08:00
Jakub Narebski
ff7f2185d6 gitweb: Fix file links in "grep" search
There were two bugs in generating file links (links to "blob" view),
one hidden by the other.  The correct way of generating file link is

	href(action=>"blob", hash_base=>$co{'id'},
	     file_name=>$file);

It was $co{'hash'} (this key does not exist, and therefore this is
undef), and 'hash' instead of 'hash_base'.

To have this fix applied in single place, this commit also reduces
code duplication by saving file link (which is used for line links) in
$file_href.

Reported-by: Thomas Perl <th.perl@gmail.com>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-05 13:29:50 -08:00
Matthieu Moy
ac593b76dd gitweb: accept trailing "/" in $project_list
The current code is removing the trailing "/", but computing the string
length on the previous value, i.e. with the trailing "/". Later in the
code, we do

  my $path = substr($File::Find::name, $pfxlen + 1);

And the "$pfxlen + 1" is supposed to mean "the length of the prefix, plus
1 for the / separating the prefix and the path", but with an incorrect
$pfxlen, this basically eats the first character of the path, and yields
"404 - No projects found".

While we're there, also fix $pfxdepth to use $dir, although a change of 1
in the depth shouldn't really matter.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-04 09:54:06 -08:00
Junio C Hamano
2b380d8191 Merge branch 'jn/maint-gitweb-utf8-fix'
* jn/maint-gitweb-utf8-fix:
  gitweb: Fix fallback mode of to_utf8 subroutine
  gitweb: Output valid utf8 in git_blame_common('data')
  gitweb: esc_html() site name for title in OPML
  gitweb: Call to_utf8() on input string in chop_and_escape_str()
2011-12-22 15:30:12 -08:00
Jakub Narebski
b13e3eacef gitweb: Fix fallback mode of to_utf8 subroutine
e5d3de5 (gitweb: use Perl built-in utf8 function for UTF-8 decoding.,
2007-12-04) was meant to make gitweb faster by using Perl's internals
(see subsection "Messing with Perl's Internals" in Encode(3pm) manpage)

Simple benchmark confirms that (old = 00f429a, new = this version):

        old  new
  old    -- -65%
  new  189%   --

Unfortunately it made fallback mode of to_utf8 do not work...  except
for default value 'latin1' of $fallback_encoding ('latin1' is Perl
native encoding), which is why it was not noticed for such long time.

utf8::valid(STRING) is an internal function that tests whether STRING
is in a _consistent state_ regarding UTF-8.  It returns true is
well-formed UTF-8 and has the UTF-8 flag on _*or*_ if string is held
as bytes (both these states are 'consistent').  For gitweb the second
option was true, as output from git commands is opened without ':utf8'
layer.

What made it work at all for STRING in 'latin1' encoding is the fact
that utf8:decode(STRING) turns on UTF-8 flag only if source string is
valid UTF-8 and contains multi-byte UTF-8 characters... and that if
string doesn't have UTF-8 flag set it is treated as in native Perl
encoding, i.e.  'latin1' / 'iso-8859-1' (unless native encoding it is
EBCDIC ;-)).  It was ':utf8' layer that actually converted 'latin1'
(no UTF-8 flag == native == 'latin1) to 'utf8'.

Let's make use of the fact that utf8:decode(STRING) returns false if
STRING is invalid as UTF-8 to check whether to enable fallback mode.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-19 12:25:43 -08:00
Jürgen Kreileder
57cf4ad6e8 gitweb: Output valid utf8 in git_blame_common('data')
Otherwise when javascript-actions are enabled gitweb shown broken
author names in the tooltips on blame pages ('blame_incremental'
view).

Signed-off-by: Jürgen Kreileder <jk@blackdown.de>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-18 19:44:52 -08:00
Jürgen Kreileder
5d7910569b gitweb: esc_html() site name for title in OPML
This escapes the site name in OPML (XML uses the same escaping rules
as HTML).  Also fixes encoding issues because esc_html() uses
to_utf8().

Signed-off-by: Jürgen Kreileder <jk@blackdown.de>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-18 19:44:51 -08:00
Jürgen Kreileder
168c1e0120 gitweb: Call to_utf8() on input string in chop_and_escape_str()
a) To fix the comparison with the chopped string,
   otherwise we compare bytes with characters, as
   chop_str() must run to_utf8() for correct operation
b) To give the title attribute correct encoding;
   we need to mark strings as UTF-8 before outpur

Signed-off-by: Jürgen Kreileder <jk@blackdown.de>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-18 19:44:51 -08:00
Kato Kazuyoshi
6ae683c0f4 gitweb: Add navigation to select side-by-side diff
Add to the lower part of navigation bar (the action specific part)
links allowing to switch between 'inline' (ordinary) diff and
'side by side' style diff.

It is not shown for combined / compact combined diff.

Signed-off-by: Kato Kazuyoshi <kato.kazuyoshi@gmail.com>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-31 15:22:58 -07:00
Jakub Narebski
d0e6e29ee6 gitweb: Use href(-replay=>1,...) for formats links in "commitdiff"
Use href(-replay->1,...) in (sub)navigation links (like changing style
of view, or going to parent commit) so that extra options are
preserved.

This is needed so clicking on such (sub)navigation link would preserve
style of diff; for example when using "side-by-side" diff style then
going to parent commit would now also use this style.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-31 15:22:58 -07:00
Jakub Narebski
970fac5e24 gitweb: Give side-by-side diff extra CSS styling
Use separate background colors for pure removal, pure addition and
change for side-by-side diff.  This makes reading such diff easier,
allowing to easily distinguish empty lines in diff from vertical
whitespace used to align chunk blocks.

Note that if lines in diff were numbered, the absence of line numbers
[for one side] would help in distinguishing empty lines from vertical
align.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-31 15:22:57 -07:00
Kato Kazuyoshi
6ba1eb51b9 gitweb: Add a feature to show side-by-side diff
This commits adds to support for showing "side-by-side" style diff.
Currently you have to hand-craft the URL; navigation for selecting
diff style is to be added in the next commit.

The diff output in unified format from "git diff-tree" is reorganized to
side-by-side style chunk by chunk with format_sidebyside_diff_chunk().
This reorganization requires knowledge about diff line classification,
so format_diff_line() was renamed to process_diff_line(), and changed to
return tuple (list) consisting of class of diff line and of
HTML-formatted (but not wrapped in <div class="diff ...">...</div>) diff
line.  Wrapping is now done by caller, i.e. git_patchset_body().

Gitweb uses float+margin CSS-based layout for "side by side" diff.

You can specify style of diff with "ds" ('diff_style') query
parameter.  Currently supported values are 'inline' and 'sidebyside';
the default is 'inline'.

Another solution would be to use "opt" ('extra_options') for that...
though current use of it in gitweb seems to suggest that "opt" is more
about passing extra options to underlying git commands, and "git diff"
doesn't support '--side-by-side' like GNU diff does, (yet?).

Signed-off-by: Kato Kazuyoshi <kato.kazuyoshi@gmail.com>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-31 15:22:56 -07:00
Jakub Narebski
f1310cf5e7 gitweb: Extract formatting of diff chunk header
Refactor main parts of HTML-formatting for diff chunk headers
(formatting means here adding links and syntax hightlighting) into
separate subroutines:

 * format_unidiff_chunk_header for ordinary diff,
 * format_cc_diff_chunk_header for combined diff
   (more than one parent)

This makes format_diff_line() subroutine easier to follow.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-31 15:22:55 -07:00
Jakub Narebski
20a864cd83 gitweb: Refactor diff body line classification
Simplify classification of diff line body in format_diff_line(),
replacing two long if-elsif chains (one for ordinary diff and one for
combined diff of a merge commit) with a single regexp match.  Refactor
this code into diff_line_class() function.

While at it:

* Fix an artifact in that $diff_class included leading space to be
  able to compose classes like this "class=\"diff$diff_class\"', even
  when $diff_class was an empty string.  This made code unnecessary
  ugly: $diff_class is now just class name or an empty string.

* Introduce "ctx" class for context lines ($diff_class was set to ""
  in this case before this commit).

Idea and initial code by Junio C Hamano, polish and testing by Jakub
Narebski.  Inspired by patch adding side-by-side diff by Kato Kazuyoshi,
which required $diff_class to be name of class without extra space.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-31 15:22:55 -07:00
Junio C Hamano
208a1cc3d3 Merge branch 'lh/gitweb-site-html-head'
* lh/gitweb-site-html-head:
  gitweb: provide a way to customize html headers
2011-10-26 16:16:31 -07:00
Junio C Hamano
aface4c390 Merge branch 'jm/maint-gitweb-filter-forks-fix'
* jm/maint-gitweb-filter-forks-fix:
  gitweb: fix regression when filtering out forks
2011-10-26 16:16:30 -07:00
Julien Muchembled
53c632faab gitweb: fix regression when filtering out forks
This fixes a condition in filter_forks_from_projects_list that failed if
process directory was different from project root: in such case, the subroutine
was a no-op and forks were not detected.

Signed-off-by: Julien Muchembled <jm@jmuchemb.eu>
Tested-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-21 14:46:38 -07:00
Lénaïc Huard
c1355b7ffb gitweb: provide a way to customize html headers
This allows web sites to add some specific html headers to the pages
generated by gitweb.

The new variable $site_html_head_string can be set to an html snippet that
will be inserted at the end of the <head> section of each page generated
by gitweb.

Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr.eu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-21 10:18:37 -07:00
Jakub Narebski
0866786b80 gitweb: Strip non-printable characters from syntax highlighter output
The current code, as is, passes control characters, such as form-feed
(^L) to highlight which then passes it through to the browser.  User
agents (web browsers) that support 'application/xhtml+xml' usually
require that web pages declared as XHTML and with this mimetype are
well-formed XML.  Unescaped control characters cannot appear within a
contents of a valid XML document.

This will cause the browser to display one of the following warnings:

* Safari v5.1 (6534.50) & Google Chrome v13.0.782.112:

   This page contains the following errors:

   error on line 657 at column 38: PCDATA invalid Char value 12
   Below is a rendering of the page up to the first error.

* Mozilla Firefox 3.6.19 & Mozilla Firefox 5.0:

   XML Parsing Error: not well-formed
   Location:
   http://path/to/git/repo/blah/blah

Both errors were generated by gitweb.perl v1.7.3.4 w/ highlight 2.7
using arch/ia64/kernel/unwind.c from the Linux kernel.

When syntax highlighter is not used, control characters are replaced
by esc_html(), but with syntax highlighter they were passed through to
browser (to_utf8() doesn't remove control characters).

Introduce sanitize() subroutine which strips forbidden characters, but
does not perform HTML escaping, and use it in git_blob() to sanitize
syntax highlighter output for XHTML.

Note that excluding "\t" (U+0009), "\n" (U+000A) and "\r" (U+000D) is
not strictly necessary, atleast for currently the only callsite: "\t"
tabs are replaced by spaces by untabify(), "\n" is stripped from each
line before processing it, and replacing "\r" could be considered
improvement.

Originally-by: Christopher M. Fuhrman <cfuhrman@panix.com>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-16 09:22:47 -07:00
Junio C Hamano
5329c99795 Merge branch 'jn/mime-type-with-params' into maint
* jn/mime-type-with-params:
  gitweb: Serve */*+xml 'blob_plain' as text/plain with $prevent_xss
  gitweb: Serve text/* 'blob_plain' as text/plain with $prevent_xss
2011-08-16 11:41:26 -07:00
Junio C Hamano
2728139a62 Merge branch 'jn/gitweb-config-list-case'
* jn/gitweb-config-list-case:
  gitweb: Git config keys are case insensitive, make config search too
2011-08-08 12:33:35 -07:00
Junio C Hamano
86bd7f9989 Merge branch 'jn/gitweb-system-config'
* jn/gitweb-system-config:
  gitweb: Introduce common system-wide settings for convenience
2011-08-08 12:33:34 -07:00
张忠山
927cd1fc94 gitweb: pass string after encoding in utf-8 to syntax highlighter
Otherwise the highlight filter would work on a corrupt byte sequence.

Signed-off-by: 张忠山 <zzs213@126.com>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 12:13:38 -07:00
Jakub Narebski
14569cd810 gitweb: Git config keys are case insensitive, make config search too
"git config -z -l" that gitweb uses in git_parse_project_config() to
populate %config hash returns section and key names of config
variables in lowercase (they are case insensitive).  When checking
%config in git_get_project_config() we have to take it into account.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-31 18:39:19 -07:00
Jakub Narebski
131d6afcba gitweb: Introduce common system-wide settings for convenience
Because of backward compatibility we cannot change gitweb to always
use /etc/gitweb.conf (i.e. even if gitweb_config.perl exists).  For
common system-wide settings we therefore need separate configuration
file: /etc/gitweb-common.conf.

Long description:

gitweb currently obtains configuration from the following sources:

  1. per-instance configuration file (default: gitweb_conf.perl)
  2. system-wide configuration file (default: /etc/gitweb.conf)

If per-instance configuration file exists, then system-wide
configuration is _not used at all_.  This is quite untypical and
suprising behavior.

Moreover it is different from way git itself treats /etc/git.conf.  It
reads in stuff from /etc/git.conf and then local repos can change or
override things as needed.  In fact this is quite beneficial, because
it gives site admins a simple and easy way to give an automatic hint
to a repo about things the admin would like.

On the other hand changing current behavior may lead to the situation,
where something in /etc/gitweb.conf may interfere with unintended
interaction in the local repository.  One solution would be to
_require_ to do explicit include; with read_config_file() it is now
easy, as described in gitweb/README (description introduced in this
commit).

But as J.H. noticed we cannot ask people to modify their per-instance
gitweb config file to include system-wide settings, nor we can require
them to do this.

Therefore, as proposed by Junio, for gitweb to have centralized config
elements while retaining backwards compatibility, introduce separate
common system-wide configuration file, by default /etc/gitweb-common.conf

Noticed-by: Drew Northup <drew.northup@maine.edu>
Helped-by: John 'Warthog9' Hawley <warthog9@kernel.org>
Inspired-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-24 16:22:21 -07:00
Junio C Hamano
ba9a247bf6 Merge branch 'jn/gitweb-search'
* jn/gitweb-search:
  gitweb: Make git_search_* subroutines render whole pages
  gitweb: Clean up code in git_search_* subroutines
  gitweb: Split body of git_search into subroutines
  gitweb: Check permissions first in git_search
2011-07-22 14:25:19 -07:00
Junio C Hamano
54dbc1f9e6 Merge branch 'jn/mime-type-with-params'
* jn/mime-type-with-params:
  gitweb: Serve */*+xml 'blob_plain' as text/plain with $prevent_xss
  gitweb: Serve text/* 'blob_plain' as text/plain with $prevent_xss
2011-07-19 09:45:41 -07:00
Junio C Hamano
17a403c8ce Merge branch 'jn/gitweb-split-header-html'
* jn/gitweb-split-header-html:
  gitweb: Refactor git_header_html
2011-07-19 09:45:28 -07:00
Junio C Hamano
d4c8c55fab Merge branch 'ln/gitweb-mime-types-split-at-blank'
* ln/gitweb-mime-types-split-at-blank:
  gitweb: allow space as delimiter in mime.types
2011-07-13 14:31:36 -07:00
Jakub Narebski
e8c3531717 gitweb: Serve */*+xml 'blob_plain' as text/plain with $prevent_xss
Enhance usability of 'blob_plain' view protection against XSS attacks
(enabled by setting $prevent_xss to true) by serving contents inline
as safe 'text/plain' mimetype where possible, instead of serving with
"Content-Disposition: attachment" to make sure they don't run in
gitweb's security domain.

This patch broadens downgrading to 'text/plain' further, to any
*/*+xml mimetype.  This includes:

  application/xhtml+xml    (*.xhtml, *.xht)
  application/atom+xml     (*.atom)
  application/rss+xml      (*.rss)
  application/mathml+xm    (*.mathml)
  application/docbook+xml  (*.docbook)
  image/svg+xml            (*.svg, *.svgz)

Probably most useful is serving XHTML files as text/plain in
'blob_plain' view, directly viewable.

Because file with 'image/svg+xml' mimetype can be compressed SVGZ
file, we have to check if */*+xml really is text file, via '-T $fd'.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-30 11:26:48 -07:00
Jakub Narebski
86afbd02c8 gitweb: Serve text/* 'blob_plain' as text/plain with $prevent_xss
One of mechanism enabled by setting $prevent_xss to true is 'blob_plain'
view protection.  With XSS prevention on, blobs of all types except a
few known safe ones are served with "Content-Disposition: attachment" to
make sure they don't run in our security domain.

Instead of serving text/* type files, except text/plain (and including
text/html), as attachements, downgrade it to text/plain.  This way HTML
pages in 'blob_plain' (raw) view would be displayed in browser, but
safely as a source, and not asked to be saved.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-30 11:26:39 -07:00
Jakub Narebski
6ee9033d67 gitweb: Refactor git_header_html
Extract the following parts into separate subroutines:

 * finding correct MIME content type for HTML pages (text/html or
   application/xhtml+xml?) into get_content_type_html()
 * printing <link ...> elements in HTML head into print_header_links()
 * printing navigation "breadcrumbs" for given action into
   print_nav_breadcrumbs()
 * printing search form into print_search_form()

This reduces git_header_html to two pages long (53 lines), making gitweb
code easier to read.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22 14:04:32 -07:00