Traditionnally, pages named Foo:Bar are page 'Bar' in namespace 'Foo'.
However, it is also possible to call a page Foo:Bar if 'Foo' is not a
namespace. In this case, the actual name of the page is 'Foo:Bar', in the
main namespace. Since we can't tell with only the filename, query the
wiki for a namespace 'Foo' in these cases, but deal with the case where
no such namespace is found.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some wiki, including https://git.wiki.kernel.org/ have invalid revision
numbers (i.e. the actual revision numbers are non-contiguous). Don't die
when encountering one.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Initial phases of push and pull with git-remote-mediawiki can be long on
a large wiki. Let the user know what's going on.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When notes are created to record a push, it normally doesn't exist yet.
However, when a push is interrupted and then restarted, it may happen
that a commit already has notes attached, and we want to reflect the newly
created remote revision, hence use 'git notes add -f' to override the
existing one
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The algorithm to find a path from the local revision to the remote one
was calling "git rev-list" and parsing its output N times. Run rev-list
only once, and fill a hashtable with the result to optimize the body of
the loop.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is possible to use git-remote-mediawiki on a tree with both .mw files
and other files. Before git-remote-mediawiki learnt how to export
mediafiles, such mixed trees allowed the user to maintain both the wiki
and other files for the same project in the same repository. With the
newly added support for exporting mediafiles, pushing such mixed trees
would upload unrelated files as mediafiles, which may not be desired.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mm/mediawiki-tests:
git-remote-mediawiki: be more defensive when requests fail
git-remote-mediawiki: more efficient 'pull' in the best case
git-remote-mediawiki: extract revision-importing loop to a function
git-remote-mediawiki: refactor loop over revision ids
git-remote-mediawiki: change return type of get_mw_pages
git-remote-mediawiki (t9363): test 'File:' import and export
git-remote-mediawiki: support for uploading file in test environment
git-remote-mediawiki (t9362): test git-remote-mediawiki with UTF8 characters
git-remote-mediawiki (t9361): test git-remote-mediawiki pull and push
git-remote-mediawiki (t9360): test git-remote-mediawiki clone
git-remote-mediawiki: test environment of git-remote-mediawiki
git-remote-mediawiki: scripts to install, delete and clear a MediaWiki
"mediawiki" remote helper (in contrib/) learned to handle file
attachments.
* mm/mediawiki-file-attachments:
git-remote-mediawiki: improve support for non-English Wikis
git-remote-mediawiki: import "File:" attachments
git-remote-mediawiki: split get_mw_pages into smaller functions
git-remote-mediawiki: send "File:" attachments to a remote wiki
git-remote-mediawiki: don't "use encoding 'utf8';"
git-remote-mediawiki: don't compute the diff when getting commit message
The only way to fetch new revisions from a wiki before this patch was to
query each page for new revisions. This is good when tracking a small set
of pages on a large wiki, but very inefficient when tracking many pages
on a wiki with little activity.
Implement a new strategy that queries the wiki for its last global
revision, queries each new revision, and filter out pages that are not
tracked.
Signed-off-by: Simon Perrat <simon.perrat@ensimag.imag.fr>
Signed-off-by: Simon CATHEBRAS <Simon.Cathebras@ensimag.imag.fr>
Signed-off-by: Julien KHAYAT <Julien.Khayat@ensimag.imag.fr>
Signed-off-by: Charles ROUSSEL <Charles.Roussel@ensimag.imag.fr>
Signed-off-by: Guillaume SASDY <Guillaume.Sasdy@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without changing the behavior, we turn the foreach loop on an array of
revisions into a loop on an array of integer. It will be easier to
implement other strategies as they will only need to produce an array of
integer instead of a more complex data-structure.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous version was returning the list of pages to be fetched, but
we are going to need an efficient membership test (i.e. is the page
$title tracked), hence exposing a hash will be more convenient.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Pavel Volek <Pavel.Volek@ensimag.imag.fr>
Signed-off-by: NGUYEN Kim Thuat <Kim-Thuat.Nguyen@ensimag.imag.fr>
Signed-off-by: ROUCHER IGLESIAS Javier <roucherj@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This will be used for testing git-remote-mediawiki's import feature on a
wiki containing media files.
Signed-off-by: Simon CATHEBRAS <Simon.Cathebras@ensimag.imag.fr>
Signed-off-by: Julien KHAYAT <Julien.Khayat@ensimag.imag.fr>
Signed-off-by: Simon Perrat <simon.perrat@ensimag.imag.fr>
Signed-off-by: Charles ROUSSEL <Charles.Roussel@ensimag.imag.fr>
Signed-off-by: Guillaume SASDY <Guillaume.Sasdy@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Non-ascii encoding create many particular cases when used in page
content, name, and edit/commit message. Test these cases.
Signed-off-by: Simon CATHEBRAS <Simon.Cathebras@ensimag.imag.fr>
Signed-off-by: Julien KHAYAT <Julien.Khayat@ensimag.imag.fr>
Signed-off-by: Simon Perrat <simon.perrat@ensimag.imag.fr>
Signed-off-by: Charles ROUSSEL <Charles.Roussel@ensimag.imag.fr>
Signed-off-by: Guillaume SASDY <Guillaume.Sasdy@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch provides a set of tests for the pull and push fonctionnality
of git-remote-mediawiki. The actual tests are kept in a separate function
to allow further tests to re-run the same set of commands with different
push and pull strategies.
Signed-off-by: Simon CATHEBRAS <Simon.Cathebras@ensimag.imag.fr>
Signed-off-by: Julien KHAYAT <Julien.Khayat@ensimag.imag.fr>
Signed-off-by: Simon Perrat <simon.perrat@ensimag.imag.fr>
Signed-off-by: Charles ROUSSEL <Charles.Roussel@ensimag.imag.fr>
Signed-off-by: Guillaume SASDY <Guillaume.Sasdy@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to test git-remote-mediawiki, a set of functions is needed to
manage a MediaWiki: edit a page, remove a page, fetch a page, fetch all
pages on a given wiki.
A few helper function are also provided to check the content of
directories.
In addition, this patch provides Makefiles to execute tests.
See the README file for more details.
Signed-off-by: Simon CATHEBRAS <Simon.Cathebras@ensimag.imag.fr>
Signed-off-by: Julien KHAYAT <Julien.Khayat@ensimag.imag.fr>
Signed-off-by: Simon Perrat <simon.perrat@ensimag.imag.fr>
Signed-off-by: Charles ROUSSEL <Charles.Roussel@ensimag.imag.fr>
Signed-off-by: Guillaume SASDY <Guillaume.Sasdy@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
install_wiki.sh allows the user to install a MediaWiki instance in a
single shell command. Like "git instaweb", it configures and launches
lighttpd without requiring root priviledges. To simplify database
management, it uses SQLite, which doesn't require a running daemon, and
allows reseting the database by simply replacing a single file. This
allows install_wiki to also defines a function wiki_reset which clear all
content of the previously created wiki, which will be very useful to run
several indepenant tests on the same wiki.
Note those functionnalities are made to be used from the user command
line in the directory git/contrib/mw-to-git/t/
Signed-off-by: Simon CATHEBRAS <Simon.Cathebras@ensimag.imag.fr>
Signed-off-by: Julien KHAYAT <Julien.Khayat@ensimag.imag.fr>
Signed-off-by: Simon Perrat <simon.perrat@ensimag.imag.fr>
Signed-off-by: Charles ROUSSEL <Charles.Roussel@ensimag.imag.fr>
Signed-off-by: Guillaume SASDY <Guillaume.Sasdy@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mediafiles can live in namespaces with names different from Image
and File. While at it, rework the code to make it simpler and easier
to read.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the symmetrical feature to the "File:" export support in the previous
patch. Download files from the wiki as needed, and feed them into the
fast-import stream. Import both the file itself, and the corresponding
description page.
Signed-off-by: Pavel Volek <Pavel.Volek@ensimag.imag.fr>
Signed-off-by: NGUYEN Kim Thuat <Kim-Thuat.Nguyen@ensimag.imag.fr>
Signed-off-by: ROUCHER IGLESIAS Javier <roucherj@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current version of the git-remote-mediawiki supports only import and
export of plain wiki pages. This patch adds the functionality to export
file attachments (i.e. the content of the File: MediaWiki namespace),
which are also exposed by MediaWiki API.
This requires a recent version of MediaWiki::API (Version 0.37 works.
Version 0.34 doesn't).
Signed-off-by: Pavel Volek <Pavel.Volek@ensimag.imag.fr>
Signed-off-by: NGUYEN Kim Thuat <Kim-Thuat.Nguyen@ensimag.imag.fr>
Signed-off-by: ROUCHER IGLESIAS Javier <roucherj@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The use of this statement is generally discouraged, and is too intrusive
for us: it forces the HTTP requests made by the API to contain only valid
UTF-8 characters. This would break the upload of binary files.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we're there, simplify the code a bit: since log --format=%s anyway
shows the subject line as a single line, no need to split to take the
first line.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous version implemented the possibility to log in a wiki, but
the username and password had to be provided as configuration variables.
We add the possibility to use the Git credential system to prompt
the password.
The support if implemented with generic functions that mimic the C API,
designed to be usable from other contexts in the future (i.e. they may
migrate to Git.pm if someone is interested).
While we're there, do a bit of refactoring in mw_connect_maybe.
Based on patch by: Javier Roucher Iglesias <Javier.Roucher-Iglesias@ensimag.imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On the MediaWiki side, the author information is just the MediaWiki login
of the contributor. The import turns it into login@$wiki_name to create
the author's email address on the wiki side. But we don't want this to
include the HTTP password if it's present in the URL ...
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the wiki uses e.g. LDAP for authentication, the web interface shows
a popup to allow the user to chose an authentication domain, and we need
to use lgdomain in the API at login time.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already have a check that no new revisions are on the wiki at the
beginning of the push, but this didn't handle concurrent accesses to the
wiki.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a whitespace issue (no space before :) and remove unused %status in
mw_push.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Push can not set the commit note "mediawiki_revision:" and update the
remote reference. This avoids having to "git pull --rebase" after each
push, and is probably more natural. Make it the default, but let it be
configurable with mediawiki.dumbPush or remote.<remotename>.dumbPush.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement a gate between git and mediawiki, allowing git users to push
and pull objects from mediawiki just as one would do with a classic git
repository thanks to remote-helpers.
The following packages need to be installed (available on common
repositories):
libmediawiki-api-perl
libdatetime-format-iso8601-perl
Use remote helpers in order to be as transparent as possible to the git
user.
Download Mediawiki revisions through the Mediawiki API and then
fast-import into git.
Mediawiki revision number and git commits are linked thanks to notes
bound to commits.
The import part is done on a refs/mediawiki/<remote> branch before
coming to refs/remote/origin/master (Huge thanks to Jonathan Nieder
for his help)
We use UTF-8 everywhere: use encoding 'utf8'; does most of the job, but
we also read the output of Git commands in UTF-8 with the small helper
run_git, and write to the console (STDERR) in UTF-8. This allows a
seamless use of non-ascii characters in page titles, but hasn't been
tested on non-UTF-8 systems. In particular, UTF-8 encoding for filenames
could raise problems if different file systems handle UTF-8 filenames
differently. A uri_escape of mediawiki filenames could be imaginable, and
is still to be discussed further.
Partial cloning is supported using one of:
git clone -c remote.origin.pages='A_Page Another_Page' mediawiki::http://wikiurl
git clone -c remote.origin.categories='Some_Category' mediawiki::http://wikiurl
git clone -c remote.origin.shallow='True' mediawiki::http://wikiurl
Thanks to notes metadata, it is possible to compare remote and local last
mediawiki revision to warn non-fast forward pushes and "everything
up-to-date" case.
When allowed, push looks for each commit between remotes/origin/master
and HEAD, catches every blob related to these commit and push them in
chronological order. To do so, it uses git rev-list --children HEAD and
travels the tree from remotes/origin/master to HEAD through children. In
other words:
* Shortest path from remotes/origin/master to HEAD
* For each commit encountered, push blobs related to this commit
Signed-off-by: Jérémie Nikaes <jeremie.nikaes@ensimag.imag.fr>
Signed-off-by: Arnaud Lacurie <arnaud.lacurie@ensimag.imag.fr>
Signed-off-by: Claire Fousse <claire.fousse@ensimag.imag.fr>
Signed-off-by: David Amouyal <david.amouyal@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Sylvain Boulmé <sylvain.boulme@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>