Ideally, we'd process them in numeric order since that is more
logical, but we can't do that yet since this is where we find the
numeric identifiers in the first place. Lexicographic order is a good
compromise.
Signed-off-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we specify a list of namespaces to fetch from, by default the MW
API will not fetch from the default namespace, refered to as "(Main)"
in the documentation:
https://www.mediawiki.org/wiki/Manual:Namespace#Built-in_namespaces
I haven't found a way to address that "(Main)" namespace when getting
the namespace ids: indeed, when listing namespaces, there is no
"canonical" field for the main namespace, although there is a "*"
field that is set to "" (empty). So in theory, we could specify the
empty namespace to get the main namespace, but that would make
specifying namespaces harder for the user: we would need to teach
users about the "empty" default namespace. It would also make the code
more complicated: we'd need to parse quotes in the configuration.
So we simply override the query here and allow the user to specify
"(Main)" since that is the publicly documented name.
Signed-off-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Virtual namespaces do not correspond to pages in the database and are
automatically generated by MediaWiki. It makes little sense,
therefore, to fetch pages from those namespaces and the MW API doesn't
support listing those pages.
According to the documentation, those virtual namespaces are currently
"Special" (-1) and "Media" (-2) but we treat all negative namespaces
as "virtual" as a future-proofing mechanism.
Signed-off-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we fail to find a requested namespace, we should tell the user
which ones we know about, since those were already fetched. This
allows users to fetch all namespaces by specifying a dummy namespace,
failing, then copying the list of namespaces in the config.
Eventually, we should have a flag that allows fetching all namespaces
automatically.
Reviewed-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
we still want to use spaces as separators in the config, but we should
allow the user to specify namespaces with spaces, so we use underscore
for this.
Reviewed-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This introduces a new remote.origin.namespaces argument that is a
space-separated list of namespaces. The list of pages extract is then
taken from all the specified namespaces.
Reviewed-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Antoine Beaupré <anarcat@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a media file contains valid UTF-8, git-remote-mediawiki tried to be
too clever about the encoding, and the call to utf8::downgrade() on the
downloaded content was failing with
Wide character in subroutine entry at git-remote-mediawiki line 583.
Instead, use $response->decode() to apply decoding linked to the
Content-Encoding: header, and return the content without attempting any
charset decoding.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mediawiki introduces a new API for queries w/ more than 500 results in
version 1.21. That change triggered an infinite loop while cloning a
mediawiki with such a page.
The latest API renamed and moved the "continuing" information in the
response, necessary to build the next query. The code failed to retrieve
that information but still detected that it was in a "continuing
query". As a result, it launched the same query over and over again.
If a "continuing" information is detected in the response (old or new),
the next query is updated accordingly. If not, we quit assuming it's not
a continuing query.
Reported-by: Benjamin Cathey
Signed-off-by: Benoit Person <benoit.person@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
We used to update the private ref ourselves, but this update is now
done by default since 664059fb (transport-helper: update remote
helper namespace, 2013-04-17).
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a command to allow previewing the contents locally before
pushing it out, when working with a MediaWiki remote.
I personally do not think this belongs to Git. If you are working
on a set of AsciiDoc source files, you sure do want to locally
format to preview what you will be pushing out, and if you are
working on a set of C or Java source files, you do want to test it
before pushing it out, too. That kind of thing belongs to your
build script, not to your SCM.
But I'll let it pass, as this is only a contrib/ thing.
* bp/mediawiki-preview:
git-remote-mediawiki: add preview subcommand into git mw
git-remote-mediawiki: add git-mw command
git-remote-mediawiki: factoring code between git-remote-mediawiki and Git::Mediawiki
git-remote-mediawiki: update tests to run with the new bin-wrapper
git-remote-mediawiki: add a git bin-wrapper for developement
wrap-for-bin: make bin-wrappers chainable
git-remote-mediawiki: introduction of Git::Mediawiki.pm
For now, Git::Mediawiki contains nothing.
This first patch moves some of git-remote-mediawiki.perl's factorisable code
into Git::Mediawiki. In the same time, it removes the side effects of that code
and renames the fucntions and constants moved to expose a better API.
Signed-off-by: Benoit Person <benoit.person@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit e83d36b66f turned "print STDOUT" into "print {*STDOUT}", as
suggested by perlcritic. Unfortunately, it also changed two "binmode
STDOUT" calls the same way, which does not work and yield a "Not a GLOB
reference" error.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* cm/remote-mediawiki-perlcritique: (31 commits)
git-remote-mediawiki: make error message more precise
git-remote-mediawiki: add a perlcritic rule in Makefile
git-remote-mediawiki: add a .perlcriticrc file
git-remote-mediawiki: clearly rewrite double dereference
git-remote-mediawiki: fix a typo ("mediwiki" instead of "mediawiki")
git-remote-mediawiki: put non-trivial numeric values in constants.
git-remote-mediawiki: don't use quotes for empty strings
git-remote-mediawiki: replace "unless" statements with negated "if" statements
git-remote-mediawiki: brace file handles for print for more clarity
git-remote-mediawiki: modify strings for a better coding-style
git-remote-mediawiki: put long code into a subroutine
git-remote-mediawiki: remove import of unused open2
git-remote-mediawiki: check return value of open
git-remote-mediawiki: assign a variable as undef and make proper indentation
git-remote-mediawiki: rename a variable ($last) which has the name of a keyword
git-remote-mediawiki: remove unused variable $entry
git-remote-mediawiki: turn double-negated expressions into simple expressions
git-remote-mediawiki: change the name of a variable
git-remote-mediawiki: add newline in the end of die() error messages
git-remote-mediawiki: change style in a regexp
...
In subroutine parse_command, error messages were not correct. For the "import"
function, having too much or incorrect arguments displayed both
"invalid arguments", while it displayed "too many arguments" for the "option"
functions under the same conditions.
Separate the two error messages in both cases.
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
@$var structures are re-written in the following way: @{$var}
It makes them more readable.
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Non-trivial numeric values (e.g., different from 0, 1 and 2) are placed in
constants at the top of the code to be easily modifiable and to make more sense
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Empty strings are replaced by an $EMPTY constant.
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This follows the following rule:
InputOutput::RequireBracedFileHandleWithPrint (Severity: 1)
The `print' and `printf' functions have a unique syntax that supports an
optional file handle argument. Conway suggests wrapping this argument in
braces to make it visually stand out from the other arguments. When you
put braces around any of the special package-level file handles like
`STDOUT', `STDERR', and `DATA', you must the `'*'' sigil or else it
won't compile under `use strict 'subs''.
print $FH "Mary had a little lamb\n"; #not ok
print {$FH} "Mary had a little lamb\n"; #ok
print STDERR $foo, $bar, $baz; #not ok
print {STDERR} $foo, $bar, $baz; #won't compile under 'strict'
print {*STDERR} $foo, $bar, $baz; #perfect!
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- strings which don't need interpolation are single-quoted for more clarity and
slight gain of performance
- interpolation is preferred over concatenation in many cases, for more clarity
- variables are always used with the ${} operator inside strings
- strings including double-quotes are written with qq() so that the quotes do
not have to be escaped
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Explicitly assign local variable $/ as undef and make a proper
one-instruction-by-line indentation
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Local variable $url has the same name as a global variable. Changing the name
of the local variable prevents future possible misunderstanding.
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this regexp, ' |\n' is used, whereas its equivalent '[ \n]', which is
clearer, is used elsewhere. Make the style coherent.
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use {}{} instead of /// when slashes are used inside the regexp so as not to
escape it.
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A "split ' '" is turned into a "split / /", which changes its behaviour: the
old method matched a run of whitespaces (/\s*/), while the new one will match a
single space, which is what we want here. Indeed, in other contexts,
changing split(' ') to split(/ /) could potentially be a regression, however,
here, when parsing the output of "rev-list --parents", whose output SHA-1's are
each separated by a single space, splitting on a single space is perfectly
correct.
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
m// and // is used randomly. It is better to use the m modifier only when
needed, e.g., when the regexp uses another separator than //.
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Subroutines' parameters should be assigned to variable before doing anything
else
Besides, existing instruction affected a variable inside a "if", which break
Git's coding style
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Put first parameter of map inside a block, for better readability.
Follow BuiltinFunctions::RequireBlockMap
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
%basetimestamps declaration was lost in the middle of subroutines
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Perl's split function takes a regex pattern argument. You can also
feed it an expression, which is then compiled into a regex at runtime.
It therefore works to pass your pattern via single quotes, but it is
much less obvious to a reader that the argument is meant to be a
regex, not a static string. Using the traditional slash-delimiters
makes this easier to read.
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The bridge to MediaWiki has been updated to use the credential
helper interface in Git.pm, losing its own and the original
implementation the former was based on.
* bp/mediawiki-credential:
git-remote-mediawiki: use Git.pm functions for credentials
Users may be confused when they run the perl script directly.
A good way to detect this is to check the number of parameters used to call the
script, which is never different from 2 in a normal use.
Display a proper error message to avoid any confusion.
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>