See also: ParserMarket, look for the MediaWiki parser (ParserMarket/Media4Moin)

MediaWiki to MoinMoin Wiki converter

It is a PHP page that queries the current versions of MediaWiki pages from the database and stores them in a directory structure compatible with the MoinMoin wiki. Scroll to the end for a very alpha Python version.

To use it, download one of the attachments here, enter your MediaWiki database login info into it, and then just run it.
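For orientation, here is a rough Python sketch of the kind of directory structure meant above. It assumes the usual MoinMoin 1.x page layout as I understand it (a "current" file naming the latest revision, plus a "revisions" directory with one file per revision); the function name write_moin_page and the sample page are made up for illustration, so compare the result with your own data/pages directory before relying on it.

```python
import os
import tempfile


def write_moin_page(pages_dir, quoted_name, text, revision=1):
    """Write one page revision in the assumed MoinMoin 1.x layout (hypothetical helper)."""
    rev = '%08d' % revision                      # revisions are 8-digit, zero-padded
    page_dir = os.path.join(pages_dir, quoted_name)
    rev_dir = os.path.join(page_dir, 'revisions')
    os.makedirs(rev_dir, exist_ok=True)
    # the page text itself
    with open(os.path.join(rev_dir, rev), 'w', encoding='utf-8') as f:
        f.write(text)
    # pointer to the latest revision
    with open(os.path.join(page_dir, 'current'), 'w', encoding='ascii') as f:
        f.write(rev + '\n')
    return page_dir


# Usage: write a sample page into a throwaway directory.
pages_dir = os.path.join(tempfile.mkdtemp(), 'mediawiki_pages')
page_dir = write_moin_page(pages_dir, 'Main_Page', '== Hello ==\n')
print(sorted(os.listdir(page_dir)))
```

The page directory name must already be quoted for the file system; the quoting is described further down this page.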

ToDo

ChangeLog

Branched 0.3

I had some problems with the 0.2 script but built on it for a better version. The script creates a folder called mediawiki_pages and inserts the folders in there. My code is shoddy but should work better than 0.2.

The Code

mediawiki2moin-v0.3.php.txt

Added Features

0.2

0.1

Code v0.1

mediawiki2moin.php.gz

Quote directory name

import re

# Runs of bytes that are not safe in a page directory name
UNSAFE = re.compile(rb'[^a-zA-Z0-9_]+')

def quote_wikiname(wikiname, charset='utf-8'):
    wikiname = wikiname.replace(' ', '_')  # " " -> "_"
    filename = wikiname.encode(charset)

    quoted = []
    location = 0
    for needle in UNSAFE.finditer(filename):
        # append leading safe stuff
        quoted.append(filename[location:needle.start()].decode('ascii'))
        location = needle.end()
        # quote and append unsafe stuff as hex bytes in parentheses
        quoted.append('(')
        for byte in needle.group():
            quoted.append('%02x' % byte)
        quoted.append(')')

    # append rest of string
    quoted.append(filename[location:].decode('ascii'))
    return ''.join(quoted)

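$pagename = "öüä";
// utf8_encode() assumes ISO-8859-1 input; skip it if the title is already UTF-8
$pagename = utf8_encode(str_replace(" ", "_", $pagename));
$quoted = array();
$in_parenthesis = false;

for ($i = 0; $i < strlen($pagename); $i++) {
    $curchar = substr($pagename, $i, 1);
    if (preg_match('/[^a-zA-Z0-9_]/', $curchar)) {
        // unsafe byte: open a group if needed, then append its hex code
        if (!$in_parenthesis) {
            $quoted[] = '(';
            $in_parenthesis = true;
        }
        $quoted[] = str_pad(dechex(ord($curchar)), 2, '0', STR_PAD_LEFT);
    } else {
        // safe character: close any open group first
        if ($in_parenthesis) {
            $quoted[] = ')';
            $in_parenthesis = false;
        }
        $quoted[] = $curchar;
    }
}
// close a group that is still open at the end of the name
if ($in_parenthesis) {
    $quoted[] = ')';
}

$pagename_new = implode('', $quoted);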
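To check the directory names a conversion produced, the quoting can be reversed. The following Python sketch is not taken from any of the scripts on this page; the helper name unquote_dirname is made up, and it assumes the scheme shown above (unsafe bytes written as lowercase hex pairs inside parentheses). Underscores are left alone, because a quoted name no longer tells whether an underscore was originally a space.

```python
import re

# One quoted group such as "(c3b6)": one or more hex pairs in parentheses.
QUOTED = re.compile(r'\(((?:[0-9a-f]{2})+)\)')


def unquote_dirname(dirname, charset='utf-8'):
    """Reverse the parenthesised-hex quoting of a page directory name."""
    raw = bytearray()
    location = 0
    for group in QUOTED.finditer(dirname):
        # copy the safe stretch before the group unchanged
        raw += dirname[location:group.start()].encode('ascii')
        # turn the hex pairs back into the original bytes
        raw += bytes.fromhex(group.group(1))
        location = group.end()
    # copy the rest of the name
    raw += dirname[location:].encode('ascii')
    return raw.decode(charset)


print(unquote_dirname('(c3b6c3bcc3a4)'))     # prints: öüä
print(unquote_dirname('Talk(2f)Main_Page'))  # prints: Talk/Main_Page
```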

Possible Update

Code can be found at mw2moin.php.txt or the mirror.

After a bit of hacking I refactored the code and added the following features

The only side effect I've observed is that on large sites, if you import the revision history and convert the syntax at the same time, you'll likely hit your PHP memory limit; all the code needs some tidying up along those lines. I've also added some header comments to clarify just what exactly is going on and to explain a bit about configuration. The only other thing I could think of to add was a routine to import users, but the version of MediaWiki I had stored passwords as MD5 hashes, while MoinMoin stores them prefixed with {SHA}. I thought about a couple of options:

  1. Import users with new random passwords and force them to have the password mailed to them and then reset it
  2. Make a patch for MoinMoin to detect the password hash type based on the {KEY} prefix, possibly with a forced password upgrade option.
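Option 1 could look roughly like the Python sketch below. It is only an illustration: the function names are made up, and it assumes that the {SHA} form is the prefix followed by the base64-encoded SHA-1 digest of the UTF-8 password, which is my understanding of older MoinMoin releases and should be checked against the version you run. How the result gets written into Moin user files is left out.

```python
import base64
import hashlib
import secrets


def random_password(nbytes=12):
    """Return a random, URL-safe throwaway password."""
    return secrets.token_urlsafe(nbytes)


def sha_prefixed_hash(password):
    """Hash a password in the assumed '{SHA}' + base64(SHA-1 digest) form."""
    digest = hashlib.sha1(password.encode('utf-8')).digest()
    return '{SHA}' + base64.b64encode(digest).decode('ascii')


def reset_passwords(usernames):
    """Map each user name to (new plaintext password, hash to store)."""
    result = {}
    for name in usernames:
        password = random_password()
        result[name] = (password, sha_prefixed_hash(password))
    return result


for name, (password, stored) in reset_passwords(['ExampleUser']).items():
    # the plaintext would be mailed to the user, the hash stored in the wiki
    print(name, stored)
```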

My intent was to create a more featureful converter so as to hopefully inspire others to make more featureful parsers (if only we had table support and image thumbnails) and the like to aid large sites in migration.

Another possible update

The versions above don't work with MediaWiki versions newer than 1.4 due to the changes in the database layout (the 'cur' table is gone). Here is a version that seems to work with MediaWiki 1.11 and MoinMoin 1.5: mw_11_2_moin.php.txt
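To illustrate the layout change: in the layout that replaced the 'cur' table, the current text of a page is reached by joining the page, revision and text tables. The Python sketch below runs such a query against a toy in-memory SQLite database. The table and column names are the ones I remember from the MediaWiki schema of that era (page_latest, rev_text_id, old_text) and should be checked against your installation, which may also use a table prefix; the sample rows are invented.

```python
import sqlite3

# Latest revision text of every page (assumed MediaWiki column names).
CURRENT_PAGES_SQL = """
SELECT page_namespace, page_title, old_text
FROM page
JOIN revision ON rev_id = page_latest
JOIN text ON old_id = rev_text_id
ORDER BY page_namespace, page_title
"""


def current_pages(conn):
    """Return (namespace, title, text) for the current version of each page."""
    return conn.execute(CURRENT_PAGES_SQL).fetchall()


# Toy database with one page that has two revisions.
conn = sqlite3.connect(':memory:')
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER,
                   page_title TEXT, page_latest INTEGER);
CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER,
                       rev_text_id INTEGER);
CREATE TABLE text (old_id INTEGER PRIMARY KEY, old_text TEXT);
INSERT INTO page VALUES (1, 0, 'Main_Page', 11);
INSERT INTO revision VALUES (10, 1, 100);
INSERT INTO revision VALUES (11, 1, 101);
INSERT INTO text VALUES (100, 'first draft');
INSERT INTO text VALUES (101, 'current text');
""")

print(current_pages(conn))  # prints: [(0, 'Main_Page', 'current text')]
```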

Python converter

What we really need is a converter from current MediaWiki to current Moin.

To be of any long term usefulness, the converter should have some specific properties:

Before starting:

Here's mine

I converted the latest PHP script above to Python and removed some logic that was duplicated from Moin, replacing it with calls to Moin functions. It also supports multiple database backends in principle, although I only wrote a sqlite backend. That backend has all of 7 lines, so it should be trivial to write others. Please email them to me or publish them in a branch if you do. (Or if you make fixes/improvements. Or write docs.)

https://launchpad.net/mw2moin
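To give an idea of how small such a backend can be, here is a Python sketch. It is not the interface mw2moin actually uses; the class SqliteBackend, the rows method and the BACKENDS registry are all hypothetical names chosen for illustration. The idea is only that a backend needs to open a connection and run a query, so another database can be added by writing one more small class around its DB-API module.

```python
import sqlite3


class SqliteBackend:
    """Hypothetical backend: open a database and run queries against it."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)

    def rows(self, sql, params=()):
        # return all result rows as a list of tuples
        return self.conn.execute(sql, params).fetchall()


# Registry of available backends; other databases would be added here.
BACKENDS = {'sqlite': SqliteBackend}


def open_backend(kind, *args):
    """Instantiate the backend registered under the given name."""
    if kind not in BACKENDS:
        raise ValueError('no backend for %r' % (kind,))
    return BACKENDS[kind](*args)


backend = open_backend('sqlite', ':memory:')
print(backend.rows('SELECT 1 + 1'))  # prints: [(2,)]
```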

The calls into the Moin API are written for Moin 1.5, because that's the latest version where the MediaWiki parser works. If you want to migrate the parser to 1.8/1.9, I'd be happy to upgrade this converter too.

There's no documentation in the sources yet, not even a license file or a README. Just edit the config (ideally, copy defaultconfig.py to localconfig.py and edit that one) then invoke the script.

Nice to see some Python code. AFAIK, currently no one works on updating that parser. It depends a bit on the effort needed whether it makes sense to update it for 1.9 as we already work on moin 2.0 and there everything is rather different (btw, we want a mediawiki converter there!). -- ThomasWaldmann 2010-02-15 21:24:53

Another take

I wrote a quick and dirty MediaWiki-to-Moin converter too, this time one that supports PostgreSQL. This is pretty crucial, because MW's PostgreSQL backend actually names the tables differently than the MySQL backend. This script's behaviour can be configured on a per-namespace basis; for example, you can tell it to dump Talk: pages to "%s/Discussion" (%s gets replaced by the page title) and MyFunnyCustomNamespace: pages to "My Funny Custom Namespace/%s". It can also use a custom mapping of MediaWiki users to Moin users. It preserves history, but makes no attempt at understanding the markup and just tags pages as "#format mediawiki". I originally intended it to use the MoinMoin API, but it actually just dumps stuff into a directory in storage format; you can just copy the directories over to data/pages. It attempts to produce a global edit log, but it's not sorted. Also, I'm not much of a Python guy so this is pretty ugly. =) -- UrpoLankinen 2011-03-22 10:20:15
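The per-namespace configuration described above can be pictured with a few lines of Python. This is a sketch of the idea only, not code from that script: the dictionary NAMESPACE_TEMPLATES, the function moin_page_name and the fallback for unlisted namespaces are assumptions made for the example.

```python
# Template per MediaWiki namespace; %s is replaced by the page title.
NAMESPACE_TEMPLATES = {
    '': '%s',                                      # main namespace
    'Talk': '%s/Discussion',
    'MyFunnyCustomNamespace': 'My Funny Custom Namespace/%s',
}


def moin_page_name(namespace, title, templates=NAMESPACE_TEMPLATES):
    """Build the Moin page name for a MediaWiki namespace and title."""
    template = templates.get(namespace)
    if template is None:
        # assumed fallback: keep the namespace as a parent page
        return namespace + '/' + title
    return template % (title,)


print(moin_page_name('Talk', 'Main Page'))                 # prints: Main Page/Discussion
print(moin_page_name('MyFunnyCustomNamespace', 'Ideas'))   # prints: My Funny Custom Namespace/Ideas
```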

MoinMoin: MediaWikiConverter (last edited 2011-05-22 14:40:11 by PaulBoddie)