Talk - MoinMoin Developer Introduction
Abstract
MoinMoin is a popular and powerful wiki engine in Python.
The talk will give an introduction to the MoinMoin core source code, extension concepts and extension development. We will use the current development branch (1.6) to do this, but most stuff also applies to 1.5 (production releases).
We will briefly present the flexibility of MoinMoin and what its user base (Python, Apache, Ubuntu, ...) looks like.
We will give an architecture overview as well as explain some code and show how you can write your own.
Introduction
MoinMoin is a wiki engine implemented in Python. We expect most of you know what a wiki is (or even how to use MoinMoin), so we will not do a wiki usage introduction here, rather a MoinMoin developer introduction.
It started as a one-man project by Jürgen Hermann, but is currently maintained by the MoinCoreTeamGroup.
We also often get contributions in the form of patches, translations and documentation from our users.
MoinMoin is used by all sorts of organizations and individuals. The user base grew significantly after the introduction of internal Unicode processing with UTF-8 as the standard encoding, and again after the introduction of the MoinMoin Desktop Edition (MMDE).
Some historical data
Date | Version | SLOC | Comment |
2000-07 | 0.1 | 470 | Jürgen Hermann's fork of Martin Pool's PikiPiki 1.62 with improvements |
2001-05 | 0.9 | 6800 | |
2002-05 | 1.0 | 15300 | team development started |
2003-11 | 1.1 | 23500 | |
2004-02 | 1.2 | 14700 | moved contrib stuff to wiki, removed some python libs we had included |
2004-12 | 1.3.0 | 20100 | Introduced MoinMoin DesktopEdition |
2006-01 | 1.5.0 | 45000 | |
2006-07 | 1.5.4 | 47000 | latest released version |
2006-07 | 1.6dev | 47500 | development branch: Lupy search replaced by Xapian, bigger refactorings |
Data was generated by sloccount
Some users
There are various groups using the MoinMoin wiki. For example:
Python uses MoinMoin on wiki.python.org, more than 1800 pages
The Apache Foundation uses MoinMoin for more than 50 wikis
Ubuntu is running nearly all sites (besides Launchpad etc.) on MoinMoin, even www.ubuntu.com
How development is done / Communication
On IRC (network Freenode), we have two channels:
#moin for user support (includes 3rd party plugin stuff) and general stuff
#moin-dev for core development (which is logged in the wiki)
For communication, documentation and support we use two wikis.
http://moinmoin.wikiwikiweb.de/ for general stuff: market pages with contributed stuff, bug reports, feature requests, patches, ideas, ...
http://moinmaster.wikiwikiweb.de/ for translators and writing of documentation.
- We have a gettext parser there that checks the translation strings, which can be edited online in the wiki.
Most of the MoinMaster pages get included in the distribution archive on release.
There is a mailinglist for users.
We use Mercurial (master repository on http://hg.moinmo.in/ ) for version control. Have a look at the talk "Achieving High Performance In Mercurial" held by Bryan O'Sullivan on Tuesday morning.
We mostly use PEP8 coding style. The result is still a mix of styles, because PEP8 changed over time and different people were involved in development.
No, we don't use "hungarian notation".
Architecture Overview
MoinMoin distribution code is some (more or less modular) core code plus a set of (built-in) plugins.
Plugins
MoinMoin uses a lot of plugin types:
- action
- generates some full output page (e.g. diff, show or delete a page, editor, ...)
- formatter
- output functions for different output mimetypes (text/html, text/plain, docbook, ...)
- parser
- ParserName(arg1, arg2, ...) + some lines of text -> formatter -> part of a page
- macro
- [[MacroName(arg1, arg2, ...)]] -> formatter -> part of a paragraph or page
- xmlrpc
- XMLRPC plugins are used for easy remote procedure calls
- filter
- used by the indexing search engine to understand various mimetypes (like OpenOffice documents)
- theme
- different skins for web UI
In addition to the built-in plugins, users can add more plugins to the data/plugin/{action,macro,parser,theme,...} directories, on a per-wiki basis.
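The lookup can be pictured roughly as follows. This is a minimal sketch, not MoinMoin's actual loader: the function name load_plugin is hypothetical, and the search order (per-wiki directory first, then the built-in plugins) is an assumption made for illustration.

```python
import importlib.util
import os
import tempfile


def load_plugin(plugin_dirs, kind, name):
    """Hypothetical loader: return the first module found for kind/name."""
    for base in plugin_dirs:
        path = os.path.join(base, kind, name + ".py")
        if os.path.exists(path):
            spec = importlib.util.spec_from_file_location("%s_%s" % (kind, name), path)
            module = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(module)
            return module
    raise ImportError("no %s plugin named %s" % (kind, name))


# Demo: the same macro exists per-wiki and built-in; the per-wiki one is found first.
with tempfile.TemporaryDirectory() as root:
    wiki_dir = os.path.join(root, "data", "plugin")
    builtin_dir = os.path.join(root, "builtin")
    for base, label in ((wiki_dir, "per-wiki"), (builtin_dir, "built-in")):
        os.makedirs(os.path.join(base, "macro"))
        with open(os.path.join(base, "macro", "Hello.py"), "w") as f:
            f.write("def execute(macro, args):\n    return %r\n" % label)
    result = load_plugin([wiki_dir, builtin_dir], "macro", "Hello").execute(None, "")
    fallback = load_plugin([builtin_dir], "macro", "Hello").execute(None, "")
```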
Modular stuff
The following parts are currently not meant to be extended through the data/plugin/ mechanism, so they are not plugins, but they are at least built in a modular way:
- auth
- user authentication modules (LDAP, HTTP basic auth, session cookie, PHP session)
- script
- since 1.5.3 there is a generic script plugin mechanism, so that we only need one moin shell command
- converter
- converts HTML from GUI editor back to e.g. wiki markup
- request
- request methods used by different server types we support (CGI, WSGI, FCGI, mod_python, Twisted, ...)
- i18n
- translations of the system texts to different languages (we use gettext to make *.mo from *.po files) and some i18n related tools
- logfile
- access the log files (writing entries, parsing entries, ...)
- mail
- some mail processing (notify as well as mail import)
- security
- security policies (antispam and autoadmin currently)
- server
- some code used by standalone, Twisted and WSGI servers
- stats
- making access statistics, text or graphics
- support
- some stuff we include for convenience, like fixed versions of Python stdlib modules, support for cgi tracebacks or Xapian wrappers
Core Code
Stuff like Page, PageEditor, wikiutil, wikiacl, user, etc. (most stuff located directly under the MoinMoin module) is currently considered core code, as well as some packages distributed within the distribution archive.
Some of that stuff could be also done in a more modular way or as a plugin (we are working on that).
Architecture Details
request
This is where everything begins.
When Moin is invoked, the first thing it does is construct a request object representing the request it is currently processing (including handling of the API differences, bugs and features of the different supported servers).
For example, if you call moin via the moin.cgi CGI script, it will use the MoinMoin.request.CGI module to get all the data it needs from the standard CGI environment variables and store it in the request object's attributes. The request object also offers read() and write() functions to read (form) data from the user and write output data to the user. Usually, the form data is available via the request.form attribute after the request object has been initialized.
Other request classes include stuff for: FastCGI, Twisted, Standalone Server (based on BaseHTTPServer), WSGI, CLI
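To make the idea concrete, here is a minimal sketch of a CGI-style request object. It is not MoinMoin's request class: the name SketchCGIRequest and its attributes are hypothetical, and only the general pattern (read the CGI environment, expose the form, offer write()) follows the description above.

```python
import io
from urllib.parse import parse_qs


class SketchCGIRequest:
    """Hypothetical stand-in for a CGI request class (not MoinMoin's API)."""

    def __init__(self, environ, output=None):
        # CGI servers pass request data in standard environment variables.
        self.method = environ.get("REQUEST_METHOD", "GET")
        self.path_info = environ.get("PATH_INFO", "/")
        self.query_string = environ.get("QUERY_STRING", "")
        # Parsed form data: each key maps to a list of values.
        self.form = parse_qs(self.query_string)
        self._output = output if output is not None else io.StringIO()

    def write(self, *data):
        # Send output data to the user (here: collect it in a buffer).
        for piece in data:
            self._output.write(piece)

    def getvalue(self):
        return self._output.getvalue()


environ = {
    "REQUEST_METHOD": "GET",
    "PATH_INFO": "/WikiSandBox",
    "QUERY_STRING": "action=diff&rev1=23&rev2=42",
}
request = SketchCGIRequest(environ)
request.write("page: ", request.path_info.lstrip("/"))
```

A class for another server type would fill the same attributes from that server's own API, so the rest of the code only ever sees the request object.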
Page/PageEditor
These are the classes that currently represent pages (read-only / read-write) stored in the wiki. They also contain the filesystem storage code (this needs to be refactored into separate mimetype item and storage classes, we will come back to that later).
If you need to read or write wiki page content, you should use either these classes or XMLRPC in your code. Or, you could use PackageInstaller to create/modify pages.
If you access the filesystem directly, it is likely to stop working when we change the storage layout.
user
Usually you access the current user via request.user. What you get is either an anonymous user object (when the user has not logged in) or a logged-in wiki user.
Using this user object, you can get at this stuff:
- user.valid (True if user has logged in, False for anon users)
- user.name
- user.email
- etc. (misc. settings from UserPreferences)
- user.may - this is handy for checking whether a user is allowed a specific kind of access, e.g. you can use user.may.read(pagename) to check if this user has read access to the page <pagename>.
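The following sketch shows how code typically uses such a user object. The classes SketchUser and SketchPermissions and the dictionary-based access list are hypothetical stand-ins invented for this example; only the attribute names valid, name and may.read(pagename) come from the description above.

```python
class SketchPermissions:
    """Hypothetical stand-in for the object behind user.may."""

    def __init__(self, user, acl):
        self.user = user
        self.acl = acl  # maps pagename -> {right: set of user names}

    def _check(self, right, pagename):
        allowed = self.acl.get(pagename, {}).get(right, set())
        # "All" grants the right to everybody, including anonymous users.
        return "All" in allowed or (self.user.valid and self.user.name in allowed)

    def read(self, pagename):
        return self._check("read", pagename)

    def write(self, pagename):
        return self._check("write", pagename)


class SketchUser:
    """Hypothetical user object: anonymous if no name is given."""

    def __init__(self, name="", acl=None):
        self.name = name
        self.valid = bool(name)
        self.may = SketchPermissions(self, acl or {})


acl = {"WikiSandBox": {"read": {"All"}, "write": {"JaneDoe"}}}
anon = SketchUser(acl=acl)
jane = SketchUser("JaneDoe", acl)
```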
wikiutil
As the name suggests, this is a collection of misc. utility functions, which are used all over the place:
- quoting functions for URL and FS
- escaping functions
- query string parsing and generation
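To illustrate the kind of work these helpers do, the sketch below uses Python 3 standard library functions as stand-ins. These are not the wikiutil functions themselves; wikiutil has its own names and details.

```python
from html import escape
from urllib.parse import parse_qs, quote, urlencode

pagename = "Jürgen Hermann/Notes"

# URL quoting: non-ASCII characters and spaces are percent-encoded as UTF-8;
# "/" is left alone by default.
url_part = quote(pagename)

# Escaping: make text safe to embed in HTML output.
safe_html = escape('<a href="x">Tom & Jerry</a>')

# Query string generation and parsing.
query = urlencode({"action": "diff", "rev1": 23, "rev2": 42})
parsed = parse_qs(query)
```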
action Plugins
An action is a rather flexible operation.
For example, the diff action can be called by requesting this URL:
http://moinmoin.wikiwikiweb.de/WikiSandBox?action=diff&rev1=23&rev2=42
After the basic request object setup, moin finds out quite early that a special action has been called, imports the action handler function from the action package and then just executes that function.
After that, it is completely up to that function what will happen.
In case of the diff handler, it will:
- check user's access rights to the page (read)
- get action parameters from the form (from the URL)
- find the correct page revisions to compare
- emit the usual http headers
- call theme code to send the title area
- output the diff
- call theme code to send the footer area
Other examples for actions: show (most used :), DeletePage, sitemap (emits XML for google sitemap).
So keep in mind that actions:
- are triggered by requesting a specific URL
- have to take care of the complete output, from the http headers through <html> to </html>
- can do anything you want them to do
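The diff steps listed above can be sketched as a handler function. This is an illustration only: the entry point execute(pagename, request) is assumed here, the stub classes are hypothetical, and the real diff action does considerably more (revision lookup, theme calls for title and footer).

```python
class StubMay:
    def read(self, pagename):
        return pagename != "SecretPage"


class StubUser:
    may = StubMay()


class StubRequest:
    """Hypothetical minimal request object for this sketch."""

    def __init__(self, form):
        self.form = form
        self.user = StubUser()
        self.output = []

    def write(self, text):
        self.output.append(text)


def execute(pagename, request):
    """Sketch of an action handler: it is responsible for the complete output."""
    # 1. check the user's access rights to the page
    if not request.user.may.read(pagename):
        request.write("Status: 403 Forbidden\r\n\r\n")
        return
    # 2. get the action parameters from the form (values arrive as lists of strings)
    rev1 = int(request.form.get("rev1", ["0"])[0])
    rev2 = int(request.form.get("rev2", ["0"])[0])
    # 3. emit headers, then everything from <html> to </html>
    request.write("Content-Type: text/html; charset=utf-8\r\n\r\n")
    request.write("<html><head><title>Diff for %s</title></head><body>" % pagename)
    request.write("<p>Comparing revision %d with revision %d</p>" % (rev1, rev2))
    request.write("</body></html>")


request = StubRequest({"action": ["diff"], "rev1": ["23"], "rev2": ["42"]})
execute("WikiSandBox", request)
page = "".join(request.output)

denied = StubRequest({})
execute("SecretPage", denied)
```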
parser / formatter Plugins
Wiki pages use wiki markup for entering content. MoinMoin stores to disk exactly what you entered in the wiki text editor form.
But a browser can only render html, so someone has to do the transformation from the input format "wiki markup" to the output format "HTML".
Starting with 1.6, we try to use mimetypes to modularize and generalize this, e.g.:
- a normal moin wiki markup page has some mimetype like text/moin-wiki
- the html output we usually want is text/html
So when the show action is called (or no specific action, as the default for action is show), moin will look at the item you request (e.g. the WikiSandBox page), determine its mimetype (which will result in text/moin-wiki) and then load MoinMoin.parser.text_moin_wiki to parse the wiki markup.
If you are not requesting some specific output format, it will assume you want to have some html rendered in your browser (mimetype text/html), so it loads MoinMoin.formatter.text_html as the formatter for that.
The wiki parser will then go through the text line by line and throw a big ugly regex at it to find out what markup is used there. While parsing the input data, it will call the formatter to generate output data. The parser has handler functions for the different kinds of specific markup used, the formatter has some generic functions for the stuff usually needed (make something underlined, bold, make a paragraph, a headline, ...).
For example, if you have some text like "This is __underlined__." (renders as: "This is underlined.") on your page, this is the output of the text_html formatter (comments show what happened):
This is             # seen normal text: parser calls formatter.text
<span class="u">    # seen __ markup: parser's _u_repl calls formatter.underline and toggles self.is_u
underlined          # seen normal text: parser calls formatter.text
</span>             # seen __ markup: parser's _u_repl calls formatter.underline again and toggles self.is_u
.                   # seen normal text: parser calls formatter.text
Other parsers moin has include: text_rst, text (text/plain and fallback), text_docbook, text_python, text_csv, ... Other formatters moin has include: text_plain, text_docbook, text_python, ...
While the text_python parser is a simple highlighting parser for python source code, the text_python formatter is a rather special beast: its target format is python code. The generated code contains request.write("...") calls for the static content as well as calls to the formatter for dynamic content like macros. Once we have generated all the python code for a page, we byte-compile it and store it in the cache directory. The next time the same page gets requested, we just load and execute the bytecode - that makes rendering wiki markup as fast as it can get while still being dynamic where needed.
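The caching idea can be sketched with the standard compile() and marshal functions. Everything specific here is an assumption made for illustration: the generated source, the formatter.macro call, the PageCount macro name and the stub classes are hypothetical, and the real cache format may differ.

```python
import marshal

# Generated "page code": static parts become request.write() calls,
# dynamic parts stay as formatter calls that run on every request.
page_source = (
    'request.write("<p>Static text, rendered once.</p>")\n'
    'request.write(formatter.macro("PageCount"))\n'
)

code = compile(page_source, "<WikiSandBox>", "exec")
cached_bytes = marshal.dumps(code)  # what would be stored in the cache directory


class StubRequest:
    def __init__(self):
        self.output = []

    def write(self, text):
        self.output.append(text)


class StubFormatter:
    def __init__(self, page_count):
        self.page_count = page_count

    def macro(self, name):
        # Dynamic content: evaluated on every request.
        return "<span>%d pages</span>" % self.page_count


def render_from_cache(cached, page_count):
    """Load the cached bytecode and execute it against a fresh request."""
    request = StubRequest()
    namespace = {"request": request, "formatter": StubFormatter(page_count)}
    exec(marshal.loads(cached), namespace)
    return "".join(request.output)


first = render_from_cache(cached_bytes, 5)
second = render_from_cache(cached_bytes, 6)
```

The markup is parsed only once, yet the two renderings differ because the dynamic call is executed each time.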
So keep in mind:
- parsers process some lines (or even a whole page) of input data
- parsers call some formatter to output data
- a formatter translates calls by the parser to some specific output format
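The division of labour between parser and formatter can be sketched with a toy parser that only understands the __underline__ markup. The class names are hypothetical and the real wiki parser is far larger; the names _u_repl, is_u, formatter.text and formatter.underline follow the example above.

```python
import re
from html import escape


class SketchHTMLFormatter:
    """Hypothetical formatter producing HTML."""

    def text(self, s):
        return escape(s)

    def underline(self, on):
        return '<span class="u">' if on else "</span>"


class SketchPlainFormatter:
    """Hypothetical formatter producing plain text."""

    def text(self, s):
        return s

    def underline(self, on):
        return "_"


class SketchParser:
    """Hypothetical parser handling only the __underline__ markup."""

    markup = re.compile(r"(__)")

    def __init__(self, formatter):
        self.formatter = formatter
        self.is_u = False

    def _u_repl(self):
        # Toggle the underline state and ask the formatter for matching output.
        self.is_u = not self.is_u
        return self.formatter.underline(self.is_u)

    def format(self, line):
        out = []
        for piece in self.markup.split(line):
            if piece == "__":
                out.append(self._u_repl())
            elif piece:
                out.append(self.formatter.text(piece))
        return "".join(out)


html = SketchParser(SketchHTMLFormatter()).format("This is __underlined__.")
plain = SketchParser(SketchPlainFormatter()).format("This is __underlined__.")
```

The same parser produces both outputs; only the formatter passed in changes.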
macro Plugins
Often you do not want to hack the parsers just to get a tiny bit of additional functionality into your pages - this is why we have macros to embed into wiki pages: they get a (usually small) piece of text as parameters and process it (usually calling the formatter to output stuff).
The usual syntax is [[MacroName(arg1, arg2, ...)]].
A trivial macro is [[BR]] - it just calls the formatter to format a linebreak (the text_html formatter returns <BR>) and then returns the result.
The macro code receives two arguments. macro is the macro object; it has some attributes you usually need, like macro.request and macro.formatter.
args is the unprocessed argument string the macro got called with - empty in the case of [[BR]].
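A sketch of what such a macro can look like, assuming the plugin module exposes an execute(macro, args) function and the formatter offers a linebreak method (both are treated as assumptions here). The stub classes are hypothetical and exist only to make the sketch runnable outside of moin.

```python
class StubFormatter:
    """Hypothetical stand-in for the text_html formatter."""

    def linebreak(self, preformatted=1):
        return "<BR>"


class StubMacro:
    """Hypothetical stand-in for the macro object passed to execute()."""

    def __init__(self, request=None):
        self.request = request
        self.formatter = StubFormatter()


def execute(macro, args):
    # args is the raw argument string; BR takes none, so it is ignored here.
    return macro.formatter.linebreak(0)


output = execute(StubMacro(), None)
```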
Of course there are also more complex macros, like RecentChanges, WantedPages or MonthCalendar.
So keep in mind that macros:
- get a few parameters
- render their output within a wiki page
theme Plugins
Moin supports pluggable themes to let you customize how it looks. A theme is made from:
- theme/<themename>.py - some python code
- wiki/<themename>/css/*.css - style sheets (colours, fonts, alignment, ...)
- wiki/<themename>/img/*.png - icons
Please note that if you want to write your own theme (doing more than just using modern with different colours and icons), this will be a lot of work (initially and also for maintenance over the years) and you have to test it with different browsers (and try to work around some browsers that really suck), check whether left-to-right as well as right-to-left languages are usable with it, check if usability is still there, even on small screens and much more.
Not every theme that looks really cool is also a theme with good usability/compatibility.
xmlrpc Plugins
Moin implements the wiki XMLRPC APIs v1 (the official stuff, but slightly sucks) and v2 (a slightly different, slightly less official interpretation of the standard).
XMLRPC sounds a bit complicated when you have never seen it working, but it is not. All the complicated stuff (XML and RPC) is done for you by MoinMoin (and Python's xmlrpclib). You just have to write some code and you can call it remotely. Keep security in mind.
An example for an XMLRPC application is the automatic distribution of the BadContent page that keeps the anti-spam patterns. There is also a plugin to generate group definition pages by a script. Wikisync will use XMLRPC as well.
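To show how little code is involved, here is an in-process round trip using the Python 3 standard library (xmlrpc.client is the successor of xmlrpclib). This is not moin's XMLRPC plugin interface: the function, the method name getPageText and the page data are hypothetical, and _marshaled_dispatch is an internal helper of the standard library used here only to avoid opening a network socket.

```python
from xmlrpc.client import dumps, loads
from xmlrpc.server import SimpleXMLRPCDispatcher

PAGES = {"FrontPage": "Welcome", "WikiSandBox": "Play here"}


def get_page_text(pagename):
    """Hypothetical remote procedure: return the raw text of a page."""
    return PAGES.get(pagename, "")


dispatcher = SimpleXMLRPCDispatcher()
dispatcher.register_function(get_page_text, "getPageText")

# Client side: marshal the call into XML.
request_xml = dumps(("WikiSandBox",), methodname="getPageText")

# Server side: unmarshal, call the function, marshal the result.
response_xml = dispatcher._marshaled_dispatch(request_xml.encode("utf-8"))

# Client side: unmarshal the response.
(result,), _method = loads(response_xml)
```

Since any registered function becomes remotely callable, each one should check access rights itself.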
filter Plugins
Filters are simple plugins that help the moin search indexer to index documents. A filter simply gets a filename and has to return the file content as a unicode object (it doesn't need to be pretty, just the potential search terms should be there).
We currently have filters for:
- MS word and excel (using antiword and xls2csv, works on Linux, not sure about other platforms)
- PDF (using pdftotext, ...)
- OpenOffice / Open Document XML formats (pure python)
- JPEG (EXIF stuff)
- a strings-like application/octet-stream filter used as a fallback (it has an internal blacklist, so it does not touch compressed stuff or *.iso etc.)
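A fallback filter of that kind can be sketched in a few lines. The signature execute(indexobj, filename) is an assumption, the minimum run length of four characters is chosen arbitrarily, and the blacklist mentioned above is left out.

```python
import os
import re
import tempfile

# Runs of at least four printable ASCII characters, similar to the Unix "strings" tool.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def execute(indexobj, filename):
    """Sketch of a fallback filter: return potential search terms as text."""
    with open(filename, "rb") as f:
        data = f.read()
    words = [run.decode("ascii") for run in PRINTABLE_RUN.findall(data)]
    return " ".join(words)


# Demo with a small binary file; the short run "ab" is dropped.
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(b"\x00\x01MoinMoin\xff\xfe\x00wiki engine\x00ab\x00")
    path = tmp.name
try:
    text = execute(None, path)
finally:
    os.remove(path)
```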
auth Modules
You can use cfg.auth to configure a list of auth modules. Moin will use that list to call one auth method after the other.
An auth module gets this as input:
- user name and password (on login action)
- whether login or logout is happening (or just a normal page request)
- user_obj from the previous auth method (or None, if there is none yet)
- the request object
And returns:
- a user_obj (or None)
- a flag saying whether to continue with the remaining auth methods or to stop (usually this is "continue", except in cases where you want to say "no, never, no chance for this guy")
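The chain described above can be sketched as follows. The two auth methods, the keyword names and the dictionary used as a request are hypothetical; only the inputs and the (user_obj, continue flag) return value follow the description.

```python
class SketchUser:
    def __init__(self, name):
        self.name = name
        self.valid = True


def ban_auth(request, name=None, password=None, login=False, logout=False, user_obj=None):
    """Hypothetical auth method: rejects banned names and stops the chain."""
    if name in request["banned"]:
        return None, False
    return user_obj, True


def password_auth(request, name=None, password=None, login=False, logout=False, user_obj=None):
    """Hypothetical auth method: checks a password on login."""
    if login and name in request["passwords"] and request["passwords"][name] == password:
        return SketchUser(name), True
    return user_obj, True


def run_auth_chain(auth_methods, request, **kw):
    """Call one auth method after the other, passing the user object along."""
    user_obj = None
    for method in auth_methods:
        user_obj, continue_flag = method(request, user_obj=user_obj, **kw)
        if not continue_flag:
            break
    return user_obj


auth = [ban_auth, password_auth]  # plays the role of the configured list
request = {"passwords": {"JaneDoe": "secret", "Spammer": "x"}, "banned": {"Spammer"}}

jane = run_auth_chain(auth, request, name="JaneDoe", password="secret", login=True)
spammer = run_auth_chain(auth, request, name="Spammer", password="x", login=True)
anon = run_auth_chain(auth, request)
```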
converter Modules
Currently there is only one converter module: text_html_text_moin_wiki - we use this together with FCKeditor (the nice GUI editor we use).
This is how it works:
- FCKeditor gets fed with HTML from moin (using the text_gedit HTML formatter, which is quite similar to the text_html formatter except some special handling).
- The wiki user then edits using FCKeditor JS HTML editor as he likes and finally submits his work.
- Moin processes this HTML using the converter to get wiki markup and stores it.
Using this method we could also process other markup, but the first goal is to improve the current converter (it still has bugs and limitations).
A walk through the code
... Show some macros and action ... Write some easy macro or action? e.g. we write some color macro to make coloured text ...
Keep this short.
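A possible colour macro, as a minimal sketch: the call syntax [[Color(red, some text)]], the execute(macro, args) entry point and the validation rule are assumptions for illustration. For brevity it returns HTML directly instead of going through macro.formatter, which a real macro would normally do.

```python
from html import escape


def execute(macro, args):
    """Sketch of a Color macro: args is the raw string, e.g. "red, some text"."""
    if not args or "," not in args:
        return escape(args or "")
    color, text = [part.strip() for part in args.split(",", 1)]
    # Accept only plain colour names or #rrggbb values to avoid markup injection.
    if not color.lstrip("#").isalnum():
        return escape(text)
    return '<span style="color: %s;">%s</span>' % (color, escape(text))


red = execute(None, "red, Hello <world>")
hexcolor = execute(None, "#ff0000, hot")
rejected = execute(None, "red;x:y, hi")
empty = execute(None, None)
```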
Future
Moin2 / Google Summer of Code
We are currently mentoring three Google Summer of Code projects, which hopefully will result in some interesting code contributions. While doing SOC (and using code from it), we are working towards MoinMoin 2.0 - in small steps (and 1.6 will be some of those steps).
Here are some of the expected improvements:
Refactoring
We will get a more OO-like design of the core.
We will try to empower moin by exploiting inheritance and generalization and at the same time make it cleaner and simpler. We hope to do more with less code.
We will not have pages and attachments any more, but a hierarchy of mime-type item classes (one mimetype will be text/x-moin-wiki, which will replace what a page is now; a sub-item of mimetype image/jpeg is what a JPEG attachment is now). That way, attachments will get revisioning for free and pages will get upload/download for free.
Each item will have meta data (mimetype, language, etc. is stored there). Rendering (and other actions) will depend on the mimetype.
Search
The Xapian search engine is already integrated in the 1.6 branch. It will index all mimetype items supported by filters. We will have an improved search UI exploiting Xapian's features. It will replace the old Lupy approach. This is thanks to the work currently being done by Franz Pletz (SOC).
Storage
We will have a storage backend API with backend plugins.
One plugin will support old storage (1.3/1.5 style) - mostly for converting your existing data - and a new backend will support new MoinMoin 2 features.
Also, if somebody wishes to implement some Mercurial/SVN/SQL/etc. storage, he can do so by implementing it as a storage plugin.
Thanks to Alex Adranghi.
Wiki Synchronisation
MoinMoin 1.6 will offer a way to synchronise different wiki sites. Changes will be distributed automatically, and concurrently changed pages will be merged. There are several use cases:
- Keep your local desktop wiki (which might be used as a notebook) up-to-date with the master server (very important, esp. if somebody wants to use the wiki without an Internet connection, like on a train)
- Mirror another site and allow the visitor to choose which site he prefers (because one might be faster etc.)
- Implement load balancing by replicating the wiki contents to a few servers.
- Allow wiki-like usage where an Internet connection is not available (OLPC?)
Your contribution
If you like MoinMoin, you can help us by contributing code, bug fixes, translations and documentation - sometimes even an idea with a good plan is quite valuable.
There is a long todo list and lots of ideas for MoinMoin on the wiki. Once the MoinMoin 2 core starts to work, that list will get even longer due to the new possibilities.
To get started with MoinMoin, the easiest thing to do is to write some macro, simple parser or action. While doing that, you can dig deeper (there is lots of sample source code under the MoinMoin module directory) and have more fun.
For bigger work on core code, please stay in contact with the core team and use the MoinMoin site for collaboration. You can use Mercurial SCM to clone our repository at http://hg.moinmo.in/ - regular code contributors will also get write access there.
We need your help!