What is WASP?

Early versions of MoinMoin used a lot of CPU power because they parsed every page on every view. WASP is part of MoinMoin.

WASP "compiles" Wikipages into Python and caches the compiled byte code. Because of this every Page must be parsed only once (per edit) but dynamic content (like wiki links, search macros, ...) are still calculated at view time. WASP uses a special Formatter that produces a mix of python code and page content insted of the normal page content.

Because a lot of other things are time consuming (especially loading all the Python modules), the benefit of WASP is not very visible when using CGI (it may save 30-50% of the visible Python time). If MoinMoin is run in a persistent environment (mod_python, FastCGI, Twisted, standalone), this overhead disappears and parsing would be the only expensive operation per view.

WASP != WASP!

Calling this patch WASP was originally just a joke by JürgenHermann (see below). This WASP is a patch for MoinMoin and has nothing to do with this software: http://www.execulink.com/~robin1/wasp/readme.html (even though it has similar functionality).

The saga begins

Discussion from MoinMoinTodo/Release 1.1

Implementation log

Now I store the code in an extra list and only insert <<<>>> marks into the page, which are replaced afterwards. Double quotes are replaced by \" and the pieces of the HTML source are surrounded by request.write(""" and """).
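The marker technique described above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual MoinMoin code; the function name compile_page and the numbered <<<n>>> marker form are assumptions.

```python
import re

def compile_page(html_with_markers, code_snippets):
    """Turn marked-up HTML plus a side list of code snippets into
    Python source that can later be exec()ed to render the page."""
    lines = []
    pos = 0
    for match in re.finditer(r'<<<(\d+)>>>', html_with_markers):
        static = html_with_markers[pos:match.start()]
        if static:
            # escape double quotes so the text survives inside """..."""
            lines.append('request.write("""%s""")' % static.replace('"', '\\"'))
        # the dynamic snippet stored in the extra list replaces the marker
        lines.append(code_snippets[int(match.group(1))])
        pos = match.end()
    tail = html_with_markers[pos:]
    if tail:
        lines.append('request.write("""%s""")' % tail.replace('"', '\\"'))
    return '\n'.join(lines)
```

Executing the resulting source writes the static HTML pieces verbatim and runs the stored code for the dynamic parts.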

The cached pages are really fast (not measured yet). The cached Python byte code is slightly bigger than the HTML source of the page content (without header and footer, neither of which is cached), which in turn is much bigger than the wiki markup (the default MoinMoin FrontPage: source/HTML/WASP 1317/3232/3400 bytes). First benchmarks are promising. On pages without expensive macros it saves 25-50% of the time measured by request.clock, depending on the page size. It seems that the other large part of the time is spent loading Python modules.

TW tried WASP with his Twisted branch. The results are great: pages without expensive macros are rendered 20 times faster (large pages even 30 times). Twisted itself accelerates small pages by a factor of 10 and has nearly no effect (<<50%) on large pages.

In other words: WASP saves ~50%. Twisted saves ~50%. Together they save ~95%!

Problems:

How to decide what is static

see Dependencies below

There should be a mechanism by which every macro and parser can decide for itself whether it is static or not.

The python formatter gets a list of things to treat as static (e.g. ["page", "namespace"]) via its constructor. For every macro and processor, the formatter checks whether it depends only on items from this list; if so, the macro/processor is treated as static, otherwise as dynamic. Normal markup is always static; the implementation of wiki links will test the static items list.
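The static/dynamic decision above amounts to a subset check. A minimal sketch, assuming a hypothetical is_static method (the class and attribute names are illustrative, not MoinMoin's API):

```python
class PythonFormatter:
    def __init__(self, static_deps):
        # facts the cache treats as fixed, e.g. ["page", "namespace"]
        self.static_deps = set(static_deps)

    def is_static(self, dependencies):
        # a macro/processor is static only if every fact it depends on
        # is among the facts treated as static
        return set(dependencies) <= self.static_deps
```

A macro depending only on "page" would then be compiled in, while one depending on "time" would stay dynamic.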

Dependencies should be declared as precisely as possible. This is not needed for WASP, but it could be used as a basis for caching on a per-macro basis. This could be made macro independent by using the macros' declared dependencies.

Not implemented:

The formatter maintains a list of the dependencies that were actually used, to allow a caching algorithm to unify several caches. This allows caching a page as page.all instead of page.en, page.de, page.fr if there was no language-dependent item on the page.
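The cache unification idea could look roughly like this; the function name and key scheme below are a hypothetical sketch based on the page.all / page.en naming mentioned above:

```python
def cache_key(page_name, used_dependencies, language):
    """Pick a cache key based on the dependencies actually used
    while rendering (illustrative, not MoinMoin code)."""
    if "language" in used_dependencies:
        # page content differed per language, keep one cache per language
        return "%s.%s" % (page_name, language)   # e.g. page.en, page.de
    # nothing language-dependent was used: one cache serves all languages
    return "%s.all" % page_name
```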

Ideas

Dynamic Parser Callback

WASP and the python formatter could be interesting for other parsers, too. The formatter could offer a method that gets a callback method of the parser as a parameter. The default formatter would just call this parser method, but the python formatter would only output code for calling it. With this feature, every parser could be used in combination with the python formatter.

Dependencies

So far the text_python formatter only distinguishes between dynamic and static content. But whether an item is static or dynamic depends on the caching strategy, and neither the parser, nor the formatter, nor the macros should have to know much about this strategy. Right now it is hard coded in at least one of these three objects. To solve this problem, every item (link, macro, text, wikiname) could know not only whether it is dynamic or static, but also which facts it depends on. Possible things to depend on are page content, existing pages (wiki links), existing interwiki pages, the formatter, other pages, ... A list of the dependencies would be passed to the formatter with every call (using a sensible default value). The text_python formatter would get a list of things that can be treated as static. With this, the text_python formatter could generate anything from a list of all formatter calls to a completely rendered page.

The Dependencies list that each plugin now has is suboptimal. For one thing, it consists of strings (bad: nobody knows what values are possible), and for another, it is static. A plugin might create content that sometimes depends on one thing and at other times on something else. It would be nice if a new caching framework could take this into account.

I propose to return the dependencies as a list of classes in MoinMoin.CacheValidity that each support the single method isValid(object). A Time class could then be written similar to this:

   import time

   # intended to live in MoinMoin.CacheValidity as "Time"
   class Time:
       def __init__(self, maxage):
           self.maxage = maxage

       def isValid(self, object):
           # get the time of the last change of the queried item
           lastchangetime = object.getLastChangeTime()
           # the cached copy stays valid while it is younger than maxage
           return lastchangetime + self.maxage > time.time()
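A caching framework built on this proposal would keep a cached result only while every validity object attached to it still reports True. A minimal sketch (the framework function name is hypothetical):

```python
def cache_entry_valid(entry, validities):
    """True while every CacheValidity object still accepts the entry
    (illustrative framework code, not part of MoinMoin)."""
    return all(v.isValid(entry) for v in validities)
```

With an empty validity list the entry is unconditionally valid, which matches purely static content.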

MoinMoin: MoinMoinIdeas/WikiApplicationServerPage (last edited 2007-10-29 19:19:08 by localhost)