Template Engine Integration

From design paper

The first step, a simple prototype to get the integration started, was put together over the last few weeks. It appears to MoinMoin as multiple themes: it scans the configured template directories (see Config.genshi_templates) for subdirectories and uses the name of each subdirectory as a theme name. To render a page completely with such a theme, images and CSS have to be in place (in the wiki instance's htdocs), as for regular themes at the moment.
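The directory scan described above could look roughly like the following sketch. The function name discover_template_themes and the rule for skipping entries that start with "." or "_" are assumptions made for illustration; only the idea that each subdirectory of a configured template directory becomes a theme name comes from the prototype.

```python
import os


def discover_template_themes(template_dirs):
    """Map theme name -> directory, one theme per subdirectory.

    Hypothetical helper: the prototype reads its directories from
    Config.genshi_templates; here they are passed in explicitly.
    """
    themes = {}
    for base in template_dirs:
        if not os.path.isdir(base):
            continue  # skip configured directories that do not exist
        for name in sorted(os.listdir(base)):
            path = os.path.join(base, name)
            # Plain files such as _base.html are not themes; names starting
            # with "." or "_" are skipped as well (an assumption).
            if os.path.isdir(path) and not name.startswith((".", "_")):
                # The first directory that provides a name wins.
                themes.setdefault(name, path)
    return themes
```

Called with a list of directories, it returns a dictionary whose keys can be offered as theme names.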

The theme output normally done by ThemeBase was abstracted into a _base.html living under the genshi_templates directory (it still needs to be put somewhere more central).

Since the output of the actual page content normally happens outside the theme, output capturing was used between the calls to .send_title and .send_footer to make the rendered page content available to the template.
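The capturing idea can be illustrated with a small, self-contained sketch. FakeRequest, capture_output and render_with_template are hypothetical names, not MoinMoin APIs, and the sketch is written for current Python rather than the Python version Moin used at the time.

```python
import contextlib


class FakeRequest:
    """Stand-in for the request object; only .write is modelled."""

    def __init__(self):
        self.sent = []  # what actually reaches the client

    def write(self, *data):
        self.sent.extend(data)


@contextlib.contextmanager
def capture_output(request):
    """Temporarily redirect request.write into a list of chunks."""
    chunks = []
    original_write = request.write
    request.write = lambda *data: chunks.extend(data)
    try:
        yield chunks
    finally:
        request.write = original_write  # always restore the real writer


def render_with_template(request, send_content, template):
    # Everything written while the block is active is captured ...
    with capture_output(request) as chunks:
        send_content(request)
    # ... and handed to the template as a plain string.
    request.write(template % {"content": "".join(chunks)})
```

The content code keeps calling request.write as before; only the caller that wraps it knows that the output was diverted.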

For the actual rendering with Genshi, a preliminary context was assembled in .send_title and .send_closing_html, to give templates access to useful resources (the actual page, names of standard system pages, user trail, message to be shown, etc.)

... to design problems

To have more control over the output (for themes and also in general), it would be advisable to invert the way output is written to the output stream. Currently Page and PageEditor output themselves (via .send_page and .sendEditor), and several actions also access the output stream (directly via request.write or indirectly by calling .send_page themselves).

The next step would have been refactoring .send_page (actually killing it in the process) by moving its code into the distinct actions as needed. The idea was to split .send_page into small reusable methods, maintaining backward compatibility for plugins, while pulling the inner workings into the included actions step by step.
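A rough sketch of what this split was meant to achieve, under the assumption that the small steps return values instead of writing. The classes and the method names http_headers and render_content are hypothetical; only send_page and the keywords emit_headers and content_id are taken from this paper.

```python
class Request:
    """Toy request object; only what the sketch needs."""

    def __init__(self):
        self.headers = []
        self.output = []

    def write(self, data):
        self.output.append(data)


class Page:
    def __init__(self, name, body):
        self.name = name
        self.body = body

    # Hypothetical small steps: they return values instead of writing.
    def http_headers(self):
        return ["Content-Type: text/html; charset=utf-8"]

    def render_content(self, content_id="content"):
        return '<div id="%s">%s</div>' % (content_id, self.body)

    # Backward-compatible facade for existing callers and plugins.
    def send_page(self, request, emit_headers=1, content_id="content"):
        if emit_headers:
            request.headers.extend(self.http_headers())
        request.write(self.render_content(content_id))


def show_action(page, request):
    """An action that uses the small steps directly, without send_page."""
    request.headers.extend(page.http_headers())
    request.write(page.render_content())
```

Old callers keep using send_page, while refactored actions such as show_action can pick only the steps they need.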

Sadly I failed in this task. The complexity of the actual code flow through even a simple request (showing a page) left me somewhat clueless about where to actually start. In the following paragraphs I will try to outline the problems I encountered.

General problems

Some general problems and style issues became apparent while trying to refactor:

Tight coupling: When trying to refactor or rearrange code, one often has to realize that this would break a great deal of other code, some of it unexpectedly. Breaking code is fine when refactoring on a large scale, but when trying to concentrate on a small part and clean things up, it is quite cumbersome to follow all the implications.
Side-effect style programming: Many methods all around Moin are written so that they don't return any values, and the programmer relies on the side effects of those methods. Especially the many set* methods in different areas make the code hard to follow. It is often not apparent where an instance variable was set. Another example, the use of Request.write everywhere for output, is explained in more detail later in this text.
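The difference between the two styles can be shown with a toy example. None of these names exist in MoinMoin; they only illustrate why hidden state is harder to follow than return values.

```python
class SideEffectStyle:
    """The caller must know that set_title() has to run before render()."""

    def set_title(self, page_name):
        self.title = page_name.replace("_", " ")  # hidden state, no return value

    def render(self, request):
        request.write("<h1>%s</h1>" % self.title)  # output as a side effect


def make_title(page_name):
    """Return-value style: the data flow is visible at the call site."""
    return page_name.replace("_", " ")


def render_title(title):
    return "<h1>%s</h1>" % title
```

With the second style a caller writes render_title(make_title(name)) and can see where the title comes from; with the first, calling render() without a prior set_title() fails with an AttributeError far away from the actual cause.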

Page.send_page considered unrefactorable

Page.send_page has been marked for refactoring since 2002. Yet this was never done; instead it grew even further out of proportion. In a quick comparison between 2002 and 2007, send_page seems to have doubled to tripled in terms of lines of code. Moin has the concept of actions on a Page item, the default action being 'show', but the actual code for displaying a page lies almost completely in .send_page. This makes send_page the central all-purpose 'controller' in the model-view-controller sense and reduces actions to mere dispatchers on the parameters of send_page.

The following list gives a short summary of what responsibilities are currently found in .send_page:

Discriminate between a normal page view and the print mode
Count a page view in the Eventlog
Handle page redirects (of pages with 'redirect' processing instruction)
Handle the formatter options for the request
- Instantiate the formatter (type is determined in Page.__init__)
- Set up the highlighting regex on the formatter (if supplied)
Handle deprecated pages (by adding an informational message for users)
Drive the response/output in general
- Handle response headers if not suppressed via emit_headers=0
- Render theme if not suppressed via content_only
  - Render messages to the user (via msg-keyword)
  - Instruct theme to use print_mode
  - Assign a HTML-id to the content div (via content_id-keyword)
- Render page content (via self.send_page_content and formatter)
- Handle some ACLs (user.may.read) with 'Not allowed'-messages
- Handle remaining output of the FootNote-macro
Handle caching of the page (requested via do_cache keyword)
- Cache normal pages via the text/python-formatter
- Cache links to other pages via the pagelinks-formatter
Handle missing page
- Send MissingPage or MissingHomePage if appropriate

As stated, many actions (and some other code) merely dispatch calls to .send_page, expecting it to handle the many different cases. Since send_page drives so many aspects of the output, it becomes hard to override or interfere with this behaviour. As a consequence, it seems, modifications to the output behaviour were added to send_page over time instead of refactoring it as was intended in 2002.

The following list may give an impression of how send_page is used from other code:

Action 'show', basically calls send_page with default parameters
Action 'content', uses content_only=1 to suppress theme output
Action 'print', uses print_mode=1
Action 'recall', uses do_cache=0 and count_hit=0
Action 'refresh', empties the cache and then simply calls 'show'
Action 'userform', uses msg to display its 'savemsg' to the user
Action 'AttachFile', uses msg to display information and the attachment move form
Request.run, uses 'msg' to inform the user about nonexistent or not-allowed actions
the editors ...
- use msg to display information to the user (not allowed to edit, conflicts, etc.)
- use content_id to place the preview appropriately
- the graphical editor uses raw output (actually via send_page_content) to supply FCKEditor with the needed HTML
Macro 'Include', uses content_only=1 for including other pages
Macro 'RandomQuote', uses content_only=1 for including the quote from some quote-page
XMLRPC, uses content_only=1 with redirected output to capture the HTML-version of a page
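The dispatching pattern in the list above can be condensed into a toy sketch. The table only repeats the keyword arguments named in the list; the stand-in send_page and the helper do_action are hypothetical and do not mirror the real signature or its defaults.

```python
# Keyword arguments each action passes to send_page, as listed above.
SEND_PAGE_KEYWORDS = {
    "show": {},
    "content": {"content_only": 1},
    "print": {"print_mode": 1},
    "recall": {"do_cache": 0, "count_hit": 0},
}


def send_page(page_name, **keywords):
    """Toy stand-in: records how it was called instead of rendering."""
    return (page_name, sorted(keywords.items()))


def do_action(action, page_name):
    # The "controller" carries no logic of its own; it only selects keywords.
    return send_page(page_name, **SEND_PAGE_KEYWORDS[action])
```

All behaviour that distinguishes the actions lives inside send_page, which is exactly what makes it hard to override.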

PageEditor.sendEditor worsens things

With send_page being the central controller for displaying HTML and actions to the user, sendEditor follows a similar pattern for the single action 'edit'. It is even bigger and also uses send_page for tasks like displaying a preview or giving informational messages to the user, which makes it quite hard to follow as well.

Request.write

The main way for almost every part of MoinMoin to add something to the output is to call Request.write. The request object is passed around into every corner of MoinMoin, so using Request.write is easy and seems the obvious choice at first. Still, there are several quite negative implications:

HTTP headers have to be output rather early
Having code that uses the results output by other code becomes cumbersome to work with
- Output capturing is needed
- Depending on the situation it gets spread all over the code
Following the output becomes hard
- Calls often don't return anything
- Output generation is a side effect of the code flow
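The first implication, that headers have to be decided before any content is written, can be demonstrated with a toy request class. StreamingRequest and render_then_send are hypothetical and only model the ordering constraint, not MoinMoin's actual request classes.

```python
class StreamingRequest:
    """Toy request: headers are committed on the first write."""

    def __init__(self):
        self.headers = {"Status": "200 OK"}
        self.headers_sent = False
        self.body = []

    def set_header(self, key, value):
        if self.headers_sent:
            raise RuntimeError("headers already sent")
        self.headers[key] = value

    def write(self, data):
        self.headers_sent = True  # the first write commits the headers
        self.body.append(data)


def render_then_send(request, render):
    """Alternative: build the body first, decide headers, then write once."""
    status, html = render()
    request.set_header("Status", status)
    request.write(html)
```

Code that discovers a problem halfway through rendering (for example a missing page) can no longer change the status once something has been written, whereas render_then_send can still choose the status after the content is known.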

Request.write is called in 365 places (cached byte code not included). It is used in:

actions (154 times)
send_page, sendEditor (86 times)
parsers (72 times)
caching code (text_python formatter)
themes (15 times)
RecentChanges
dom_xml formatter
scripts
some other
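Numbers like these can be reproduced approximately with a small script. How the figures above were actually obtained is not stated, so the following is only one possible way: it counts textual occurrences of request.write( in .py files, grouped by top-level directory or file. The function name and the grouping are assumptions.

```python
import os
import re

# Matches request.write( as well as self.request.write (
WRITE_CALL = re.compile(r"\brequest\.write\s*\(")


def count_write_calls(root):
    """Count textual request.write( occurrences per top-level entry."""
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(".py"):
                continue  # ignores .pyc/.pyo, i.e. cached byte code
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as handle:
                hits = len(WRITE_CALL.findall(handle.read()))
            if hits:
                relative = os.path.relpath(path, root)
                area = relative.split(os.sep)[0]
                counts[area] = counts.get(area, 0) + hits
    return counts
```

A purely textual count also catches calls in comments and strings and misses calls made through other variable names, so it should be read as an estimate.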

Putting it into context

In the preceding sections some design problems have been outlined. All of them affect the integration of new code that is related to output in one way or another. In the following, the integration of the template engine serves as an example of how hard it is to integrate new code and concepts. The different parts of that integration are set in relation to the problems mentioned above:

Using a template engine for HTML output requires the rest of the application to supply a meaningful context for the engine. In a typical web application built in one way or another on the principle of Model-View-Controller, this context is established in the Controller component. Furthermore, output is left exclusively to the View component, the actual template provided by the template engine (regardless of the actual delivery happening on a lower level, the request level).
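The division of labour described here can be sketched in a few lines. The sketch uses string.Template from the standard library as a stand-in for a real template engine such as Genshi; show_controller, handle_request and the page dictionary are hypothetical.

```python
from string import Template

# View: the only place where markup is produced.
PAGE_VIEW = Template('<h1>$title</h1>\n<p class="msg">$msg</p>\n$content')


def show_controller(page):
    """Controller: gathers data and returns a context, writes nothing."""
    return {
        "title": page["name"],
        "msg": "" if page["exists"] else "This page does not exist yet.",
        "content": page.get("html", ""),
    }


def handle_request(page, write):
    context = show_controller(page)       # controller builds the context
    write(PAGE_VIEW.substitute(context))  # view renders, then a single write
```

The controller never touches the output stream, so the view can be replaced or themed without changing any action code.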

Here the first two problems arise:

The controllers in MoinMoin (the actions on a Page) for the most common activities don't provide any meaningful context. As mentioned above most of them are merely dispatchers on the all-embracing Page.send_page. Additionally (as mentioned in the 'Request.write'-section) output to the lowest level is done from higher levels, especially by the formatter/parser combination.

This leaves the programmer integrating with the following choices, all of them flawed in some way or another:

Leave send_page and the controllers untouched and work around the output problem with carefully placed capture calls. This is the approach pursued by the prototype. Establishing the context is somewhat limited, since one is confined to the ThemeBase code. Eventually a more fine-grained context could be created, but this would require a lot of code already used in other parts of the request stack to be duplicated and evaluated again. This approach is also not viable if one wants to differentiate between the actions and act differently, since it would lead to big "if ... [elif]* ... else" constructs that grow out of proportion with every action considered.
Another approach would be to establish a meaningful context in the actions themselves, but leave send_page in place for backward compatibility and for actions not touched in the process. Because send_page is in such a central position, this would lead either to duplicating behaviour found there, according to the needs of the action and the desired context, or to basically doing the same as in the ThemeBase approach.
Following the tradition and hooking into send_page directly is out of the question. send_page should not exist anymore in the first place, so adding to its complexity can only lead to even more unmaintainable code.
Refactoring send_page into smaller reusable parts was tried, since this would allow pulling only the needed parts into the actions. It turned out to be quite hard because of the reasons stated under 'General problems' and the chain of dependencies all over the code. If one intended to remove send_page completely, breakage of existing code would certainly be at a maximum. Much effort would have to be put into recreating the old behaviour in every single action and in some other parts as well.

Where to go from here

I talked to my mentor, FlorianFesti, and he wanted me to write a design proposal. The proposal should describe a request handling architecture that would ideally suit MoinMoin. It will be available shortly under ../ReDesign

MoinMoin: FlorianKrupicka/SOC2007/DesignPaper (last edited 2008-05-03 13:59:33 by FlorianKrupicka)