Outline - work in progress
ReDesign of request handling
In DesignPaper some problems of the current architecture of MoinMoin were outlined. To overcome those, core functions and interfaces have to be redesigned, breaking some compatibility in the process. FlorianFesti instructed me to write a proposal on a better architecture for dealing with the request-response cycle in MoinMoin, especially with regard to the output of templated/themed HTML.
An approach for Moin
To reduce the tight coupling of components found in current MoinMoin, a new approach would have to promote a clean separation of responsibilities. The responsibilities of typical current MoinMoin artefacts, described as what they should be rather than what they are, are as follows:
Request: provides an interface to the requested URL, the parameters (GET/POST), the headers (language, content-type, etc.) and the stream to write the output to.
Page: describes a wikipage and its associated metadata, and provides access to raw contents
Parser/Formatter: transformation of input mime-types to output mime-types (this one is currently tightly coupled too, and finding a good way to separate the two will require a lot of research)
Action: an activity on a single page object
Macro: wiki content expansion, should rely only on its containing page and parameters for context
PageEditor: becomes an action on a page (an editor is an editor and not a Page, it has no Page-nature whatsoever)
User: representation of an authenticated or the anonymous user, the user's preferences and history, and possibly special rights
Theme or Template: representation of text/html-conformant output (see Parser/Formatter) in a nice and useful surrounding package. In particular, templates are not restricted to providing different theming, but should also be used in parts like the editor or the user preferences (minimize HTML in Python code)
To support the connection between the described artefacts some additional artefacts can be put to use:
Dispatcher: A mechanism and policy to dispatch the actual Request to the Action that should act on it. It can also be used to handle errors and to determine default actions and parameters for underspecified requests.
Storage: Currently Page provides access to its storage itself. The SoC 2007 storage refactoring, also work in progress right now, could provide a storage layer independent of the Page-object. It could be queried from wherever needed.
Chrome: The user interface parts of an application that are outside of a window's content area (see http://www.mozilla.org/xpfe/ConfigChromeSpec.html). In the context of a web application this refers to the parts of the (X)HTML rendering of a page that are not direct content (e.g. theme or better layout, scripts, stylesheets, system messages). It is also the controlling instance for the rendering of HTML.
Regardless of how those concepts are put together and implemented, the following directive should be followed at all times: Only Request itself, Actions and the Dispatcher write to the output stream.
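The directive can be illustrated with a minimal sketch. All names here (the `Request` class, `render_page`, `view_action`) are hypothetical and not taken from the MoinMoin code base; the point is only that lower-level code returns its representation, while an Action does the actual writing.

```python
import io


class Request:
    """Hypothetical request wrapper that owns the output stream."""

    def __init__(self, path):
        self.path = path
        self.out = io.StringIO()  # stands in for the real response stream

    def write(self, text):
        self.out.write(text)


def render_page(title, body):
    # View-level helper: builds and RETURNS markup, never writes it.
    return "<h1>%s</h1><p>%s</p>" % (title, body)


def view_action(request):
    # An Action is one of the three permitted writers.
    html = render_page(request.path.strip("/"), "page body")
    request.write(html)


req = Request("/FrontPage")
view_action(req)
```

Because `render_page` has no access to the stream, it can be reused, cached or tested without a Request object.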
Approach Patterns
In the following, two design approaches are discussed with their respective advantages and disadvantages. The artefacts mentioned above are put into relation to each design, and examples for the processing of requests are given.
MVC
ModelViewController is a well understood and often used principle in object oriented design. While the current implementation of MoinMoin follows this pattern to some degree, the intended decoupling is neglected in a way that produces the problems described in DesignPaper.
Model: Page and also supplementary parts like the Config, Caches, Eventlog act as models. This especially means that they must not affect the processing of the request in any way. They are only queried and written to by the controllers or in rare cases by the view. If necessary additional interfaces on these objects have to be introduced to achieve this separation.
View: The view part is covered on the highest levels by Formatter/Parser, on the lowest levels (for HTML output) by Theme and Template. They act as the representational layer for data. Modifications have to be done to ensure they do not write output to the user but solely return the representation.
Controller: The Dispatcher (which is currently buried in Request.run) and the Actions comprise the controller component. All processing of specific requests is done here. The models are queried and written to appropriately and the proper view is determined, instantiated and fed with the needed data. They are the only parts that should be allowed to write output to the user. Other responsibilities found here are:
- Checking of ACLs
- Setting response headers appropriately
- Handling error conditions (action-specific in the action itself, globally in the dispatcher)
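The division of labour described above could look roughly like the following sketch. Every name in it (`Page`, `html_view`, `show_action`, `dispatch`, the `readers` set standing in for an ACL) is an assumption made for illustration and does not reflect the existing MoinMoin classes.

```python
class Page:
    """Model: holds data only, never influences request processing."""

    def __init__(self, name, raw, readers):
        self.name = name
        self.raw = raw
        self.readers = readers  # simplistic stand-in for an ACL


def html_view(page):
    """View: returns a representation instead of writing it."""
    return "<h1>%s</h1><pre>%s</pre>" % (page.name, page.raw)


class Forbidden(Exception):
    pass


def show_action(page, user):
    """Controller part: checks the ACL, picks the view, sets headers."""
    if user not in page.readers:
        raise Forbidden(page.name)
    headers = {"Content-Type": "text/html"}
    return 200, headers, html_view(page)


ACTIONS = {"show": show_action}


def dispatch(pages, params, user):
    """Controller part: chooses the action and handles global errors."""
    action = ACTIONS.get(params.get("action", "show"))
    page = pages.get(params.get("page"))
    if action is None or page is None:
        return 404, {"Content-Type": "text/plain"}, "not found"
    try:
        return action(page, user)
    except Forbidden:
        return 403, {"Content-Type": "text/plain"}, "forbidden"


pages = {"FrontPage": Page("FrontPage", "Hello", readers={"alice"})}
ok = dispatch(pages, {"page": "FrontPage"}, "alice")
denied = dispatch(pages, {"page": "FrontPage"}, "bob")
```

Note that the ACL check sits explicitly inside the action, which is exactly the kind of repeated, explicit handling that can make actions complex.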
Pro & Contra:
Well documented and understood pattern (many resources)
Maps quite easily onto the current code of MoinMoin
Simple to follow, since all logic is centralized in the controller parts
Can become inflexible with regard to the dispatching
bound to one dispatching mechanism with specific semantics at runtime (like Request.run, which dispatches on the 'action' GET parameter)
- not all actions are specific to a page or item (userprefs, xmlrpc, ...)
very explicit handling of some aspects in the actions (can make the actions very complex)
- ACLs
- [...]
Dataflow (a.k.a Pipes & Filters)
This describes a somewhat different approach. Instead of revolving around the dispatcher and the controlling actions, which act on data, processing is centered around the data itself. The basic image of the process is that of "Pipes and Filters". The incoming request is passed through several handlers that registered themselves centrally for a specific processing task (for example determining the template and additional context data or, at the higher levels, parsing a wiki page). The handlers themselves hold no data-specific state; they simply augment the data, pass it on to other registered handlers for further processing and return data to the calling handlers accordingly.
To get an idea of how this pattern is implemented and put to use, the following bits of source can be consulted. The code from Trac serves as an example for low level request processing and the code from Pocoo shows how this pattern can be continued into the last corner of a wiki web application.
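As a rough illustration of the idea, the following sketch uses a central registry keyed by task name. The registry, the `handles` decorator and the task names are invented for this example and are not taken from Trac or Pocoo.

```python
from collections import defaultdict

# Central registry: task name -> list of handler functions.
REGISTRY = defaultdict(list)


def handles(task):
    """Decorator that registers a stateless handler for a task."""
    def register(func):
        REGISTRY[task].append(func)
        return func
    return register


def run(task, data):
    """Pass the data through every handler registered for the task."""
    for handler in REGISTRY[task]:
        data = handler(data)
    return data


@handles("prepare")
def pick_template(data):
    # Augments the data; returns a new dict instead of keeping state.
    return dict(data, template="view.html")


@handles("prepare")
def add_context(data):
    # A handler may delegate to handlers registered for another task.
    return dict(data, context=run("context", {"page": data["page"]}))


@handles("context")
def add_title(ctx):
    return dict(ctx, title=ctx["page"].replace("_", " "))


result = run("prepare", {"page": "Front_Page"})
```

New behaviour is added by registering another handler, without touching a dispatcher or any existing handler.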
trac/core.py - Trac's implementation of a component model, the basis for the different handlers. Communication and locatability between components is achieved via defined interfaces.
trac/web/api.py - Some interfaces used at the low levels, especially the RequestHandler interface
trac/web/main.py - Trac's entry-point for the handling of an incoming request. Note how the RequestDispatcher actually tries the different registered handlers in turn and lets them decide if they are responsible.
tekisuto/lexer.py - tekisuto is an experiment in parser building for markup based systems like wikis. The Lexer (central registry) passes the data to parse line by line to the Directives (registered handlers), together with a dictionary from the current stacklevel. Matching directives can use the dictionary to carry state between their respective start and end in the nesting of markups, or simply handle the whole markup in the start directive.
tekisuto/../tests/moinlexer.py - This is a simple example for the use of the basic classes. It parses a subset of MoinMoin wiki markup.
Examples for MoinMoin
The examples are simplified here and there (Caching components are left out for example).
Request to show a wikipage:
Request (abstracted into an object) enters the system
a Dispatcher (perhaps local to a wiki in a farm) augments it with data from the config
the Request is passed to every registered RequestHandler
the ViewHandler (corresponding to the 'view'-Action in the current implementation) claims that it can satisfy the Request
Dispatcher passes the Request to the ViewHandler
ViewHandler determines the requested Page and fetches it from the storage (not detailed here)
ViewHandler determines the requested representation of the page => 'text/html'
the Page and desired format is passed to every registered FormatHandler
the HTMLFormatter claims that it can satisfy the Request
ViewHandler passes the Page to the HTMLFormatter
[ ... the dataflow continues in the same way to the ParserHandlers ... ]
ViewHandler receives a formatted version of the Page-content from the HTMLFormatter
ViewHandler prepares a content dictionary for the rendering (the page content, perhaps user data, etc.), adds some information to the Chrome (css-stylesheet to include, some javascript) and sets some response headers (content type, cache control, etc.)
Dispatcher receives the content dictionary, the content type and the name of the template to use ('view.html') from the ViewHandler.
It interprets this as a prompt to use the Chrome-HTML-renderer for generation of the final output (as opposed to receiving raw content from the RequestHandler) and therefore passes on the request, the content dictionary and the template name to the Chrome-component.
Chrome-HTML-renderer uses the config and user-data in the request to determine the theme to use.
Chrome finds and loads 'view.html' and theme-specific templates (a more specialized 'view.html' from the theme directory for example)
Chrome generates HTML from the templates, the content dictionary and the Request's chrome-object, and then passes the result back to the Dispatcher
Dispatcher gets control for the last time. The complete response is sent to the user agent, final cleanups are carried out.
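The steps above can be condensed into a small sketch. It is heavily simplified (no config, no themes, no parser stage, no HTML escaping), and all class and method names such as `match_request`, `process_request`, `can_format` and `Chrome.render` are hypothetical choices for this example, not existing MoinMoin interfaces.

```python
from string import Template


class ViewHandler:
    """Hypothetical RequestHandler for showing a page."""

    def match_request(self, request):
        return request.get("action", "view") == "view"

    def process_request(self, request, storage, formatters):
        raw = storage[request["page"]]
        wanted = "text/html"
        # Ask every registered FormatHandler whether it is responsible.
        formatter = next(f for f in formatters if f.can_format(wanted))
        content = {"title": request["page"], "body": formatter.format(raw)}
        request["headers"] = {"Content-Type": wanted}
        return "view.html", content


class HTMLFormatter:
    def can_format(self, mimetype):
        return mimetype == "text/html"

    def format(self, raw):
        return "<p>%s</p>" % raw


class Chrome:
    """Renders a template plus content dictionary into final HTML."""

    templates = {"view.html": Template("<h1>$title</h1>$body")}

    def render(self, name, content):
        return self.templates[name].substitute(content)


def dispatch(request, handlers, storage, formatters, chrome):
    for handler in handlers:
        if handler.match_request(request):
            template, content = handler.process_request(
                request, storage, formatters)
            # A template name signals that Chrome should render the output.
            body = chrome.render(template, content)
            return request["headers"], body
    return {"Content-Type": "text/plain"}, "no handler"


headers, body = dispatch(
    {"page": "FrontPage"}, [ViewHandler()],
    {"FrontPage": "Hello"}, [HTMLFormatter()], Chrome())
```

Only `dispatch` would hand the result to the user agent; the handler, the formatter and the Chrome all return values.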
Where to reduce tight coupling
Problem: the use of Request.write in the current code makes higher-level code depend strongly on the lower-level code
Solution: pull the actual output into the Dispatcher. This gives an instant bonus for HTTP header control too. This will mean a great deal of change to the existing code but will prove worthwhile in the future.
Problem: hard-coded creation of HTML in the code makes changes to the HTML hard and inflexible (just think of XHTML support)
Solution: use templates and a better programmatic approach. Instead of ".tag(on) ... .tag(off)", a more DOM-structured approach would be feasible (see Genshi's Builder-classes). Where applicable, template files should be made overridable by the Themes (RecentChanges for example)
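To give an impression of what a DOM-structured approach means, here is a small sketch that uses Python's standard `xml.etree.ElementTree` as a stand-in for a builder such as Genshi's. The function `recent_changes` and its input format are invented for the example.

```python
import xml.etree.ElementTree as ET


def recent_changes(entries):
    """Build a list of changes as a tree, then serialize it once."""
    ul = ET.Element("ul", {"class": "recentchanges"})
    for page, editor in entries:
        li = ET.SubElement(ul, "li")
        link = ET.SubElement(li, "a", {"href": "/" + page})
        link.text = page
        link.tail = " by " + editor  # text following the link
    return ul


tree = recent_changes([("FrontPage", "alice"), ("Help & Tips", "bob")])

# The same tree can be serialized as HTML or as XML; escaping and
# closing of tags are handled by the serializer, not by the caller.
html = ET.tostring(tree, encoding="unicode", method="html")
xhtml = ET.tostring(tree, encoding="unicode", method="xml")
```

Since the code never emits opening and closing tags by hand, unbalanced markup cannot occur, and switching the output dialect is a matter of choosing a different serializer.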
Problem: Theme is used for the creation of small parts of HTML for other code (RecentChanges daybreak and lang-attributes for example). This enforces the creation of Theme-objects for every request and couples internal logic with representational code.
Solution: See directly above. A templated or programmatic approach independent of the Theme used will solve this dependency too. Rendering of themed HTML would be the sole responsibility of the Chrome then.
Problem: Formatter and Parser are currently tightly coupled via a largely markup-oriented protocol with tags opening and closing. In the current implementation Formatter and Parser form a pair of components that cannot be used in an independent manner.
Solution: An intermediate representation would decouple the steps of parsing and formatting and allow for independent use (for example, caching could be done between the two steps in a transparent manner). This intermediate representation could also be used to provide a more data-structure-based approach to representing wiki content (instead of the event-based parsing-formatting done right now). This would make it easier to develop Parsers that do not map well to markup events (for example a representation of image/*-metadata in wiki-page content). An idea for this approach could incorporate a wiki parser like tekisuto from Pocoo (see above). A wikidom could be created from this parse-stream in the common case or be created programmatically for other kinds of parsers (again, see the Genshi builder pattern).
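A toy version of such an intermediate representation might look as follows. The `Node` class, the two node kinds and the tiny parser are assumptions made for this sketch; a real wikidom would need far more node types and is not specified anywhere in this proposal.

```python
from html import escape


class Node:
    """Hypothetical node of an intermediate 'wiki DOM'."""

    def __init__(self, kind, text="", children=()):
        self.kind = kind
        self.text = text
        self.children = list(children)


def parse(source):
    """Toy parser: '= x =' lines become headings, others paragraphs."""
    doc = Node("document")
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("= ") and line.endswith(" ="):
            doc.children.append(Node("heading", line[2:-2]))
        else:
            doc.children.append(Node("paragraph", line))
    return doc


def to_html(node):
    """Formatter that knows nothing about the wiki markup."""
    if node.kind == "document":
        return "".join(to_html(child) for child in node.children)
    tag = {"heading": "h1", "paragraph": "p"}[node.kind]
    return "<%s>%s</%s>" % (tag, escape(node.text), tag)


def to_text(node):
    """A second formatter working on the very same tree."""
    if node.kind == "document":
        return "\n".join(to_text(child) for child in node.children)
    return node.text.upper() if node.kind == "heading" else node.text


dom = parse("= Title =\nSome <text>")
# The tree could be cached at this point, or built by code instead
# of a parser:
built = Node("document", children=[Node("paragraph", "made by code")])
```

The parser and the formatters only share the `Node` structure, so either side can be replaced or used on its own.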
Misc. Notes
- TODO: look at Include-macro and ACLs ... what happens on including a page with non-permissive ACLs?