Integration of Sphinx and Moin-2

Aim of Integration

Documentation should be easily accessible to all users of MoinMoin, including developers, administrators and users. Integration should achieve the following:

  • Have documentation available on every site using the moin wiki engine, and have it easily accessible through links.
  • Make documentation theme blend with the chosen moin theme so that users know they're on the same site.
  • Let documentation be searchable.
  • Provide a workflow for moving ideas from the documentation -> user and user -> documentation so that users can extend or critique docs.

Additionally, the following goals are desirable:

  • Being able to add ALL documentation to robots.txt (prevents search crawlers from indexing content unrelated to the site).
  • Achieving a split between user/administrator/developer docs within the user interface, and letting administrators toggle documentation sections on/off.
  • Integrating with ACLs and current storage backends.

Integration Options

  1. Use Sphinx's inbuilt WebSupport to serve docs at a set location

    This would probably mean having all the docs under /+docs and route requests through WebSupport.get_document. Documentation could be indexed using BaseSearch. Another possible way to do this would be to create a new StorageBackend and store the docs as normal items (with a custom content type like x-moin/sphinx-docs). See the Sphinx docs for more info. This method is also covered in more depth in the "Implementation details" section below.

    Pros:
    • "Free" support for commenting on documentation (.add_comment, etc.). Comments are in many ways preferable to direct editing: once consensus has been reached on the desired documentation changes, they can be made by a trusted user who is capable of editing RST and has good communication skills. Comments which a user or administrator feels are important enough to warrant a change in the official MoinMoin documentation can be put in the MoinMoin issue tracker, or possibly placed manually on the official moin-2.0 wiki instance (currently test.moinmo.in). See the UI part of the implementation section below for some ideas on this.
    • Relatively easy to implement (a few extra views farming out the real work to .get_document and a custom search class inheriting from BaseSearch to integrate with Whoosh).
    Cons:
    • Content isn't explicitly stored in a wiki page and presented as such, meaning that added functionality like editing/metadata/revisions isn't possible. It should be noted that changes between versions can still be shown using Sphinx's versionadded directive, though.
    • Requires a few extra tools to manage documentation and associated metadata like comments, so that metadata can be transferred between wiki instances/versions and storage backends (if a custom Sphinx websupport storage backend is used). Think moin doc-build, moin doc-load and moin doc-save, a la moin index-build, moin load and moin save.
  2. Automatically populate wiki pages with RST documents from doc/, adding required directives

    Pros:
    • Easy for documentation writers: documents can be edited on the wiki or directly through the repo and merged afterwards.
    • Excellent integration with wiki theme (the docs are editable wiki documents, first class wiki citizens if you will).
    Cons:
    • Requires implementation of a large number of Sphinx-specific reStructuredText directives, a full list of which can be found here.
    • Auto-documentation of the API is problematic: directives like automodule are expensive to compute, so their output would have to be stored statically.
    • Some directives present a security threat due to path traversal, etc. Sphinx is not explicitly designed to be sandboxed.
  3. Serve static HTML docs produced by make docs

    This method is already implemented - see /+serve/docs/index.html

    This method would simply serve documentation at a certain URI (say, /+docs or /docs) without doing any dynamic computation.

    Pros:
    • Easiest method for developers to implement - documentation is packaged at distribution time, no extra work required.
    • Computationally efficient.
    Cons:
    • Not editable via wiki interface, forces old "edit repo -> push at each release" workflow.
    • Does not look good (if the user changes themes, the documentation doesn't change to fit in with the rest of the wiki).
  4. Automatically populate wiki pages with JSON encapsulated HTML documents produced by make json

    This method is similar to method #2 except Sphinx renders the data at the request of the administrator (it is only rendered once) and this data is used to populate the wiki pages. make json outputs JSON documents containing the individual pieces of HTML markup for the sidebar, navbar, document body, etc., as well as the document metadata (like the title), making it easy to parse and upload the data.

    An alternative implementation for this method would be to create a Sphinx build target for Moin wiki markup (or even reStructuredText) and populate wiki pages with those. This has the advantage of making it easier for the wiki user to edit, as HTML is rather cumbersome and does not translate idiomatically to or from reStructuredText.

    Pros:
    • All content is wiki editable, and integrates with the wiki theme without any extra work.
    • Storage of rendered documentation is easy as this is taken care of by MoinMoin's backend since all of the documentation pages are normal items.
    • Sphinx does the heavy lifting as far as documenting APIs with autodoc directives like automodule is concerned, and lets the end user make minor changes to extend the documentation or fix errors (and without manually editing docstrings!).
    Cons:
    • Nearly impossible to reconcile changes to the wiki pages of documentation and the repository pages. This means that documentation contributors must still edit through the repo if they wish to make permanent changes.
    • No dynamic documentation (so if a documentation change is made, it can't be reflected in the docs without losing changes made to the section by wiki users). This is basically part of the above con.
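
For option 4, the import step might look roughly like the sketch below. It assumes that the .fjson files written by Sphinx's JSON builder carry at least a title and a body key. The WikiItem type, the +docs/ name prefix and the metadata layout are hypothetical stand-ins, not the real Moin-2 item API.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class WikiItem:
    """Hypothetical stand-in for a Moin-2 item (not the real Moin API)."""
    name: str
    content: str
    meta: dict = field(default_factory=dict)


def items_from_json_build(build_dir, prefix="+docs/"):
    """Turn every *.fjson file under build_dir into a WikiItem."""
    build_dir = Path(build_dir)
    items = []
    for path in sorted(build_dir.rglob("*.fjson")):
        doc = json.loads(path.read_text(encoding="utf-8"))
        if "body" not in doc:
            # Skip any file that does not describe a rendered page.
            continue
        # "user/index.fjson" becomes the document name "user/index".
        docname = path.relative_to(build_dir).with_suffix("").as_posix()
        items.append(WikiItem(
            name=prefix + docname,
            content=doc["body"],
            meta={"title": doc.get("title", docname),
                  "contenttype": "text/html"},
        ))
    return items
```

Because the page body arrives as rendered HTML, this sketch only covers the one-way "populate once" case; it does nothing to reconcile later wiki edits with the repository.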

One-way editing with wiki2sphinx

wiki2sphinx is a piece of software designed to convert wiki pages using reStructuredText markup to Sphinx documentation, allowing documentation authors to write their documentation using the user-friendly interface of a wiki but with the advantages of Sphinx distribution. Unfortunately I can't comment on its effectiveness, as I could not get it to run on anything. On my moin-2 instance it dies with xmlrpclib.ProtocolError: <ProtocolError for 127.0.0.1:8080/?action=xmlrpc2: 405 METHOD NOT ALLOWED> (which sort of makes sense, as I can't find any xmlrpc code in moin-2 using grep ;) ), and I can't find any moin-1.x wikis with reST documentation. I tried running it on HelpForUsers <http://moinmo.in/HelpForUsers>, amongst other documentation index pages, and it simply spat out a single page of normal moin wiki markup in a Sphinx directory structure. Assuming such a tool was available for MoinMoin, the following would have to be taken into account when writing documentation in this manner:

  • As per option #2, macros such as automodule are difficult to use with MoinMoin because it's unclear whether or not Sphinx is intended to run sandboxed and because running Sphinx to build automatic API docs is expensive (so results would have to be stored/cached).
  • This could still be useful for user/administrator docs (anything not requiring Sphinx's automatic API documentation capabilities, for that matter) so long as display directives like .. warning::, .. toctree::, etc. are implemented in MoinMoin.

It would still be possible to write documentation without these directives being rendered by Moin (or just being rendered using placeholder text), but it's questionable how much better this is than the current write docs -> build with sphinx -> test -> commit to repo workflow (this is covered in more detail in the implementation section below).

Code-in task questions

On balance, the first option seems to win on ease of implementation, usability and meeting the goals outlined at the beginning of this document. To properly evaluate the solution, I'm going to answer some of the questions posed in the Code-in task which produced this document (slightly rephrased).

[What is the best way to address the availability and separation of] moin docs for users, for developers, for admins?
Hiding administrator documents from users and developers (who do not need to access them) is probably best done with ACLs, and toggling certain documentation sections on/off could be achieved through configuration options (think: available_sections = ['user', 'admin', 'devel',], etc.).
how can we show documentation on the wiki?
Have the documentation appear under a certain root URL (like /+docs) and link to documentation from within the interface (so the interface for modifying a Mediawiki document could link to /+docs/user/mediawiki). The documentation browsing interface would likely be similar to that of Sphinx, thanks to the sidebar/related items/navbar/content HTML supplied by WebSupport.get_document (docs).
how can we modify/extend documentation on the wiki?
The best way would be to have changes made through the filesystem by an administrator, based on comments provided by users through WebSupport.add_comment. See the workflow section in implementation details for more on this.
how can we deal with documentation for different moin releases?

Simply ship the documentation with each new Moin release. I see no reason to have documentation on each wiki for different versions.

Response to GCI task question: This was apparently more a question of how MoinMoin deals with having parallel documentation for different versions. It seems to me that the easiest way to do this would be to have a single set of documentation and use .. versionadded::, etc. (see below). Old documentation can still be viewed using Mercurial tags, regardless.

how can we show changes between releases?
Using the .. versionadded::, .. versionchanged:: and .. deprecated:: Sphinx directives (documented here). Perhaps a handy feature for administrators would be the ability to place a header on the documentation pages informing the user that the Sphinx instance has been upgraded and give them a link to the CHANGES file.
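
The section toggle floated in the first answer could be checked with something like the sketch below. The available_sections option name comes from the text above; the /+docs/<section>/... URL layout and both helper functions are assumptions made for illustration.

```python
# Hypothetical wiki configuration value, as suggested above.
available_sections = ["user", "admin", "devel"]


def section_of(doc_path):
    """Return the section of a path such as '/+docs/user/markup', or None."""
    parts = doc_path.strip("/").split("/")
    # parts[0] is the '+docs' root, parts[1] the section name.
    if len(parts) >= 2 and parts[0] == "+docs":
        return parts[1]
    return None


def is_enabled(doc_path, enabled=None):
    """True if doc_path lies in a documentation section that is switched on."""
    if enabled is None:
        enabled = available_sections
    return section_of(doc_path) in enabled
```

A view serving /+docs would call is_enabled before doing any work, and ACLs would then decide whether the current user may read an enabled section.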

Implementation details

User interface

The UI should attempt to use as much existing moin CSS/HTML as possible to make the documentation appear better integrated with the general style of MoinMoin. Here is a mockup shot with some ideas.

attachment:sphinx-websupport-index-mockup.png

Users should be able to intuitively jump to/from items using navigation bars and links, and searching should be as easy as possible (with an emphasis on separating MoinMoin's wiki-wide search and documentation search).

Commenting is a feature that Sphinx's websupport comes bundled with, and would be very useful for allowing users to give administrators feedback on their documentation and help other users out (much like a Q&A forum or tips and tricks section). This functionality would be configurable by the administrator, and local to each wiki instance. Here's another mockup showing how comments might look:

attachment:sphinx-websupport-comments-mockup.png

As you can see, the user is able to toggle the comments section on or off by clicking the 2 comments [+/-] button, and is able to view the comments made by users and the date each was made, with a quick link to the commenter's userpage. This mockup is based on Real World Haskell's system.

Document/comment storage

By default Sphinx's websupport module stores documentation in pickle files and unpickles them whenever an item is accessed. This is very slow when there is a large volume of requests, because it has to access the disk and decode the data each time. The process can be sped up in several ways:

  • Extend browser cache time

    Documentation changes slowly, so extending cache time to five or ten minutes certainly wouldn't hurt if the wiki has a lot of users.

  • Cache documents server side

    Using app.cache, documents can be fetched from the server's cache instead of being fetched from the filesystem each time. This is good for preventing DoS on systems with slow disks and lots of users, but can be somewhat problematic when caching VERY large pieces of documentation (AFAICT MoinMoin doesn't have anything big enough to cause problems on most systems).

  • Write a custom backend using MoinMoin stores

    This does not provide any real benefit for users using filesystem stores with MoinMoin, but can speed things up for more complex databases like MySQL. Sphinx provides StorageBackend for this task, which allows the developer to override the storage system used by Sphinx (see docs here). This method would require the documents to be stored in a MoinMoin store (rather than a backend) and indexed manually for search (see below).
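
The server-side caching idea can be sketched as follows. A plain dict stands in for app.cache here, and the function name and the choice to invalidate on file modification time are assumptions rather than anything websupport does itself.

```python
import os
import pickle

# path -> (modification time in ns, unpickled document); stand-in for app.cache
_cache = {}


def load_document(path):
    """Return the unpickled document at path, re-reading only if it changed."""
    mtime = os.stat(path).st_mtime_ns
    cached = _cache.get(path)
    if cached is not None and cached[0] == mtime:
        return cached[1]
    # Only unpickle files produced by your own documentation build.
    with open(path, "rb") as handle:
        document = pickle.load(handle)
    _cache[path] = (mtime, document)
    return document
```

This still costs one stat call per request, but it avoids reading and decoding the pickle each time, and a rebuild of the docs is picked up without restarting the wiki.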

User/admin documentation workflow

As written above in option #2, implementing all the Sphinx directives to work on all of the Moin documentation in the wiki is virtually impossible. That's not to say that MoinMoin can't be used to create the user and administrator documentation, which does not require Sphinx's automatic API documentation functionality. MoinMoin master is the wiki used for maintaining and translating documentation for moin-1.x, and I can't see any reason this approach would not work for moin-2.x. Extra care would have to be taken to disallow special Sphinx directives, or to sandbox them, so that the box being used to build the Sphinx docs is not vulnerable to attack through specially crafted directives placed by users of the MoinMoin master wiki. Such a workflow would look like:

Users write documentation on MoinMoin master wiki

Documentation written on master wiki is imported into MoinMoin repo's Sphinx docs/user/ and docs/admin/ directories and built for distribution

Users leave comments on official MoinMoin master documentation or edit it themselves → These changes are implemented by administrators and developers

Documentation is built again, and the cycle starts again.

This allows documentation contributors to work on the documentation without dealing with the complexities of Mercurial and Sphinx, and simplifies translation (next section).

Multilingual documentation

MoinMoin currently uses a master wiki for managing user documentation and translation of this documentation (see the translation index for a neat interface to check translated documentation). Assuming the workflow in the above section is used, translation could be done in the same fashion as moin-1.x, with the translated documentation imported into the MoinMoin repo along with the standard English documentation. The RST documents in the MoinMoin repository not edited through the master wiki (indices and API docs) would also have to be translated manually.

Sphinx does not have any special way of handling documentation in different languages, but projects like Bazaar and CakePHP have gotten around this using a custom Makefile and a documentation root for each language, so the doc/ directory looks like this (copied from a CakePHP developer's blog post):

doc/
  ├── Makefile
  ├── _templates/
  │   └── // custom templates here
  ├── _build/
  ├── config/
  │   ├── __init__.py
  │   └── all.py
  ├── en/
  │   ├── Makefile
  │   ├── _static
  │   ├── .. rest of the documentation here.
  │   ├── conf.py
  │   └── index.rst
  └── es/
      ├── Makefile
      ├── _static
      ├── conf.py
      ├── .. rest of the documentation here
      └── index.rst

This seems to be the best (and possibly only?) way to get translated documentation in Sphinx. The root Makefile simply loops over each of the language subdirectories and runs their (Sphinx generated) Makefile. The "phraseanet" project has done this too; here is their documentation repo and the resulting Sphinx documentation (in English + French).
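
The same loop can be written in Python instead of make. The sketch below is not the CakePHP Makefile: it calls sphinx-build directly rather than running each per-language Makefile, and the rule that a language directory is any direct subdirectory containing a conf.py, as well as the _build/<builder>/<lang> output layout, are assumptions.

```python
from pathlib import Path


def build_commands(doc_root, builder="html"):
    """Return one sphinx-build command line per language directory."""
    doc_root = Path(doc_root)
    commands = []
    # Directories without a conf.py (config/, _templates/, _build/) are skipped.
    for conf in sorted(doc_root.glob("*/conf.py")):
        lang_dir = conf.parent
        out_dir = doc_root / "_build" / builder / lang_dir.name
        commands.append(
            ["sphinx-build", "-b", builder, str(lang_dir), str(out_dir)])
    return commands
```

Each returned list can be handed to subprocess.run(cmd, check=True) to perform the actual build.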

Alternative implementation - using Moin storage backend

Note

Implementation options #2 and #4 above are really just variations on this same theme. The sections on Wiki2Sphinx and documentation workflow also contain some notes on using MoinMoin's Items as documentation and how that works with Sphinx.

Using Sphinx websupport's StorageBackend class, storage for the documentation can be overridden. By implementing the StorageBackend.get_data method, complete control can be exerted over the documentation served through the WebSupport class. By tying this in with a moin-2 Item, documentation can be edited/destroyed very easily (and even created with the StorageBackend.add_node method). The easiest way to implement this (assuming commenting support is required) would be to create a new content type (say, x-moin/sphinxdoc or something similar) and store comments + documentation in HTML form (as this is how WebSupport supplies it) as the item data, and store useful information like document title in the metadata.
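
A rough sketch of the shape such a backend might take is given below. It is deliberately not a real StorageBackend subclass: it only mirrors the add_node and get_data method names mentioned above, and the parameter lists, the returned dict with source and comments keys, the store_comment helper and the in-memory dict standing in for moin-2 Items are all assumptions to be checked against the websupport documentation.

```python
class ItemBackedStorage:
    """Illustrative stand-in; a real backend would subclass StorageBackend."""

    content_type = "x-moin/sphinxdoc"  # content type suggested in the text

    def __init__(self):
        # node id -> dict standing in for a moin-2 Item (data + metadata)
        self.items = {}

    def add_node(self, id, document, source):
        """Create the item that holds one commentable node."""
        self.items[id] = {
            "meta": {"contenttype": self.content_type, "document": document},
            "data": {"source": source, "comments": []},
        }

    def store_comment(self, node_id, username, text, displayed=True):
        """Hypothetical helper; the real add_comment takes more arguments."""
        self.items[node_id]["data"]["comments"].append(
            {"username": username, "text": text, "displayed": displayed})

    def get_data(self, node_id, username, moderator):
        """Return the node's source plus the comments this user may see."""
        data = self.items[node_id]["data"]
        comments = [c for c in data["comments"]
                    if moderator or c["displayed"]]
        return {"source": data["source"], "comments": comments}
```

Keeping comments inside the item data means they travel with the documentation when items are saved or loaded, which is the property the metadata-transfer tools in option 1 are meant to provide.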

There are still some issues with the approach, including:

  • The wiki administrator may not wish documentation to be indexed with Whoosh along with the rest of their wiki items, meaning that a separate Whoosh index must be maintained for documentation items.
  • Sphinx websupport uses HTML rather than RST, so two-way writing is really impossible using this method (though as the note above says, there are some alternate ways to get around this).

Other notes

It should be noted that documentation search should be kept as far away as possible from normal item search so as not to confuse the users, and all documentation should be added to robots.txt to prevent it from being indexed by search engines and throwing up irrelevant results when users try and search for the main MoinMoin documentation site.
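
Generating the robots.txt entries is straightforward. The sketch below assumes the documentation lives under /+docs/ as suggested earlier; the function name and its parameters are made up for illustration.

```python
def robots_txt(doc_roots=("/+docs/",), user_agent="*"):
    """Return robots.txt text that asks crawlers to skip the given roots."""
    lines = ["User-agent: " + user_agent]
    lines += ["Disallow: " + root for root in doc_roots]
    return "\n".join(lines) + "\n"
```

Passing several roots, for example ("/+docs/", "/+serve/docs/"), would cover the statically served copy of the docs as well.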

On using MoinMoin wiki items as input data for Sphinx: Populating the documentation with wiki pages would have to be done pre-build with a tool like Wiki2Sphinx.

If you really wanted to have control over the build process, you could write a Sphinx extension to download external resources to the filesystem and have the Sphinx builder process them then, but this is basically equivalent to running a Wiki2Sphinx-like script in the documentation Makefile and having it purge the produced files post-build.

The Sphinx extension documentation is at http://sphinx.pocoo.org/extensions.html. It might be possible to trap the builder-inited event (http://sphinx.pocoo.org/ext/appapi.html#event-builder-inited) and download some files then, but this wouldn't have any significant advantages over writing your own Wiki2Sphinx script for moin-2.x and running it in the Makefile.

MoinMoin: SamToyer/SphinxMoinCooperation (last edited 2012-01-05 12:11:04 by SamToyer)