Description
Moin still relies on PyXML, which saw its latest release years ago. Please replace this dependency with the standard Python XML library, or with a maintained library such as lxml or 4suite.
Component selection
- general
Details
MoinMoin Version |
|
OS and Version |
|
Python Version |
|
Server Setup |
|
Server Details |
|
Language you are using the wiki in (set in the browser/UserPreferences) |
|
Workaround
Discussion
The DocBook formatter requires python-xml, even under Python 2.5 (see MoinMoinDependencies and MoinMoin/formatter/text_docbook.py).
Here is an early patch: no-pythonxml.patch, based on Stefano Zacchiroli's patch. Your feedback is welcome. -- FranklinPiat 2009-10-08 21:36:58
Thanks for the patch, but can you please tell what the state of affairs is after your patch? What works, what not? -- ThomasWaldmann 2009-10-08 22:17:00
Those are tracked below. -- FranklinPiat 2009-10-11 15:21:05
Known bugs/problems
XML parsers refuse to parse plain text
Bug: Some macros, like <<Hits>>, produce plain text (i.e. without any markup). This bug is fixed in the patch below.
Dropping 4suite ?
Since Python now has a proper XML implementation, we might want to drop 4suite's Ft.Xml.Domlette and use xml.dom.minidom instead.
However, minidom requires well-formed XML as input, whereas MoinMoin currently produces HTML4, not XHTML: some macros don't close paragraph tags (<p>...</p>). As a workaround, we can sanitize the generated code with BeautifulSoup.
See the *no-4suite*.patch patches above.
The DocBook files generated by both parsers are very similar, see diff
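The well-formedness requirement described above can be illustrated with the standard library alone. This is a minimal sketch, not code from any of the patches: the helper name is_well_formed and the two sample fragments are made up for illustration, and it assumes a Python version that ships xml.dom.minidom.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def is_well_formed(fragment):
    """Return True if minidom can parse the fragment, False otherwise.

    Hypothetical helper for illustration; not part of MoinMoin.
    """
    try:
        minidom.parseString(fragment)
    except ExpatError:
        # minidom delegates to expat, which rejects anything that is
        # not well-formed XML (e.g. unclosed or mismatched tags).
        return False
    return True


# HTML4-style output: the <p> elements are never closed.
html4_style = "<div><p>first paragraph<p>second paragraph</div>"

# XHTML-style output: every element is closed.
xhtml_style = "<div><p>first paragraph</p><p>second paragraph</p></div>"

if __name__ == "__main__":
    print(is_well_formed(html4_style))
    print(is_well_formed(xhtml_style))
```

The first fragment is the kind of input that would need sanitizing (for instance with BeautifulSoup, as suggested above) before minidom can accept it.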
Entities
At some stage I had the impression that <<RandomQuote>> returned html entities that weren't handled properly. This needs to be tested (confirmed).
Patches
There are two series of patches, exploring two alternatives:
1. Using 4suite
This patch attempts to use the current dependencies (python 4suite).
v1.1a -- no-pythonxml_with_4suite_v1.1a.patch
Bug: This patch doesn't work for pages that contain any macro.
2. Using minidom + BeautifulSoup
Python>=2.4 (?) has XML/DOM implementation. So it should be possible to
no-pythonxml_with_minidom+BeautifulSoup_v1.patch
Bug: the macro <<RandomQuote>> sometimes causes an error.
Bug: the parser fails if one plays badly with .
Bug: some <<Include(...)>> calls fail.
3. Using minidom only
The above patch uses BeautifulSoup to clean up the XML code generated by the script itself (!).
No patch at the moment
Plan
- Priority:
- Assigned to:
Status: with a recent version of Python (e.g. 2.5.2) and its built-in XML libraries, it should work. Older Pythons (2.3, 2.4) need PyXML. (DocBook export needs PyXML, especially if you use macros.)
- Franklin : As of 2009-11, I gave up working on this issue (mostly because the DOM tree will be refactored in moin-2.0, so it would require some work again).
The DOM tools fail when they have to merge the DOM generated from the result of a macro into the main DOM of the page.
fails with:
but it works if we set st="<p>foo</p>"
Same for xml.dom.minidom:
which fails with:
but it works if we set st="<p>foo</p>"
As a workaround, it is possible to detect when a simple string is returned and embed it inside <span>...</span>, so that minidom/domlette are happy.
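The <span> workaround could look roughly like the sketch below. It is not taken from any of the patches: the function names wrap_plain_text and parse_macro_output are hypothetical, and the "no '<' character" test is an assumed heuristic for detecting a simple string. It uses xml.dom.minidom only; the same idea should apply to domlette.

```python
from xml.dom import minidom
from xml.sax.saxutils import escape


def wrap_plain_text(output):
    """Wrap macro output in <span>...</span> if it contains no markup.

    Assumption: output without any '<' is treated as a plain string.
    The text is escaped so that '&', '<' and '>' cannot break the parser.
    """
    if "<" not in output:
        return "<span>%s</span>" % escape(output)
    return output


def parse_macro_output(output):
    """Parse macro output into a DOM element, wrapping plain text first."""
    return minidom.parseString(wrap_plain_text(output)).documentElement


if __name__ == "__main__":
    # Plain text ends up as the text child of a <span> element.
    node = parse_macro_output("foo")
    print(node.toxml())

    # Output that already has a root element is parsed unchanged.
    node = parse_macro_output("<p>foo</p>")
    print(node.toxml())
```

Escaping the plain text is a design choice of this sketch: it keeps a stray "&" from raising a parse error, but it would also double-escape output that already contains entities, so the entity handling noted under Known bugs/problems would still need testing.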