Description
The XML-RPC getPage call produces malformed XML if the page contains characters that are not allowed in XML.
Steps to reproduce
Create TestPage containing only a character that isn't allowed in XML, such as 0x01.
Try to fetch it using getPage.
Result:
Traceback (most recent call last):
  File "getpage.py", line 4, in <module>
    wiki.getPage(u"TestPage")
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xmlrpclib.py", line 1150, in __call__
    return self.__send(self.__name, args)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xmlrpclib.py", line 1440, in __request
    verbose=self.__verbose
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xmlrpclib.py", line 1204, in request
    return self._parse_response(h.getfile(), sock)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xmlrpclib.py", line 1338, in _parse_response
    p.feed(response)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xmlrpclib.py", line 547, in feed
    self._parser.Parse(data, 0)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 5, column 15
Example
MoinMoinBugs/XmlrpcGetPageProducesBrokenXML/TestPage
Component selection
- xmlrpc
Details
MoinMoin Version | 1.6.4 |
OS and Version | any |
Python Version | any |
Server Setup | any |
Server Details | any |
Language you are using the wiki in (set in the browser/UserPreferences) | English |
MoinMoin 1.8.0, at least, behaves the same way.
Workaround
Do not use those characters in wiki pages.
Discussion
This works fine with xmlrpc1 because XmlRpc1._outstr() applies url_quote to the string. XmlRpc2._outstr() does no quoting, so the forbidden character goes into the response unchanged and the resulting XML is not well-formed.
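The difference between the two behaviours can be reproduced without a wiki server. The following is a minimal sketch, not MoinMoin code: it assumes Python 3's standard `xmlrpc.client` instead of the Python 2.5 `xmlrpclib` shown in the traceback, and it uses `urllib.parse.quote` as a stand-in for MoinMoin's `url_quote`. The variable names are made up for illustration.

```python
import urllib.parse
import xmlrpc.client
from xml.parsers.expat import ExpatError

# Page body containing a character that XML forbids (0x01).
page_text = "before\x01after"

# Unquoted, as described for XmlRpc2._outstr(): the raw character
# ends up inside the <string> element of the response.
raw_response = xmlrpc.client.dumps((page_text,), methodresponse=True)
try:
    xmlrpc.client.loads(raw_response)
    raw_ok = True
except ExpatError as err:
    raw_ok = False
    print("unquoted response rejected:", err)

# URL-quoted, standing in for what XmlRpc1._outstr() does:
# 0x01 becomes the ASCII sequence %01, which XML accepts.
quoted_response = xmlrpc.client.dumps(
    (urllib.parse.quote(page_text),), methodresponse=True
)
params, _method = xmlrpc.client.loads(quoted_response)

# The client has to undo the quoting to get the page text back.
round_tripped = urllib.parse.unquote(params[0])
```

The marshaller only escapes `&`, `<` and `>`, so the control character survives into the XML and the parser on the receiving side rejects it. Quoting avoids that, at the cost of requiring every client to unquote the result.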
Can you produce a less "synthetic" use case that triggers this bug (one that might happen with normal/useful page content)?
Well, I don't know of any real use for those characters, but I would assume that getPage() should be able to fetch any valid wiki page. So either getPage() should escape or encode those characters somehow, or moin should refuse to save pages containing them. In my case, a user had saved a page containing 0x1f, and I was running an automation script that converted all pages via xmlrpc.
Note: valid Unicode characters in XML 1.0: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
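The character ranges in the note above translate directly into a regular expression. The following is a minimal sketch in Python 3; the function names are hypothetical and nothing here is part of MoinMoin. It only shows how forbidden characters could be detected (for example, to refuse a save) or stripped (for example, before sending a getPage response). Which of the two policies is appropriate is left open in this discussion.

```python
import re

# Matches any character outside the ranges listed in the note:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text):
    """Return (offset, code point) pairs for characters XML 1.0 forbids."""
    return [(m.start(), ord(m.group())) for m in _INVALID_XML_CHARS.finditer(text)]


def strip_invalid_xml_chars(text, replacement=""):
    """Return text with every forbidden character replaced (default: removed)."""
    return _INVALID_XML_CHARS.sub(replacement, text)


sample = "a\x01b\x1fc"
print(find_invalid_xml_chars(sample))   # offsets and code points of 0x01 and 0x1f
print(strip_invalid_xml_chars(sample))  # the same text without them
```

Stripping changes the page content silently, so a caller that needs a faithful copy of the page would need an encoding step instead.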
Plan
- Priority:
- Assigned to:
- Status: