Description

The xmlrpc getPage method produces broken XML if the page contains characters that are not allowed in XML.

Steps to reproduce

Create TestPage containing only a character that isn't allowed in XML, such as 0x01.
Try to fetch it using getPage:
```
import xmlrpclib

wiki = xmlrpclib.ServerProxy("http://127.0.0.1:8080/?action=xmlrpc2")
wiki.getPage(u"TestPage")
```

Result:

Traceback (most recent call last):
  File "getpage.py", line 4, in <module>
    wiki.getPage(u"TestPage")
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xmlrpclib.py", line 1150, in __call__
    return self.__send(self.__name, args)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xmlrpclib.py", line 1440, in __request
    verbose=self.__verbose
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xmlrpclib.py", line 1204, in request
    return self._parse_response(h.getfile(), sock)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xmlrpclib.py", line 1338, in _parse_response
    p.feed(response)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xmlrpclib.py", line 547, in feed
    self._parser.Parse(data, 0)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 5, column 15
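
The same kind of parse failure can probably be reproduced without a wiki server. The sketch below is not MoinMoin code: it assumes Python 3's `xmlrpc.client` (the successor of `xmlrpclib`) and a hypothetical page body consisting of a single 0x01 character. It marshals that string into an XML-RPC response and then tries to parse it the way a client would.

```python
import xmlrpc.client
from xml.parsers.expat import ExpatError

# Hypothetical page body: a single U+0001 control character.
page_text = "\x01"

# Build an XML-RPC response document containing the raw string.
# The marshaller escapes &, < and >, but leaves control characters alone.
response_xml = xmlrpc.client.dumps((page_text,), methodresponse=True)

try:
    xmlrpc.client.loads(response_xml)
    parse_failed = False
except ExpatError as exc:
    # The XML parser rejects the control character.
    parse_failed = True
    print("client-side parse error:", exc)
```

This suggests the problem is not specific to one client: any conforming XML parser is expected to reject such a response.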

Example

MoinMoinBugs/XmlrpcGetPageProducesBrokenXML/TestPage

Component selection

xmlrpc

Details

MoinMoin Version	1.6.4
OS and Version	any
Python Version	any
Server Setup	any
Server Details	any
Language you are using the wiki in (set in the browser/UserPreferences)	english

MoinMoin 1.8.0, at least, behaves the same way.

Workaround

Do not use those characters in wiki pages.

Discussion

This works fine with xmlrpc1 because XmlRpc1._outstr() applies url_quote to the string. XmlRpc2._outstr() does no quoting, so it ends up producing broken XML.
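
To illustrate why URL quoting avoids the problem, here is a minimal sketch. It uses `urllib.parse.quote` from the Python 3 standard library as a stand-in; MoinMoin's own url_quote may behave differently in detail, and the sample string is made up.

```python
from urllib.parse import quote, unquote

# Made-up page text with an embedded 0x01 control character.
raw = "before\x01after"

# Quoting turns the control character into a plain-ASCII %XX escape,
# which is harmless inside an XML document.
quoted = quote(raw)

# The receiving side has to undo the quoting to get the original text back.
restored = unquote(quoted)
```

The cost of this approach is that the client must know the string is quoted and unquote it, which is presumably why the two interfaces differ.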

Can you produce a less "synthetic" use case that triggers this bug (one that might happen with normal/useful page content)?

Well, I don't know of any real use for those characters, but I would assume that getPage() should be able to fetch any valid wiki page. So either getPage() should escape or encode those characters somehow, or moin should refuse to save pages containing them. In my case, a user had saved a page containing 0x1f, and I was running an automation script that converted all pages via xmlrpc.
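
One possible way to "encode those characters somehow" is the base64 type that XML-RPC already defines. The following is only a sketch of that idea, assuming Python 3's `xmlrpc.client`; it is not how getPage() works, and changing the return type would require clients to adapt.

```python
import xmlrpc.client

# Made-up page text containing 0x1f, the character from the report above.
page_text = "line one\x1fline two"

# Wrap the UTF-8 bytes in a Binary value; it is sent as <base64>.
payload = xmlrpc.client.Binary(page_text.encode("utf-8"))
response_xml = xmlrpc.client.dumps((payload,), methodresponse=True)

# A client can parse the response and decode the bytes again.
params, _method = xmlrpc.client.loads(response_xml)
decoded = params[0].data.decode("utf-8")
```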

Note:
Valid Unicode characters for XML 1.0:   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
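
The character ranges in the note above translate directly into a regular expression. The helper names below are hypothetical and not part of MoinMoin; this is a sketch (Python 3) of how a save-time check or an output filter could detect or drop the offending characters.

```python
import re

# Matches any character OUTSIDE the valid ranges listed in the note:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\t\n\r\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text):
    """Return (index, code point) pairs for characters not allowed in XML."""
    return [(m.start(), ord(m.group())) for m in _INVALID_XML_CHARS.finditer(text)]


def strip_invalid_xml_chars(text):
    """Return text with all characters not allowed in XML removed."""
    return _INVALID_XML_CHARS.sub("", text)
```

A check like `find_invalid_xml_chars` fits the "refuse to save" option, while `strip_invalid_xml_chars` fits a filter on the output side, at the price of silently altering page content.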

Plan

Priority:
Assigned to:
Status:

CategoryMoinMoinBugConfirmed

MoinMoin: MoinMoinBugs/XmlrpcGetPageProducesBrokenXML (last edited 2009-12-08 23:12:29 by ThomasWaldmann)