Description

The xmlrpc getPage call produces broken XML if the page contains characters that are not allowed in XML.

Steps to reproduce

  1. Create TestPage containing only a character that isn't allowed in XML, such as 0x01

  2. Try to fetch it using getPage

       import xmlrpclib

       wiki = xmlrpclib.ServerProxy("http://127.0.0.1:8080/?action=xmlrpc2")
       wiki.getPage(u"TestPage")
    

Result:

Traceback (most recent call last):
  File "getpage.py", line 4, in <module>
    wiki.getPage(u"TestPage")
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xmlrpclib.py", line 1150, in __call__
    return self.__send(self.__name, args)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xmlrpclib.py", line 1440, in __request
    verbose=self.__verbose
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xmlrpclib.py", line 1204, in request
    return self._parse_response(h.getfile(), sock)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xmlrpclib.py", line 1338, in _parse_response
    p.feed(response)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xmlrpclib.py", line 547, in feed
    self._parser.Parse(data, 0)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 5, column 15
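
The same parser error can be reproduced without a wiki server. The following is a minimal sketch, assuming Python 3's xmlrpc.client (the report itself uses Python 2's xmlrpclib); it is independent of MoinMoin and only shows that a raw 0x01 inside an XML-RPC response makes expat reject the document. The helper name try_parse is made up for this illustration.

```python
# Python 3 sketch; the report itself uses Python 2's xmlrpclib.
import xmlrpc.client
from xml.parsers.expat import ExpatError

# Marshal a response whose only value is a string containing 0x01.
# The standard marshaller escapes &, < and > but passes control
# characters through unchanged.
payload = xmlrpc.client.dumps(("\x01",), methodresponse=True)


def try_parse(xml_text):
    """Return the parser's error message, or None if parsing succeeds."""
    try:
        xmlrpc.client.loads(xml_text)
    except ExpatError as exc:
        return str(exc)
    return None


error = try_parse(payload)
print(error)
```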

Example

MoinMoinBugs/XmlrpcGetPageProducesBrokenXML/TestPage

Component selection

Details

MoinMoin Version

1.6.4

OS and Version

any

Python Version

any

Server Setup

any

Server Details

any

Language you are using the wiki in (set in the browser/UserPreferences)

English

At least 1.8.0 also behaves the same way.

Workaround

Do not use those characters in wiki pages.

Discussion

This works fine with xmlrpc1 because XmlRpc1._outstr() applies url_quote to the string. XmlRpc2._outstr() does no quoting, so it ends up producing broken XML.
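
To illustrate why URL quoting sidesteps the problem, here is a small sketch that uses Python 3's urllib.parse.quote as a stand-in for MoinMoin's url_quote (an assumption; the real helper may differ in its safe characters and encoding). The function name outstr_quoted is hypothetical and is not the MoinMoin implementation.

```python
from urllib.parse import quote, unquote


def outstr_quoted(text):
    """Percent-encode text so that only ASCII-safe characters remain.

    Hypothetical stand-in for what XmlRpc1._outstr() is described as doing.
    """
    return quote(text, safe="")


original = "Test\x01Page\x1f"
quoted = outstr_quoted(original)

# Control characters become %XX sequences, which are plain ASCII and
# therefore always legal inside an XML document.
print(quoted)

# The client reverses the quoting to get the page text back.
assert unquote(quoted) == original
```

The cost of this approach is that every client has to unquote the result, which is presumably why the xmlrpc2 variant returns the text unmodified.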

(!) Can you produce a less "synthetic" use case that triggers this bug (one that might happen with normal/useful page content)?

Well, I don't know of any real use for those characters, but I would assume that getPage() should be able to fetch any valid wiki page. So either getPage() should escape/encode those characters somehow, or moin should refuse to save pages containing them. In my use case, a user had saved a page containing 0x1f, and I was running an automation script that did some conversion on all pages using xmlrpc.
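
One way the "escape/encode somehow" option could look is the XML-RPC base64 type, which Python exposes as Binary. This is only a sketch under that assumption (Python 3's xmlrpc.client, hypothetical helper names encode_page and decode_page); it is not what MoinMoin does, and it would change the return type that existing getPage clients expect.

```python
import xmlrpc.client


def encode_page(text):
    """Wrap page text as base64 so any character survives transport."""
    return xmlrpc.client.Binary(text.encode("utf-8"))


def decode_page(value):
    """Recover the page text from a Binary value on the client side."""
    return value.data.decode("utf-8")


page = "before\x1fafter"

# Server side: marshal the wrapped text into a response document.
wire = xmlrpc.client.dumps((encode_page(page),), methodresponse=True)

# Client side: the response parses cleanly and decodes to the original.
(result,), _ = xmlrpc.client.loads(wire)
assert decode_page(result) == page
```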

Note:
Valid Unicode characters for XML: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
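
The ranges in the note translate directly into a character class. Below is a minimal sketch, assuming Python 3 (where a str is a sequence of code points); the function names are made up for this example. Such a check could be used either to reject a page on save or to filter the text before it is returned, which are the two options raised in the discussion.

```python
import re

# Negated character class built from the ranges listed in the note:
# anything that matches is NOT a valid XML 1.0 character.
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text):
    """Return (index, code point) pairs for characters XML does not allow."""
    return [(m.start(), ord(m.group())) for m in _INVALID_XML_CHARS.finditer(text)]


def strip_invalid_xml_chars(text):
    """Return text with all characters that XML does not allow removed."""
    return _INVALID_XML_CHARS.sub("", text)


sample = "a\x01b\x1fc\tend\n"
print(find_invalid_xml_chars(sample))
print(repr(strip_invalid_xml_chars(sample)))
```

Stripping silently changes page content, so a save-time check that reports the offending positions may be the safer of the two uses.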

Plan


CategoryMoinMoinBugConfirmed

MoinMoin: MoinMoinBugs/XmlrpcGetPageProducesBrokenXML (last edited 2009-12-08 23:12:29 by ThomasWaldmann)