Description
The SyncPages action fails on pages whose names contain non-ASCII characters. (Unicode page content transfers normally.) This behaviour occurs in Moin 1.9.3 with Python 2.7.
Steps to reproduce
Create a page with a name containing non-ASCII characters (e.g., СудалгааныТанилцуулга)
Create a sync job page with a pageMatch or pageList field matching that page (e.g., pageList:: СудалгааныТанилцуулга)
Attempt to run the sync (using action=SyncPages)
Example
See traceback output below.
Component selection
- MoinMoin/action/SyncPages.py
- MoinMoin/wikisync.py
- xmlrpc
Details
2012-01-11 12:11:44,458 INFO MoinMoin.web.serving:41 127.0.0.1 "GET /SyncTest?action=SyncPages HTTP/1.1" 200 -
2012-01-11 12:11:56,230 ERROR MoinMoin.wsgiapp:293 An exception has occurred [http://localhost:8080/SyncTest?action=SyncPages].
Traceback (most recent call last):
  File "/home/edt/moinmoin/MoinMoin/wsgiapp.py", line 282, in __call__
    response = run(context)
  File "/home/edt/moinmoin/MoinMoin/wsgiapp.py", line 88, in run
    response = dispatch(request, context, action_name)
  File "/home/edt/moinmoin/MoinMoin/wsgiapp.py", line 136, in dispatch
    response = handle_action(context, pagename, action_name)
  File "/home/edt/moinmoin/MoinMoin/wsgiapp.py", line 195, in handle_action
    handler(context.page.page_name, context)
  File "/home/edt/moinmoin/MoinMoin/action/SyncPages.py", line 511, in execute
    ActionClass(pagename, request).render()
  File "/home/edt/moinmoin/MoinMoin/action/SyncPages.py", line 220, in render
    self.sync(params, local, remote)
  File "/home/edt/moinmoin/MoinMoin/action/SyncPages.py", line 279, in sync
    r_pages = remote.get_pages(exclude_non_writable=direction != DOWN)
  File "/home/edt/moinmoin/MoinMoin/wikisync.py", line 286, in get_pages
    tokres, pages = m()
  File "/usr/lib/python2.7/xmlrpclib.py", line 997, in __call__
    return MultiCallIterator(self.__server.system.multicall(marshalled_list))
  File "/usr/lib/python2.7/xmlrpclib.py", line 1224, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1575, in __request
    verbose=self.__verbose
  File "/usr/lib/python2.7/xmlrpclib.py", line 1264, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1292, in single_request
    self.send_content(h, request_body)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1439, in send_content
    connection.endheaders(request_body)
  File "/usr/lib/python2.7/httplib.py", line 951, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 809, in _send_output
    msg += message_body
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 550: ordinal not in range(128)
MoinMoin Version | 1.9.3 |
OS and Version | Ubuntu 11.10 |
Python Version | 2.7 |
Server Setup | wikiserver.py |
Server Details | |
Language you are using the wiki in (set in the browser/UserPreferences) | English |
Workaround
Use ASCII-only page names.
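To apply the workaround, it can help to know which page names would trigger the failure. The following is a minimal sketch, not part of MoinMoin: the helper name non_ascii_pagenames is hypothetical, and it assumes the page names are available as unicode strings.

```python
def non_ascii_pagenames(pagenames):
    """Return the names that cannot be represented in pure ASCII.

    Hypothetical helper for illustration; assumes unicode page names.
    """
    offending = []
    for name in pagenames:
        try:
            name.encode("ascii")
        except UnicodeError:
            # At least one character is outside the ASCII range.
            offending.append(name)
    return offending


# u"\u0421\u0443\u0434" is the beginning of the example page name above.
candidates = [u"FrontPage", u"SyncTest", u"\u0421\u0443\u0434"]
problem_pages = non_ascii_pagenames(candidates)
```

Pages returned by such a check would have to be renamed, or left out of pageList and pageMatch, until the fix is applied.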
Discussion
- xmlrpclib getPage can download a page with a non-ASCII filename
It works with Python 2.6.
If you debug this with PyDev, please read "PyDev, Python and system default Unicode encoding problem".
Something like this could solve it:
diff -r 56eaf32027f4 wikiserver.py
--- a/wikiserver.py Tue Feb 07 21:48:50 2012 +0100
+++ b/wikiserver.py Fri Feb 17 22:30:22 2012 +0100
@@ -7,6 +7,8 @@
 """
 import sys, os
+reload(sys)
+sys.setdefaultencoding("utf-8")
 # a) Configuration of Python's code search path
 # If you already have set up the PYTHONPATH environment variable for the
http://stackoverflow.com/questions/3828723/why-we-need-sys-setdefaultencodingutf-8-in-py-scipt
or in wikiconfig.py:
diff -r 56eaf32027f4 wikiconfig.py
--- a/wikiconfig.py Tue Feb 07 21:48:50 2012 +0100
+++ b/wikiconfig.py Fri Feb 17 22:40:50 2012 +0100
@@ -8,7 +8,7 @@
 import sys, os
-from MoinMoin.config import multiconfig, url_prefix_static
+from MoinMoin.config import multiconfig, url_prefix_static, charset
 class LocalConfig(multiconfig.DefaultConfig):
@@ -43,7 +43,8 @@
     # Add your configuration items here.
     secrets = 'This string is NOT a secret, please make up your own, long, random secret string!'
-
+    reload(sys)
+    sys.setdefaultencoding(charset)
     # DEVELOPERS! Do not add your configuration items there,
     # you could accidentally commit them! Instead, create a
     # wikiconfig_local.py file containing this:
Changing the default encoding is evil. We should rather fix wrong data types (str vs. unicode) so it does not need to do implicit encoding/decoding. -- ThomasWaldmann 2012-02-18 15:18:27
diff -r 1ddf7d88c53d MoinMoin/wikisync.py
--- a/MoinMoin/wikisync.py Thu Mar 01 00:15:41 2012 +0100
+++ b/MoinMoin/wikisync.py Sun Mar 04 21:00:10 2012 +0100
@@ -166,7 +166,7 @@
         _ = self.request.getText
         wikitag, wikiurl, wikitail, wikitag_bad = wikiutil.resolve_interwiki(self.request, interwikiname, '')
-        self.wiki_url = wikiutil.mapURL(self.request, wikiurl)
+        self.wiki_url = wikiutil.mapURL(self.request, wikiurl.encode("utf-8"))
         self.valid = not wikitag_bad
         self.xmlrpc_url = self.wiki_url + "?action=xmlrpc2"
         if not self.valid:
We can encode the URL too.
In Python 2.7, httplib does:
    if isinstance(message_body, str):
        msg += message_body
        message_body = None
    self.send(msg)
It assumes that if message_body is an instance of str, msg is of the same type. That is not the case in our current code: without the change the URL is unicode, so the header block built from it (msg) is unicode as well, and appending the UTF-8 encoded body forces an implicit ASCII decode that fails on the first non-ASCII byte.
In SyncPages we often hardcode utf-8 for decoding instead of using config.charset. In many other places we also encode URL parts.
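The httplib behaviour described above can be reproduced in isolation. The sketch below is an illustration only: py2_style_concat is a hypothetical function that spells out the implicit ASCII decode Python 2 performs when a unicode string and a byte string are added, so the same code also runs on Python 3. The header and body values are made up.

```python
def py2_style_concat(msg, message_body):
    """Mimic Python 2's `msg += message_body` for mixed string types.

    Hypothetical helper: Python 2 silently decodes the byte string with
    the default (ASCII) codec when the other operand is unicode.
    """
    if isinstance(msg, bytes) == isinstance(message_body, bytes):
        # Same type on both sides: plain concatenation, no coercion.
        return msg + message_body
    if isinstance(msg, bytes):
        msg = msg.decode("ascii")
    else:
        message_body = message_body.decode("ascii")
    return msg + message_body


# UTF-8 encoded XML-RPC body containing a Cyrillic page name (first byte 0xd0).
body = u"<string>\u0421\u0443\u0434</string>".encode("utf-8")

# Byte-string headers (URL was a byte string): concatenation succeeds.
ok = py2_style_concat(b"POST /?action=xmlrpc2 HTTP/1.1\r\n\r\n", body)

# Unicode headers (URL was unicode): the body gets decoded as ASCII and fails.
try:
    py2_style_concat(u"POST /?action=xmlrpc2 HTTP/1.1\r\n\r\n", body)
    failure = None
except UnicodeDecodeError as err:
    failure = err
```

Under these assumptions, only the unicode header case raises UnicodeDecodeError, which matches the last frame of the traceback.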
Please paste your full wikisync config page. -- AlexanderSchremmer
Proposed solution (please try it; I have not tested it):
diff -r 1ddf7d88c53d MoinMoin/wikisync.py
--- a/MoinMoin/wikisync.py Thu Mar 01 00:15:41 2012 +0100
+++ b/MoinMoin/wikisync.py Sun Mar 04 21:00:10 2012 +0100
@@ -166,7 +166,7 @@
         _ = self.request.getText
         wikitag, wikiurl, wikitail, wikitag_bad = wikiutil.resolve_interwiki(self.request, interwikiname, '')
         self.wiki_url = wikiutil.mapURL(self.request, wikiurl)
         self.valid = not wikitag_bad
-        self.xmlrpc_url = self.wiki_url + "?action=xmlrpc2"
+        self.xmlrpc_url = str(self.wiki_url + "?action=xmlrpc2") # url MUST be str. unicode would lead to msg_body decoding issues in py 2.7 httplib.
         if not self.valid:
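The type conversion in the proposed patch can be sketched as a standalone function. This is an illustration under assumptions, not the committed fix: as_byte_url is a hypothetical name, and the explicit ASCII encode stands in for what str() does to a unicode object on Python 2. It only matters for Python 2.7, where the URL ends up in the byte-oriented header buffer of httplib.

```python
def as_byte_url(url):
    """Return url as a byte string.

    Hypothetical helper. Unicode input is encoded as ASCII, which is what
    str(u"...") does on Python 2; a URL containing non-ASCII characters
    raises UnicodeEncodeError and would need percent-encoding first.
    """
    if isinstance(url, bytes):
        return url
    return url.encode("ascii")


# Example value taken from the log above; the wiki URL is unicode here.
wiki_url = u"http://localhost:8080/"
xmlrpc_url = as_byte_url(wiki_url + u"?action=xmlrpc2")
```

Converting once, where the URL is built, keeps the rest of the request path free of mixed string types, which is in line with the comment above about fixing data types rather than changing the default encoding.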
Plan
- Priority:
- Assigned to:
Status: fixed with http://hg.moinmo.in/moin/1.9/rev/4541d744d740