NonAsciiURL-UnicodeEncodeError

I have reinstalled our server to move from 32- to 64-bit. After the upgrade we have also moved from Apache 2.2 to 2.4.

I have moved all the pages and config from the old to the new server. Moved /etc/apache2/conf/wiki to /etc/apache2/sites-available/wiki.conf, and linked it to sites-enabled, and found out that I had to add the following lines:

Alias /moin_static198/applets/FCKeditor/ "/usr/share/fckeditor/"
Alias /moin_static198/ "/usr/share/moin/htdocs/"

This indicates that MoinMoin has also been upgraded.

For the most part everything went fine, except that page names containing non-ASCII characters, such as the Norwegian æøå, now produce a "500 Internal Server Error" page, for example when opening the URL http://hurricane/MyWiki/LøsningerArkitektur

I find the following in /var/log/apache2/error.log on the server; see the traceback in the Details section further down.

I have tried changing the Apache, PHP and moin configuration to use UTF-8 everywhere, and then to use ISO-8859-1 everywhere, to no avail: same problem.

I have also tried downgrading MoinMoin on the 64-bit server to version 1.9.4-8+deb7u2, which was the version on the 32-bit server. I still have the same problem.

I have searched the net for solutions, but the only suggestions I find involve changing the Python code, and that is way beyond my comfort zone (it would probably make things break even more).

Steps to reproduce

Everything works fine on a 32-bit Debian with Apache 2.2, PHP 5 and MoinMoin 1.9.4-8, without UTF-8 enabled. (It seems that apache2 has LANG set to "C".)

I have not tried to reproduce this elsewhere, but if you want to have a try, I would think the upgrade described above will do it.

Example

Sorry, there is no public URL for this. See the internal URL earlier in this bug report. Attaching a screenshot.

[ATTACH]

Component selection

Details

[Thu Nov 26 22:07:55.994586 2015] [cgi:error] [pid 6739] [client 10.2.6.110:34810] AH01215: 2015-11-26 22:07:55,994 INFO MoinMoin.log:151 using logging configuration read from built-in fallback in MoinMoin.log module
[Thu Nov 26 22:07:55.994673 2015] [cgi:error] [pid 6739] [client 10.2.6.110:34810] AH01215: 2015-11-26 22:07:55,994 INFO MoinMoin.log:157 Running MoinMoin 1.9.8 release code from /usr/lib/python2.7/dist-packages/MoinMoin
[Thu Nov 26 22:07:56.047018 2015] [cgi:error] [pid 6739] [client 10.2.6.110:34810] AH01215: 2015-11-26 22:07:56,046 WARNING MoinMoin.web.flup_frontend:148 No flup-package installed, only basic CGI support is available.
[Thu Nov 26 22:07:56.155835 2015] [cgi:error] [pid 6739] [client 10.2.6.110:34810] AH01215: 2015-11-26 22:07:56,155 ERROR MoinMoin.web.frontend:46 An exception occurred while running CGIFrontEnd
[Thu Nov 26 22:07:56.155885 2015] [cgi:error] [pid 6739] [client 10.2.6.110:34810] AH01215: Traceback (most recent call last):
[Thu Nov 26 22:07:56.155931 2015] [cgi:error] [pid 6739] [client 10.2.6.110:34810] AH01215:   File "/usr/lib/python2.7/dist-packages/MoinMoin/web/frontend.py", line 39, in run
[Thu Nov 26 22:07:56.155960 2015] [cgi:error] [pid 6739] [client 10.2.6.110:34810] AH01215:     self.run_server(application, options)
[Thu Nov 26 22:07:56.156007 2015] [cgi:error] [pid 6739] [client 10.2.6.110:34810] AH01215:   File "/usr/lib/python2.7/dist-packages/MoinMoin/web/flup_frontend.py", line 159, in run_server
[Thu Nov 26 22:07:56.156035 2015] [cgi:error] [pid 6739] [client 10.2.6.110:34810] AH01215:     return WSGIServer(application).run()
[Thu Nov 26 22:07:56.156078 2015] [cgi:error] [pid 6739] [client 10.2.6.110:34810] AH01215:   File "/usr/lib/python2.7/dist-packages/MoinMoin/web/_fallback_cgi.py", line 69, in run
[Thu Nov 26 22:07:56.156111 2015] [cgi:error] [pid 6739] [client 10.2.6.110:34810] AH01215:     result = self.application(environ, start_response)
[Thu Nov 26 22:07:56.156152 2015] [cgi:error] [pid 6739] [client 10.2.6.110:34810] AH01215:   File "/usr/lib/python2.7/dist-packages/werkzeug/wsgi.py", line 567, in __call__
[Thu Nov 26 22:07:56.156189 2015] [cgi:error] [pid 6739] [client 10.2.6.110:34810] AH01215:     cleaned_path = cleaned_path.encode(sys.getfilesystemencoding())
[Thu Nov 26 22:07:56.156238 2015] [cgi:error] [pid 6739] [client 10.2.6.110:34810] AH01215: UnicodeEncodeError: 'ascii' codec can't encode character u'\xf8' in position 2: ordinal not in range(128)
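
The last frame of the traceback encodes the request path with sys.getfilesystemencoding(), and the error message names the 'ascii' codec. The following is a minimal sketch of that failure mode, not werkzeug's actual code: the helper name encode_path is hypothetical, and the sample path is an assumption chosen so that the ø sits at index 2, as in the error message.

```python
# Sketch only: mimics the encode step seen in the traceback.
# encode_path is a hypothetical helper, not part of werkzeug or MoinMoin.
def encode_path(path, fs_encoding):
    """Encode a text path with an explicitly given filesystem encoding."""
    return path.encode(fs_encoding)

# Assumed request path for the page from the report.
path = u"/L\xf8sningerArkitektur"

# With a UTF-8 filesystem encoding the path encodes without trouble.
utf8_bytes = encode_path(path, "utf-8")

# With an ASCII filesystem encoding the same call raises UnicodeEncodeError.
try:
    encode_path(path, "ascii")
    ascii_failed = False
    failing_position = None
except UnicodeEncodeError as exc:
    ascii_failed = True
    failing_position = exc.start  # index of the first unencodable character
```

If this reading is right, the setting that matters is the encoding Python derives from the environment of the CGI process, rather than the charset options in the Apache, PHP or moin configuration.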

Output of locale:

hurricane:~# locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

And list of generated locales:

hurricane:~# cat /etc/locale.gen  |grep -v "^#"
en_US ISO-8859-1
en_US.ISO-8859-15 ISO-8859-15
en_US.UTF-8 UTF-8
nb_NO ISO-8859-1
nb_NO.UTF-8 UTF-8
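
The locale output above comes from an interactive root shell, and the process that Apache starts for the wiki may see different settings (the report itself notes that apache2 seems to have LANG set to "C"). A small diagnostic sketch for comparing the two, assuming it is run once from the shell and once from within the same environment as the wiki script; the function name describe_encoding_env is hypothetical:

```python
import locale
import os
import sys

def describe_encoding_env(environ=None):
    """Collect the settings that influence how Python encodes file paths."""
    environ = os.environ if environ is None else environ
    return {
        "LANG": environ.get("LANG"),
        "LC_ALL": environ.get("LC_ALL"),
        "LC_CTYPE": environ.get("LC_CTYPE"),
        # The value that the failing encode call in the traceback relies on.
        "filesystem_encoding": sys.getfilesystemencoding(),
        # Query only; False means the process locale is not changed.
        "preferred_encoding": locale.getpreferredencoding(False),
    }

if __name__ == "__main__":
    for key, value in sorted(describe_encoding_env().items()):
        print("%s = %r" % (key, value))
```

A filesystem encoding that reports ASCII in the web server environment but UTF-8 in the shell would be consistent with the traceback.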

MoinMoin Version

1.9.8

OS and Version

Debian 8.2

Python Version

2.7.9

Server Setup

Web, nagios, cacti, +++ Not sure what you want

Server Details

16 cores, 24 GB RAM, plain 64-bit Debian stable, fully updated

Language you are using the wiki in (set in the browser/UserPreferences)

nb_NO Norsk Bokmål

Workaround

I have no suggestions other than to rename all pages in the wiki pages root, but we have a lot of pages, and all the links inside the wiki will be a pain to update.

Discussion

Nowadays one usually has utf-8, try that. Editing php configs is pointless. Also, install a utf-8 locale, like en_US.UTF-8.

FrodeJemtland: OK, I see I forgot to add information about the locale settings. Added. When you say "try that" about utf-8, I'm puzzled. Do you mean after installing the utf-8 locale? Otherwise, I already wrote:

"I have tried to change config in both Apache, PHP and moin config, to use UTF-8 everywhere...(snip)... to no avail, same problem."

Any pointers to anywhere else I should change this?

It seems to me that the problem might be in what is handed to the Python script that runs the frontend. Is that where you mean? I'm not familiar with the Python code, so I would be happy to get some pointers.

someoneother: I had the same problem. It seems that the WSGI daemon is clearing the locale environment. I had to explicitly set the locale settings for the WSGIDaemonProcess:

WSGIDaemonProcess moin user=moinuser group=moinuser processes=5 threads=10 maximum-requests=1000 umask=0007 lang='de_DE.UTF-8' locale='de_DE.UTF-8'

FrodeJemtland: That sounds interesting. Where did you set this?

The line above looks like a line from the Apache web server configuration. This depends entirely on how you integrate MoinMoin into the web server (see the Wiki for the various ways). If it is done via mod_wsgi, then there can be a WSGIDaemonProcess line. Maybe you can mention how you have set up MoinMoin with Apache. -- 188.192.149.15 2015-12-04 20:12:45

someoneother: Yep, that is specified in the virtual host section for my MoinMoin wiki in the apache.conf. I use mod_wsgi, therefore I already have this WSGIDaemonProcess specified (since I am running the wiki as a separate user). It seems that WSGI is causing the problem.

FrodeJemtland: OK, thanks for the suggested solutions. I could not make this work, so I just renamed all the pages so that they no longer contain the special Norwegian characters. We only have 3 special characters, so it was not that big a job. If anybody else stumbles onto this page, I did the following.

Go to your data/pages directory. Run the following to get output that you can paste into your console to replace (c385) (which is Å) with AA:

for i in *"(c385)"*; do [ -e "$i" ] || continue; new=$(printf '%s\n' "$i" | sed 's/(c385)/AA/g'); echo "mv \"$i\" \"$new\""; done

Then repeat this for Ø=c398, æ=c3a6, ø=c3b8, å=c3a5. I didn't have any pages with Æ. To find out if you have any other strange pages I executed

ls |grep "(....)"

and then checked that all the remaining pages worked.
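
The directory names under data/pages carry each non-ASCII character as the hex of its UTF-8 bytes in parentheses, which is why (c385) stands for Å. A rough Python sketch of the same rename planning is given below, under several assumptions: the function names are hypothetical, the sample directory name "(c385)rsplan" is invented, and the "oe" replacement for ø is only an illustration (the report states just the Å to AA mapping). Like the shell loop, it only prints a plan and handles a group only when it contains exactly one of the listed characters.

```python
import binascii
import re

# Matches "(hex)" groups in page directory names, e.g. "(c385)".
QUOTED = re.compile(r"\(([0-9a-f]+)\)")

def decode_page_dir(name):
    """Turn 'L(c3b8)sninger' into the Unicode page name it stands for."""
    def unquote(match):
        return binascii.unhexlify(match.group(1)).decode("utf-8")
    return QUOTED.sub(unquote, name)

def plan_renames(names, replacements):
    """Return (old, new) pairs for names containing any listed hex group."""
    plan = []
    for name in names:
        new = name
        for quoted, ascii_text in replacements.items():
            new = new.replace("(%s)" % quoted, ascii_text)
        if new != name:
            plan.append((name, new))
    return plan

# Hypothetical directory listing; only the first name is taken from the report.
names = ["L(c3b8)sningerArkitektur", "(c385)rsplan", "FrontPage"]
# "AA" for Å is from the report; "oe" for ø is an assumed choice.
replacements = {"c385": "AA", "c3b8": "oe"}

plan = plan_renames(names, replacements)
for old, new in plan:
    print('mv "%s" "%s"' % (old, new))
```

decode_page_dir can help when reviewing what is left after the renames, since it shows which characters the remaining (....) groups stand for.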

Now "all" we have to do is go through all the links that point to the old names. I have no good script for that, sorry. I am running through all current pages, searching for PageList and [[ lines that include the special characters that have been changed.
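
The search described here could be sketched in Python roughly as follows. This is an assumption-laden illustration rather than a tested migration tool: the function name find_stale_links is hypothetical, it works on text that has already been read and decoded, and locating the current revision of each page is left out.

```python
# The characters that were renamed in the report: Å, Ø, æ, ø, å.
CHANGED = u"\xc5\xd8\xe6\xf8\xe5"

def find_stale_links(text, changed=CHANGED):
    """Return (line number, line) pairs that look like links to renamed pages."""
    hits = []
    for number, line in enumerate(text.splitlines(), 1):
        looks_like_link = u"[[" in line or u"PageList" in line
        has_changed_char = any(ch in line for ch in changed)
        if looks_like_link and has_changed_char:
            hits.append((number, line))
    return hits

# Invented sample page text for illustration.
sample = (
    u"See [[L\xf8sningerArkitektur]] for details.\n"
    u"Plain line with \xe5 but no link.\n"
    u"<<PageList(regex:^L\xf8s)>>\n"
    u"[[FrontPage]]"
)

stale = find_stale_links(sample)
```

Each hit still has to be checked by hand, since a line may contain a link and a special character without the link itself being affected.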

Plan


CategoryMoinMoinNoBug

MoinMoin: MoinMoinBugs/1.9.8NonAsciiURL-UnicodeEncodeError (last edited 2016-01-06 23:25:59 by ThomasWaldmann)