Description
MoinMoin does not forbid invalid page names when they're directly entered by the user.
Take this wiki for example. If I literally enter http://moinmoin.wikiwikiweb.de:8000/jüdelö in my browser's URL bar (Firefox, set to use ISO-8859-15 as its default encoding), Firefox of course requests the URL http://moinmoin.wikiwikiweb.de:8000/j%FCdel%F6.
Obviously, that cannot work on this wiki since it isn't a valid UTF-8 page name. However, the wiki doesn't complain; it just tells me that a page named 'j?' (where the ? is a character I can't see properly) cannot be found. Try it for yourself.
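To make the problem concrete, here is a small Python sketch (not MoinMoin code; the helper name `is_valid_utf8` is made up for illustration) showing that the bytes Firefox sends for this URL are not valid UTF-8, although they decode fine as ISO-8859-15:

```python
from urllib.parse import unquote_to_bytes

# The path component Firefox sends when it encodes the URL as ISO-8859-15.
raw = unquote_to_bytes("j%FCdel%F6")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(raw)                        # the raw bytes after %xx unquoting
print(is_valid_utf8(raw))         # False: 0xFC cannot start a UTF-8 sequence
print(raw.decode("iso-8859-15"))  # the name the user actually typed
```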
Details
This Wiki.
Workaround
- Set your browser to use UTF-8 in URLs
- Create a link on the page, then click the link: jüdelö
- Use the NewPage macro
- Use the GoTo box in FindPage
Discussion
Obviously, that behaviour is wrong. When I initially created the new-quoting branch I wanted it to do the following:
- try to decode the page name as UTF-8
- try to decode the page name using each charset in the Accept-Charset header the browser sent
- if all are impossible, fail and give an error message with the opportunity to create a new page by entering the new name on a small form (then the browser has to send the name in UTF-8)
Obviously, the behavior you suggest is wrong. There is no connection between Accept-Charset and the charset the browser uses for URL encoding. Browsers send URLs either in UTF-8 or in some other, unknown encoding, which can be ISO-8859-1 or anything else. Bytes in one 8-bit encoding can be decoded as ISO-8859-1, or as most other 8-bit encodings, without any error, so there is no point in trying 8-bit encodings: they will all appear to work but will produce junk.
So the behavior you suggest can only be:
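The point about 8-bit encodings can be checked with a short Python sketch. The three candidate charsets are picked arbitrarily for illustration; each decodes the same bytes without raising, so success tells you nothing about which one the browser meant:

```python
# Bytes as sent by a browser that encoded "jüdelö" in ISO-8859-15.
raw = b"j\xfcdel\xf6"

# Arbitrary 8-bit charsets a server might be tempted to try in turn.
candidates = ["iso-8859-1", "iso-8859-5", "iso-8859-15"]

# None of these raises UnicodeDecodeError; each yields *some* string.
decoded = {name: raw.decode(name) for name in candidates}

for name, text in decoded.items():
    print(name, repr(text))
```

Whichever charset is tried first wins, even when it produces a name the user never typed.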
- Try utf-8
- Show error message with a form for entering new name
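A minimal sketch of this two-step behavior in Python. This is not what moin does; `decode_or_none` is a hypothetical name, and returning `None` stands in for "show the error message with a form":

```python
from typing import Optional
from urllib.parse import unquote_to_bytes


def decode_or_none(quoted_name: str) -> Optional[str]:
    """Try strict UTF-8; return None when the caller should show the error form."""
    try:
        return unquote_to_bytes(quoted_name).decode("utf-8")
    except UnicodeDecodeError:
        return None


print(decode_or_none("j%C3%BCdel%C3%B6"))  # UTF-8 browser: name is recovered
print(decode_or_none("j%FCdel%F6"))        # ISO-8859-15 browser: None
```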
There is no standard for URL encodings, and browsers behave differently:
- IE on XP always uses UTF-8; on Windows 2000 you can also set it to use UTF-8.
- Mozilla does not use UTF-8 by default, but can be set to use it: open about:config, filter with utf8, then double-click network.standard-url.encode-utf8. There is an open bug in Mozilla about that default.
- Safari always uses UTF-8.
What moin does is split the URL into page names, then try to decode each one from UTF-8; if that's not possible, it decodes the page name with replace, so you get REPLACEMENT CHARACTER (U+FFFD). Finally, the URL is joined again. For details, check request.decodePagename.
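The split, decode and join steps described here can be sketched roughly as follows. This is an illustration written from the description above, not the actual code of request.decodePagename; the function name and the choice of "/" as the separator are assumptions:

```python
def decode_pagename_sketch(path: bytes) -> str:
    """Decode each '/'-separated part as UTF-8, falling back to 'replace'."""
    parts = []
    for part in path.split(b"/"):
        try:
            parts.append(part.decode("utf-8"))
        except UnicodeDecodeError:
            # Undecodable bytes become U+FFFD REPLACEMENT CHARACTER.
            parts.append(part.decode("utf-8", "replace"))
    return "/".join(parts)


name = decode_pagename_sketch(b"Parent/j\xfcdel\xf6")
print(repr(name))
```

Each part is handled independently, so one badly encoded component does not affect the others.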
The current solution lets you create any valid name: a name containing REPLACEMENT CHARACTER (U+FFFD) is valid and is encoded later using %xx. If you have problems creating names from the URL bar, either fix your browser configuration, create a link on the page and click it to create the page, use the NewPage macro, or use the GoTo box.
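For illustration, this is how a name containing U+FFFD looks once it is %xx-encoded as UTF-8. The sketch uses Python's standard urllib rather than moin's own quoting code, so the exact output moin produces may differ:

```python
from urllib.parse import quote, unquote

# A page name in which two undecodable bytes were replaced by U+FFFD.
name = "j\ufffddel\ufffd"

quoted = quote(name)  # quote() encodes non-ASCII characters as UTF-8 by default
print(quoted)

# The quoted form round-trips, so the page stays reachable by its URL.
print(unquote(quoted) == name)
```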