Description
Unicode filenames do not work on the Win32 file system.
There have been many reports similar to this problem. I tested it repeatedly and the problem still exists.
The error is raised when you:
- Upload a unicode-named file
- View a page which contains a unicode-named file (the file was copied directly into the attachment directory)
- List attachments which include a unicode-named file (the file was copied directly into the attachment directory)
Steps to reproduce
- Upload a unicode-named file to a page of a wiki that is stored on a Windows file system.
Example
Details
MoinMoin Version | 1.3.5, 1.5.3 |
OS and Version | Windows 2000, Windows XP sp2 |
Python Version | Python 2.4.2, 2.4.3 |
Server Setup | |
Server Details | Apache w/ ModPython; FastCGI; CGI, IIS |
Client | Opera, IE, FireFox in Win32 |
Workaround
Do not use unicode-named attachments.
Discussion
I think that it cannot be fixed with a small patch. I'm not good at Python or MoinMoin, but I found that Win32 uses 'mbcs' as the file system encoding while MoinMoin uses 'utf-8' by default. Thus, when you save a file on Win32 under a filename that was encoded from unicode to utf-8, it raises an IOError or UnicodeDecodeError. (BTW, unicode-named files work fine on other file systems, such as the one this wiki uses.) For example,
>>> import sys
>>> filename = u'\ud55c\uae00'
>>> fn1 = filename.encode('utf-8')
>>> fn1
'\xed\x95\x9c\xea\xb8\x80'
>>> file = open(fn1, 'w')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IOError: [Errno 2] No such file or directory: '\xed\x95\x9c\xea\xb8\x80'
>>> fn2 = filename.encode(sys.getfilesystemencoding())
>>> fn2
'\xc7\xd1\xb1\xdb'
>>> file = open(fn2, 'w')
>>> file.close()
One way to resolve this problem is to quote the filename the same way wiki page names are quoted. However, files copied directly into the attachment directory would then no longer work, which may cause more problems.
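The quoting idea can be sketched roughly as follows. This is only an illustration in current Python 3 syntax (the report itself concerns Python 2.4). The helper names `quote_filename` and `unquote_filename` and the `(hex)` notation are assumptions made for this example, not MoinMoin's actual quoting functions.

```python
import re

# Hypothetical helpers for illustration only; not MoinMoin's real API.
# Every run of bytes outside this ASCII-safe set is replaced by "(hex)".
_UNSAFE = re.compile(rb"[^A-Za-z0-9_.-]+")
_QUOTED = re.compile(rb"\(([0-9a-f]+)\)")


def quote_filename(name):
    """Return an ASCII-only name that any file system encoding can store."""
    raw = name.encode("utf-8")
    quoted = _UNSAFE.sub(
        lambda m: b"(" + m.group(0).hex().encode("ascii") + b")", raw
    )
    return quoted.decode("ascii")


def unquote_filename(quoted):
    """Invert quote_filename and recover the original unicode name."""
    raw = _QUOTED.sub(
        lambda m: bytes.fromhex(m.group(1).decode("ascii")),
        quoted.encode("ascii"),
    )
    return raw.decode("utf-8")


print(quote_filename("\ud55c\uae00.txt"))  # (ed959ceab880).txt
```

Because parentheses are themselves quoted, the mapping stays reversible. The cost is the one noted above: a file dropped into the directory under its plain name is not in quoted form, so it would not be found under the name the wiki expects.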
Alternatively, MoinMoin could be changed to encode filenames with the file system codec (sys.getfilesystemencoding()), but migrating a wiki between file systems with different encodings could then be difficult or impossible.
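A minimal sketch of the file system codec approach and of its migration problem, again in Python 3 syntax. The helpers `encode_for_filesystem` and `can_migrate` are hypothetical names for this example. The codec 'cp949' is assumed here to stand in for 'mbcs' on a Korean Windows system, and 'cp1252' for a Western one.

```python
import sys


def encode_for_filesystem(name, encoding=None):
    """Encode name with the given codec, or the platform's file system codec."""
    codec = encoding or sys.getfilesystemencoding()
    return name.encode(codec)


def can_migrate(name, target_encoding):
    """Report whether name is representable in the target file system codec."""
    try:
        name.encode(target_encoding)
    except UnicodeEncodeError:
        return False
    return True


name = "\ud55c\uae00"
# The same name yields different bytes under different codecs ...
print(encode_for_filesystem(name, "cp949"))  # b'\xc7\xd1\xb1\xdb'
print(encode_for_filesystem(name, "utf-8"))  # b'\xed\x95\x9c\xea\xb8\x80'
# ... and may not be representable at all on another system.
print(can_migrate(name, "cp1252"))  # False
```

This shows why the stored byte names would have to be re-encoded on every move between systems, and why some names could not be carried over at all.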
-- SeungikLee 2006-05-30 14:40:12
- No, this is not necessary.
MoinMoin would need to special case platforms with unicode filesystem semantics. This would be a major change. 1.6 will have another storage method anyway that enforces escaped names, so the solution would be different there. Because of that, I am closing the bug. -- AlexanderSchremmer 2006-05-30 15:08:27
Plan
- Priority:
- Assigned to:
- Status: Will be solved differently in 1.6, too big for 1.5