Differences between revisions 3 and 4

Description

The spellchecker does not catch UnicodeErrors.

Steps to reproduce

Create a dictionary in MM/dict with non-utf8 encoding
Delete dict cache in wiki instance
Run spellcheck
Bombs with a stack trace: UnicodeDecodeError, 'utf8' codec (sorry, I fixed it before I could save the trace)

It should give a better error message, such as "Your file isn't in UTF-8, please see the admin".
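A minimal sketch of that behaviour, written for Python 3 rather than the Python 2 of the time. The names `load_words` and `DictEncodingError` are hypothetical and are not MoinMoin's actual API; the sketch only shows the idea of catching the decode error and re-raising it with a message that names the file.

```python
class DictEncodingError(Exception):
    """Hypothetical error raised when a dictionary file is not valid UTF-8."""


def load_words(path):
    """Read a word list, turning a decode failure into a readable message."""
    with open(path, "rb") as f:
        raw = f.read()
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the byte offset of the first undecodable byte.
        raise DictEncodingError(
            "Dictionary file %r is not in UTF-8 (bad byte at offset %d); "
            "please see the admin." % (path, err.start)
        ) from err
    return text.split()
```

The caller can then show the message on the spellcheck page instead of a stack trace.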

Example

Details

This wiki.

Workaround

Use iconv -t utf-8 words > words-utf8 (add -f with the file's current encoding, e.g. -f iso-8859-1, if it differs from the locale default), then delete the cache.
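Where iconv is not available, roughly the same conversion can be sketched in Python 3. The source encoding is an assumption here (iso-8859-1, the encoding raised in the discussion), and `reencode_to_utf8` is a hypothetical helper name.

```python
def reencode_to_utf8(src_path, dst_path, src_encoding="iso-8859-1"):
    """Rewrite src_path as UTF-8 at dst_path.

    src_encoding is an assumption; pass the file's real encoding.
    """
    with open(src_path, "rb") as src:
        text = src.read().decode(src_encoding)
    with open(dst_path, "wb") as dst:
        dst.write(text.encode("utf-8"))
```

For example, `reencode_to_utf8("words", "words-utf8")`, followed by deleting the dict cache as above.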

Discussion

Is the words file part of the system or part of the distribution? If it's part of the system, and it's always using iso-8859-1, we can accept this encoding. Generally it's easy to accept both utf-8 and iso-8859-1 in the same code, using:

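# Read raw bytes, then try utf-8 first and fall back to iso-8859-1.
words = open(wordsfile, 'rb').read()
try:
    words = unicode(words, 'utf-8')
except UnicodeError:
    # iso-8859-1 maps every byte to a character, so this decode cannot fail.
    words = unicode(words, 'iso-8859-1')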
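The snippet above is Python 2 (`unicode` no longer exists in Python 3). A rough Python 3 equivalent, as a sketch only; `decode_words` is a hypothetical name. UTF-8 has to be tried first, because iso-8859-1 accepts any byte sequence and would silently mis-decode UTF-8 input.

```python
def decode_words(raw):
    """Decode raw bytes as UTF-8, falling back to ISO-8859-1."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # ISO-8859-1 maps each of the 256 byte values to a character,
        # so this fallback never raises.
        return raw.decode("iso-8859-1")
```

One caveat of this approach: a file in some other 8-bit encoding also decodes without error, but with the wrong characters, so the fallback is only safe if the words file really is one of these two encodings.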

Plan

Priority:
Assigned to:
Status:

CategoryMoinMoinBug

MoinMoin: MoinMoinBugs/SpellCheckUnicodeError (last edited 2007-10-29 19:20:23 by localhost)

Revision 3 as of 2005-03-05 11:27:17
  Size: 1212
  Editor: NirSoffer
  Comment: better name
Revision 4 as of 2005-03-05 11:33:18
  Size: 1597
  Editor: NirSoffer
  Comment:
