Description
This page is not a single bug report but a collection of problems related to Xapian indexing.
Consider it working notes made while experimenting with 1.9 rather than a formal bug report.
Indexer Testing method
An easy way to test the indexing code is to index files you already have on your filesystem:
find /some/base/dir -iname "*" >files.lst  # you can also use *.doc or *.pdf to select specific stuff
MoinMoin/script/moin.py --config-dir=/configdir --wiki-url=http://wikiurl/ index build --mode=rebuild --files=files.lst
--files indexing
- non-ASCII file names lead to Unicode-related exceptions
a quick fix is to catch UnicodeError, so that one bad file name doesn't crash the whole indexing run
- maybe the same problem exists for attachments?
- generated "item names" are FS//filename (double slash!?)
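The quick fix for the Unicode exceptions could look roughly like the sketch below. It is not the MoinMoin code: `index_files` and the `index_one` callback are hypothetical names, and it assumes that the per-file indexing step is where the UnicodeError is raised.

```python
import logging


def index_files(paths, index_one):
    """Index each path; skip (and report) those that raise UnicodeError.

    `index_one` is a hypothetical callback standing in for the real
    per-file indexing step.
    """
    skipped = []
    for path in paths:
        try:
            index_one(path)
        except UnicodeError as err:
            # One bad file name should not abort the whole indexing run.
            logging.warning("skipping %r: %s", path, err)
            skipped.append(path)
    return skipped
```

Returning the skipped names makes it easy to see afterwards which files still need a proper fix.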
pdf filter (poppler, Ubuntu 8.04)
- hangs indefinitely on a Japanese PDF
- sometimes terminates with rc=1 although it produced a fair amount of filtered text output
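One possible way to guard against both problems is to run the filter with a timeout and to keep whatever text it produced even when the rc is non-zero. This is only a sketch under assumptions: `run_filter` is a hypothetical helper, the 30 second default is arbitrary, and it does not reflect how the indexer currently invokes filters.

```python
import subprocess


def run_filter(cmd, timeout=30):
    """Run an external filter command and return (rc, text).

    rc is None when the filter was killed after `timeout` seconds.
    Output is returned even when rc is non-zero.
    """
    try:
        proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                              stderr=subprocess.DEVNULL, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child on timeout; keep any partial output.
        return None, (exc.output or b"").decode("utf-8", "replace")
    return proc.returncode, proc.stdout.decode("utf-8", "replace")
```

If the filter is poppler's pdftotext, the call might be `run_filter(["pdftotext", path, "-"])`; whether text from a run that ended with rc=1 should be indexed remains a policy decision.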
rc 127 for external filters
127 seems to be the rc the shell returns when it cannot find the filter command. But what if the shell finds the command and the command itself returns 127? The two cases cannot be told apart from the rc alone.
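One way around the ambiguity would be to start the filter without a shell, so that a missing command shows up as an exception instead of an rc. The sketch below assumes the filter can be given as an argument list; `run_external` is a hypothetical name, not an existing function of the indexer.

```python
import subprocess


def run_external(argv):
    """Return ("missing", None) if the command cannot be found,
    else ("ran", rc) with the command's own return code."""
    try:
        # No shell involved: argv[0] is executed directly.
        proc = subprocess.run(argv, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
    except FileNotFoundError:
        return "missing", None
    return "ran", proc.returncode
```

With this, a filter that really exits with 127 is reported as `("ran", 127)`, while a filter that is not installed is reported as `("missing", None)`.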
Related: PollAboutXapianSearchIndexingFilters