Description
Xapian does not return well-ordered search results. The way the Xapian search uses the "weight" or "page_list" sort options to order the result list does not seem very user-friendly to me.
Even the MoinMoin built-in search function returns a better-sorted result list.
Note: the links are likely invalid - check these bugs on a xapian enabled wiki (see SystemInfo).
Steps to reproduce
Try to search with Xapian and compare the results with the standard moin engine.
Titlesearch
Searching for HelpOnInstalling puts the page "HelpOnInstalling" at around position 17.
Result Xapian:
1. HelpOnInstalling/WikiInstanceCreation (createinstance.sh) 9.5k - Rev: 55 (current) last modified: 2007-09-22 07:02:14
2. HelpOnInstalling/WikiInstanceCreation (createinstance.sh) 9.5k - Rev: 55 (current) last modified: 2007-09-22 07:02:14
3. HelpOnInstalling/TroubleShooting (is_python_here.sh) 5.4k - Rev: 18 (current) last modified: 2007-09-16 20:23:31
4. HelpOnInstalling/TroubleShooting (is_python_here.sh) 5.4k - Rev: 18 (current) last modified: 2007-09-16 20:23:31
5. HelpOnInstalling/BasicInstallation (pythontest.cgi) 10.1k - Rev: 54 (current) last modified: 2007-12-23 13:49:14
6. HelpOnInstalling/BasicInstallation (pythontest.cgi) 10.1k - Rev: 54 (current) last modified: 2007-12-23 13:49:14
7. HelpOnInstalling/ApacheOnLinuxFtp (explore.py) 13.3k - Rev: 33 (current) last modified: 2007-09-16 20:23:16
8. HelpOnInstalling/ApacheOnLinuxFtp (explore.py) 13.3k - Rev: 33 (current) last modified: 2007-09-16 20:23:16
9. HelpOnInstalling/AolServer 1.1k - Rev: 5 (current) last modified: 2007-09-16 20:21:58
10. HelpOnInstalling/AolServer 1.1k - Rev: 5 (current) last modified: 2007-09-16 20:21:58
The wanted page HelpOnInstalling is at around position 17, but it should be one of the first results because it is a 100% title hit.
The Moin Standard Search gives a better result list.
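To illustrate the ordering asked for here, the following is a minimal sketch in plain Python, not MoinMoin or Xapian code. The function name `rank_title_results` and the three-tier scheme (exact title match, then titles starting with the search term, then everything else) are assumptions made for this example only.

```python
def rank_title_results(page_names, needle):
    """Order title hits: exact match first, then prefix matches, then the rest.

    Hypothetical helper for illustration; not part of MoinMoin.
    """
    needle_lower = needle.lower()

    def sort_key(name):
        lowered = name.lower()
        if lowered == needle_lower:
            tier = 0  # 100% title hit
        elif lowered.startswith(needle_lower):
            tier = 1  # e.g. subpages of the wanted page
        else:
            tier = 2  # needle appears elsewhere in the title, or not at all
        # Within a tier, fall back to alphabetical order.
        return (tier, lowered)

    return sorted(page_names, key=sort_key)


results = [
    "HelpOnInstalling/WikiInstanceCreation",
    "HelpOnInstalling/TroubleShooting",
    "HelpOnInstalling",
    "HelpOnInstalling/AolServer",
]
ranked = rank_title_results(results, "HelpOnInstalling")
```

With this ordering the page HelpOnInstalling itself ends up at the top of the list, followed by its subpages.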
Fulltextsearch
A FullTextSearch should also use some weights and count hits in titles, categories, or similar places more heavily than a hit somewhere in the text.
Result Xapian:
HelpOnInstalling/WebLogic . . . 2 hits
  ...stall win32con module if you don't have that yet 2. Follow the steps outlined in HelpOnInstalling/BasicInstallation 3. Enable CGI servlet in Weblogic 4. Add wiki webapp for shar...
  2.2k - Rev: 10 (current) last modified: 2007-09-16 20:19:47
HelpOnInstalling/WebLogic . . . 2 hits
  ...stall win32con module if you don't have that yet 2. Follow the steps outlined in HelpOnInstalling/BasicInstallation 3. Enable CGI servlet in Weblogic 4. Add wiki webapp for shar...
  2.2k - Rev: 10 (current) last modified: 2007-09-16 20:19:47
HelpOnInstalling/BasicInstallation (pythontest.cgi) . . . 1 hit
  10.1k - Rev: 54 (current) last modified: 2007-12-23 13:49:14
HelpOnInstalling/WikiInstanceCreation (createinstance.sh) . . . 1 hit
  9.5k - Rev: 55 (current) last modified: 2007-09-22 07:02:14
HelpOnInstalling/AolServer . . . 1 hit
  == Configure Aolservers nscgi module for MoinMoin == Put the following lines in the [[http://www.aolserver.com/|Aolserver]] configuration file, inside the `nscgi` module configura
  1.1k - Rev: 5 (current) last modified: 2007-09-16 20:21:58
HelpOnInstalling/ApacheOnLinuxFtp (explore.py) . . . 1 hit
  13.3k - Rev: 33 (current) last modified: 2007-09-16 20:23:16
HelpOnInstalling/TroubleShooting (is_python_here.sh) . . . 1 hit
  5.4k - Rev: 18 (current) last modified: 2007-09-16 20:23:31
HelpOnInstalling/BasicInstallation (pythontest.cgi) . . . 1 hit
  10.1k - Rev: 54 (current) last modified: 2007-12-23 13:49:14
HelpOnInstalling/WikiInstanceCreation (createinstance.sh) . . . 1 hit
  9.5k - Rev: 55 (current) last modified: 2007-09-22 07:02:14
HelpOnInstalling/AolServer . . . 1 hit
  == Configure Aolservers nscgi module for MoinMoin == Put the following lines in the [[http://www.aolserver.com/|Aolserver]] configuration file, inside the `nscgi` module configura
  1.1k - Rev: 5 (current) last modified: 2007-09-16 20:21:58
- If you look at the second page of results, you will also see some items with 3 hits or more...
The MoinMoin Standard Search gives at least a list sorted by hits...
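The kind of ranking suggested above (sort by hit count, and let a hit in the title count for more than a hit in the text) could be sketched roughly as follows. This is plain Python for illustration only; the function `rank_fulltext_results`, the `title_weight` value of 10, and the page "SomeOtherPage" are all hypothetical and do not come from MoinMoin or Xapian.

```python
def rank_fulltext_results(results, needle, title_weight=10):
    """Sort (page_name, text_hits) pairs by a combined score, best first.

    Assumed scoring: score = text_hits, plus title_weight if the needle
    also appears in the page name. The weight value is arbitrary.
    """
    needle_lower = needle.lower()

    def score(item):
        name, text_hits = item
        bonus = title_weight if needle_lower in name.lower() else 0
        return text_hits + bonus

    # Highest score first; ties are broken alphabetically by page name.
    return sorted(results, key=lambda item: (-score(item), item[0]))


results = [
    ("HelpOnInstalling/AolServer", 1),
    ("SomeOtherPage", 3),  # hypothetical page with text hits only
    ("HelpOnInstalling/WebLogic", 2),
]
ranked = rank_fulltext_results(results, "HelpOnInstalling")
ranked_names = [name for name, _ in ranked]
```

Here both HelpOnInstalling subpages rank above the page that has more text hits but no title hit, and among them the one with more text hits comes first.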
Component selection
MoinMoin Xapian Search
Details
MoinMoin Version | 1.6.x
OS and Version | Linux
Python Version | 2.5
Server Setup | -
Server Details | -
Language you are using the wiki in (set in the browser/UserPreferences) | -
Workaround
- Don't use Xapian, or live with it (like I do).
Discussion
Maybe MoinMoin Xapian search uses the default weights and there needs to be some "tuning"...
The default parameter values Xapian uses are k1 = 1, k2 = 0, k3 = 1, and b = 0.5. These are almost certainly not optimal and the best values will vary with the documents being searched, and the type of queries so you can easily tune by using different values for these parameters.
http://www.xapian.org/docs/bm25.html
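To give a feeling for what such tuning changes, here is a small sketch of the term-frequency part of the textbook Okapi BM25 formula, using only k1 and b. It is an assumption that this simplified form is close enough to illustrate the point: it is not Xapian's exact implementation, it leaves out the k2 and k3 terms and the inverse document frequency factor, and the function name `bm25_tf_component` is made up for this example.

```python
def bm25_tf_component(term_freq, doc_len, avg_doc_len, k1=1.0, b=0.5):
    """Textbook BM25 term-frequency factor (simplified, not Xapian's code).

    k1 controls how quickly repeated occurrences of a term saturate;
    b controls how strongly long documents are penalised (0 = not at all).
    """
    normalised_len = doc_len / float(avg_doc_len)
    return term_freq * (k1 + 1) / (term_freq + k1 * (1 - b + b * normalised_len))


# Same number of term occurrences in a short page and in a long page.
short_page = bm25_tf_component(term_freq=2, doc_len=50, avg_doc_len=100)
long_page = bm25_tf_component(term_freq=2, doc_len=400, avg_doc_len=100)

# With b = 0 the document length no longer influences the weight.
short_no_norm = bm25_tf_component(2, 50, 100, b=0.0)
long_no_norm = bm25_tf_component(2, 400, 100, b=0.0)
```

With b = 0.5 the short page gets a higher weight than the long one for the same number of occurrences; with b = 0 both get the same weight. Which values suit a wiki would have to be found by experiment, as the quoted documentation says.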
I generally think that the search results should be improved. I filed a FeatureRequests entry with some ideas, though I don't know whether they are even compatible with the index/Xapian structure. Just take a look at: http://moinmo.in/FeatureRequests/XapianSearchWithBetterSearchResultList
Looking into the xapian.py module, I see some sort options in def __search, namely "page_title" and "weight". So I changed the FullSearch macro to use "weight" instead of "page_title". Line 106:
results = search.searchPages(request, needle, sort='weight')
I ran a FullSearch with title only and a real FullSearch. The conclusion is that even with sort page_list there is no sorting by title names. You may want to take a look at My TestPage. -- MarcelHäfner 2008-05-21 20:55:46
Plan
- Priority:
- Assigned to:
- Status: keeping the bug open as reminder for further improvements
should be better now after http://hg.moinmo.in/moin/1.7/rev/810eb9c0f79a
another small improvement: http://hg.moinmo.in/moin/1.8/rev/b072085c6acd