Description

Xapian does not return well-ordered search results. The way the Xapian search uses the "weight" or "page_list" sort options to order the result list does not seem very user-friendly to me.

Even the built-in MoinMoin search returns a better-sorted result list.

/!\ Note: the links are likely invalid. Check these bugs on a Xapian-enabled wiki (see SystemInfo).

Steps to reproduce

Search with Xapian and compare the results with those of the standard moin search engine.

Titlesearch

Searching for HelpOnInstalling puts the page "HelpOnInstalling" at around position 17.

Result Xapian:

   1. HelpOnInstalling/WikiInstanceCreation (createinstance.sh)

      9.5k - Rev: 55 (current) last modified: 2007-09-22 07:02:14
   2. HelpOnInstalling/WikiInstanceCreation (createinstance.sh)

      9.5k - Rev: 55 (current) last modified: 2007-09-22 07:02:14
   3. HelpOnInstalling/TroubleShooting (is_python_here.sh)

      5.4k - Rev: 18 (current) last modified: 2007-09-16 20:23:31
   4. HelpOnInstalling/TroubleShooting (is_python_here.sh)

      5.4k - Rev: 18 (current) last modified: 2007-09-16 20:23:31
   5. HelpOnInstalling/BasicInstallation (pythontest.cgi)

      10.1k - Rev: 54 (current) last modified: 2007-12-23 13:49:14
   6. HelpOnInstalling/BasicInstallation (pythontest.cgi)

      10.1k - Rev: 54 (current) last modified: 2007-12-23 13:49:14
   7. HelpOnInstalling/ApacheOnLinuxFtp (explore.py)

      13.3k - Rev: 33 (current) last modified: 2007-09-16 20:23:16
   8. HelpOnInstalling/ApacheOnLinuxFtp (explore.py)

      13.3k - Rev: 33 (current) last modified: 2007-09-16 20:23:16
   9. HelpOnInstalling/AolServer

      1.1k - Rev: 5 (current) last modified: 2007-09-16 20:21:58
  10. HelpOnInstalling/AolServer

      1.1k - Rev: 5 (current) last modified: 2007-09-16 20:21:58

Fulltextsearch

A full-text search should also use weights and rank hits in titles, categories, and similar fields higher than a hit somewhere in the body text.
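The idea could look roughly like the following Python sketch. It is not MoinMoin or Xapian code: the page names, field names, boost values, and the `score` and `rank_hits` helpers are all assumptions, chosen only to illustrate that a hit in a title or category counts more than a hit in the body text.

```python
# Hypothetical boost factors per field; the values are arbitrary examples.
FIELD_BOOST = {"title": 10.0, "category": 5.0, "text": 1.0}


def score(hit_counts):
    """Weighted sum of hit counts, e.g. {"title": 1, "text": 3}."""
    return sum(FIELD_BOOST.get(field, 1.0) * count
               for field, count in hit_counts.items())


def rank_hits(pages):
    """Return page names ordered by descending score, ties broken by name."""
    return sorted(pages, key=lambda name: (-score(pages[name]), name))


# Made-up hit counts for a search term on three made-up pages.
pages = {
    "ReleaseNotes": {"text": 3},
    "FrontPage": {"category": 1, "text": 1},
    "InstallGuide": {"title": 1},
}
ranking = rank_hits(pages)
```

With these example boosts, the page with a single title hit ranks above the page with three hits in the body text.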

Result Xapian:

HelpOnInstalling/WebLogic . . . 2 hits
    ...stall win32con module if you don't have that yet 2. Follow the steps outlined in HelpOnInstalling/BasicInstallation 3. Enable CGI servlet in Weblogic 4. Add wiki webapp for shar...

2.2k - Rev: 10 (current) last modified: 2007-09-16 20:19:47
HelpOnInstalling/WebLogic . . . 2 hits
    ...stall win32con module if you don't have that yet 2. Follow the steps outlined in HelpOnInstalling/BasicInstallation 3. Enable CGI servlet in Weblogic 4. Add wiki webapp for shar...

2.2k - Rev: 10 (current) last modified: 2007-09-16 20:19:47
HelpOnInstalling/BasicInstallation (pythontest.cgi) . . . 1 hit

10.1k - Rev: 54 (current) last modified: 2007-12-23 13:49:14
HelpOnInstalling/WikiInstanceCreation (createinstance.sh) . . . 1 hit

9.5k - Rev: 55 (current) last modified: 2007-09-22 07:02:14
HelpOnInstalling/AolServer . . . 1 hit
    == Configure Aolservers nscgi module for MoinMoin == Put the following lines in the [[http://www.aolserver.com/|Aolserver]] configuration file, inside the `nscgi` module configura

1.1k - Rev: 5 (current) last modified: 2007-09-16 20:21:58
HelpOnInstalling/ApacheOnLinuxFtp (explore.py) . . . 1 hit

13.3k - Rev: 33 (current) last modified: 2007-09-16 20:23:16
HelpOnInstalling/TroubleShooting (is_python_here.sh) . . . 1 hit

5.4k - Rev: 18 (current) last modified: 2007-09-16 20:23:31
HelpOnInstalling/BasicInstallation (pythontest.cgi) . . . 1 hit

10.1k - Rev: 54 (current) last modified: 2007-12-23 13:49:14
HelpOnInstalling/WikiInstanceCreation (createinstance.sh) . . . 1 hit

9.5k - Rev: 55 (current) last modified: 2007-09-22 07:02:14
HelpOnInstalling/AolServer . . . 1 hit
    == Configure Aolservers nscgi module for MoinMoin == Put the following lines in the [[http://www.aolserver.com/|Aolserver]] configuration file, inside the `nscgi` module configura

1.1k - Rev: 5 (current) last modified: 2007-09-16 20:21:58


Component selection

Details

MoinMoin Version

1.6.x

OS and Version

Linux

Python Version

2.5

Server Setup

-

Server Details

-

Language you are using the wiki in (set in the browser/UserPreferences)

-

Workaround

Discussion

Maybe the MoinMoin Xapian search uses Xapian's default weights and needs some "tuning". The Xapian BM25 documentation says:

The default parameter values Xapian uses are k1 = 1, k2 = 0, k3 = 1, and b = 0.5. These are almost certainly not optimal and the best values will vary with the documents being searched, and the type of queries so you can easily tune by using different values for these parameters.

http://www.xapian.org/docs/bm25.html
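
To illustrate what tuning k1 and b means, here is a minimal Python sketch of the term-frequency part of the textbook Okapi BM25 formula. It is not Xapian's actual implementation (which also involves k2, k3, and a term-rarity weight); the function name `bm25_tf_component` and the sample numbers are made up for illustration.

```python
def bm25_tf_component(tf, doc_len, avg_doc_len, k1=1.0, b=0.5):
    """Term-frequency part of textbook Okapi BM25 (illustrative only).

    tf          -- how often the term occurs in the document
    doc_len     -- length of the document
    avg_doc_len -- average document length in the collection
    k1          -- controls how quickly repeated occurrences saturate
    b           -- controls how strongly long documents are penalised (0..1)
    """
    norm_len = doc_len / float(avg_doc_len)
    K = k1 * ((1.0 - b) + b * norm_len)
    return (k1 + 1.0) * tf / (K + tf)


# Same term count in a short and a long page (hypothetical numbers).
short_page = bm25_tf_component(tf=2, doc_len=100, avg_doc_len=1000)
long_page = bm25_tf_component(tf=2, doc_len=10000, avg_doc_len=1000)

# With b = 0 the document length no longer influences the score.
short_no_norm = bm25_tf_component(tf=2, doc_len=100, avg_doc_len=1000, b=0.0)
long_no_norm = bm25_tf_component(tf=2, doc_len=10000, avg_doc_len=1000, b=0.0)
```

With b = 0.5 the short page scores higher than the long page for the same term count; with b = 0 both score the same. This is the kind of behaviour that different parameter values would change in the ranking.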


In general, I think the search results should be improved. I filed a feature request with some ideas, although I don't know whether they are even compatible with the index/Xapian structure. Take a look at: http://moinmo.in/FeatureRequests/XapianSearchWithBetterSearchResultList


Looking into the xapian.py module, I see some sort options in def __search for "page_title" and "weight". So I changed the FullSearch macro to use "weight" instead of "page_title" (line 106):

results = search.searchPages(request, needle, sort='weight')

I ran a FullSearch restricted to titles and a real FullSearch. The conclusion is that even with sort page_list there is no sorting by title. May take a look at My TestPage or -- MarcelHäfner 2008-05-21 20:55:46
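
The behaviour one would expect from the two sort orders can be sketched as follows. This is a minimal Python illustration, not the MoinMoin implementation: `sort_hits` is a hypothetical helper, the option names simply follow the ones mentioned above, and the weights are invented.

```python
def sort_hits(hits, sort="weight"):
    """Order (page_name, weight) tuples.

    sort="weight"     -- highest weight first, name as tie-breaker
    sort="page_title" -- alphabetically by page name
    """
    if sort == "weight":
        return sorted(hits, key=lambda hit: (-hit[1], hit[0]))
    if sort == "page_title":
        return sorted(hits, key=lambda hit: hit[0])
    raise ValueError("unknown sort order: %r" % (sort,))


# Invented weights, chosen so that the two orders differ.
hits = [
    ("HelpOnInstalling/WebLogic", 0.9),
    ("HelpOnInstalling", 0.4),
    ("HelpOnInstalling/AolServer", 0.6),
]
by_weight = [name for name, _ in sort_hits(hits, sort="weight")]
by_title = [name for name, _ in sort_hits(hits, sort="page_title")]
```

Sorting by title puts "HelpOnInstalling" first regardless of its weight, which is the ordering that was not observed in the tests described above.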

Plan


CategoryMoinMoinBug

MoinMoin: MoinMoinBugs/1.6.0XapianWeigthAndSortingOfResults (last edited 2009-10-04 12:51:06 by ThomasWaldmann)