Description
RecentChanges does not use index,follow, but index,nofollow.
Details
This Wiki.
Discussion
Currently, every page except FrontPage, FindPage, SiteNavigation, and TitleIndex gets a <meta name="robots" content="index,nofollow"> header, in order to reduce the machine load and bandwidth loss caused by spiders.
If a searchbot could also reach the wiki's contents via the RecentChanges page, it could more easily re-index pages that have recently been changed, keeping search engine results more up-to-date. Also, searchbots tend to re-index frequently changing pages more often, and RecentChanges changes virtually every time it is visited, whereas TitleIndex does not necessarily change even if most of the wiki's content was changed. This is why I believe RecentChanges is the most important page for indexing and should use index,follow, too.
This is a bug insofar as it hurts search engine support. If search engines cannot crawl a wiki, it will not be found. So I think that we should fix the meta tag.
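To make the rule described above concrete, here is a minimal sketch of how the header could be chosen. The function name and the set are hypothetical illustrations, not MoinMoin's actual code.

```python
# Hypothetical sketch of the current rule; not MoinMoin's real implementation.
# Pages that spiders may use as entry points and follow links from.
FOLLOW_PAGES = {"FrontPage", "FindPage", "SiteNavigation", "TitleIndex"}


def robots_meta(page_name):
    """Return the robots meta header for a page under the rule described above."""
    if page_name in FOLLOW_PAGES:
        content = "index,follow"
    else:
        content = "index,nofollow"
    return '<meta name="robots" content="%s">' % content


if __name__ == "__main__":
    for name in ("FrontPage", "RecentChanges"):
        print(name, robots_meta(name))
```

Under this rule RecentChanges falls into the nofollow branch, which is the behaviour this report is about.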
I'm sorry, but according to SystemInfo this wiki is at patch level 928; nevertheless, RecentChanges still has the meta name="robots" content="index,nofollow" tag. -- MartinBayer 2005-09-05 16:16:53
You need to check the page named "RecentChanges" itself, not a translation of it.
I believe this should be extended to any translation of RecentChanges in order to make it work even in monolingual or customized wikis. We usually have follow on every translation of TitleIndex, too. This is more or less the same case.
RecentChanges is a very expensive page, and having all localized RecentChanges pages indexed and followed is expensive and adds no new information, since all of them list the same changes to other pages.
TitleIndex, SiteNavigation and FindPage are not expensive, but all translations usually point to the same links in the wiki.
The localised FrontPage is usually edited for the wiki and links to the wiki's important pages. Translated front pages are usually template pages that do not say anything about the wiki, so there is no point in visiting them.
Suggested fix:
Only the localized version of RecentChanges, FrontPage, TitleIndex, FindPage and SiteNavigation will have index,follow.
- The localised version is the version that uses the wiki's default_lang.
For example, on a Hebrew wiki, פתיחה, שינויים אחרונים, חיפוש, ומפתח דפים will be indexed and followed.
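The suggested fix could look roughly like the sketch below. The table of page names per language, the function name, and the fallback to English for an unknown default_lang are all assumptions made for illustration; the Hebrew names are the ones from the example above.

```python
# Hypothetical sketch of the suggested fix; names and data are illustrative only.
# System page names per language. Only the entry matching the wiki's
# default_lang gets index,follow.
SYSTEM_PAGES = {
    "en": {"RecentChanges", "FrontPage", "TitleIndex", "FindPage", "SiteNavigation"},
    "he": {"פתיחה", "שינויים אחרונים", "חיפוש", "מפתח דפים"},
}


def robots_content(page_name, default_lang):
    """Return the robots meta content for a page on a wiki with default_lang."""
    # Assumption: fall back to the English names when the language is unknown.
    followed = SYSTEM_PAGES.get(default_lang, SYSTEM_PAGES["en"])
    if page_name in followed:
        return "index,follow"
    return "index,nofollow"


if __name__ == "__main__":
    print(robots_content("שינויים אחרונים", "he"))
    print(robots_content("RecentChanges", "he"))
```

With this rule a robot sees exactly one followed RecentChanges per wiki, so the expensive page is not crawled once per translation.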
If server load is a concern, noindex would be the right approach, not nofollow. Generating RecentChanges and its translations with nofollow is as expensive as generating them with follow.
The search robot will get only one RecentChanges translation, using the wiki default language. And of course we want it to index this page, as this is the most important page for indexing.
Maybe we should create a special page for indexing that is cheaper to create than RecentChanges: for example, one containing just a list of links to the pages that were last edited, one link per page, covering a few weeks of changes, so the robot can use only this page for indexing. Since the log history never changes, we can cache the contents of this page and simply remove pages from the end of the list when we add new pages on each save operation.
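A rough sketch of such a cached list, assuming saves arrive in chronological order. The class name, the method names and the three-week default are made up for illustration; nothing here is existing MoinMoin code.

```python
from collections import OrderedDict


class RecentPagesIndex:
    """Hypothetical cache: one entry per page, newest edit first."""

    def __init__(self, max_age=21 * 24 * 3600):
        self.max_age = max_age       # seconds of history to keep (assumed: 3 weeks)
        self._pages = OrderedDict()  # page name -> time of last edit

    def record_save(self, page_name, edit_time):
        """Update the cache on a save; assumes edit_time never goes backwards."""
        self._pages[page_name] = edit_time
        # Move the page to the front so the list stays newest-first.
        self._pages.move_to_end(page_name, last=False)
        # Drop pages from the end of the list once they are too old.
        cutoff = edit_time - self.max_age
        while self._pages:
            oldest = next(reversed(self._pages))
            if self._pages[oldest] >= cutoff:
                break
            del self._pages[oldest]

    def links(self):
        """Return the page names to list, newest first."""
        return list(self._pages)


if __name__ == "__main__":
    index = RecentPagesIndex(max_age=100)
    index.record_save("PageA", 0)
    index.record_save("PageB", 50)
    index.record_save("PageA", 60)
    print(index.links())
```

Each save only touches the front and the tail of the list, so rendering the page never has to read the edit log again.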
I don't think we should have different pages for human users and robots (even if I've suggested something like that in FeatureRequests/GoogleSitemapGeneration), because it would be difficult to integrate a page for robots into the structure of the wiki (remember it has to be linked from FrontPage, or at least from SiteNavigation). And we already have a page that contains a list of recently edited pages: RecentChanges. Maybe we should just put a rel="nofollow" attribute (as proposed by several search engines) on any a element related to actions (diff, info...), remove nofollow from the header, and leave RecentChanges as it is? This is IMHO the better approach anyway (see also FeatureRequests/AlternativeSpiderControlFeatures).
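As an illustration of the per-link approach, here is a small sketch. The helper name and the URL layout (/PageName?action=...) are assumptions made for the example, not a description of how the wiki builds its links.

```python
from html import escape
from urllib.parse import quote, urlencode


def page_link(page_name, text, action=None):
    """Build an a element; action links (diff, info...) get rel="nofollow".

    Hypothetical helper: the URL scheme used here is assumed for illustration.
    """
    url = "/" + quote(page_name)
    rel = ""
    if action is not None:
        url += "?" + urlencode({"action": action})
        rel = ' rel="nofollow"'
    return '<a href="%s"%s>%s</a>' % (escape(url, quote=True), rel, escape(text))


if __name__ == "__main__":
    print(page_link("FrontPage", "FrontPage"))
    print(page_link("FrontPage", "diff", action="diff"))
```

Plain page links stay followable, so a robot can still walk from RecentChanges to the changed pages while skipping the diff and info views.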
The "special page" for search engines can simply be an action, and can be cheaper to produce than RecentChanges. But this is only speculation; I don't know what makes RecentChanges expensive. Maybe we can simply make it much faster by caching, like we do for page statistics.
- If the "special page" were an action, where would it be linked from? However, for now the quickest solution is the best solution...
- From the front page? Maybe from a head link?
Plan
Now: apply the suggested fix, using only translated pages of the wiki default lang.
Future: check FeatureRequests/AlternativeSpiderControlFeatures etc.
- Priority:
- Assigned to:
- Status: half fixed in moin--main--1.3--patch-912 - use only en version of the page.
MoinMoin 1.5 seems to behave differently here: when I look at the Google cache version of the front page of a German-only wiki (e.g. ÜberSicht), the link bar is in English (that is, RecentChanges, FindPage and so on), which allows the search bot to reach the English-language RecentChanges (without nofollow) via the front page, even if this one is customized and/or another language is the default. Am I missing something, or can we close this issue? -- MartinBayer 2006-01-22 18:58:15