*This is not a discussion; it is an article. If you want to change the article,
please do so, but keep it a flowing, structured article. If you just want to
discuss it or say your opinion, add your comments to the last section,*
Discussion_. *But please don't enter comments inside the article.*
In this text, the term **wiki** refers to any wiki web, not to the software
used to implement such a web; the software is referred to as **wiki engine**.
What is a wiki, and why would we want it to be multilingual? A given mailing
list normally works in a given language. Why would a wiki be different?
The answer seems to be that as a community grows, it spawns sub-communities
that may work in different languages. In such cases, new mailing lists are
created, for example. But for a wiki, things are more complicated, because
unlike mailing list messages, which are transient, wikis contain permanent
articles, and there may be a strong correspondence between articles in
different languages. Thus, when a wiki community spawns a sub-community, it is
normally insufficient to merely create a new, independent wiki. Instead, a
certain amount of interlinking must happen, and it must be supported by the
wiki engine.
However, if multilinguality results from the creation of sub-communities,
it is important to recognize that these sub-communities will generally work
independently. Attempting to impose rules that blur language barriers,
force translations to exist, or ensure a 1-1 correspondence between
articles in different languages may be counter-productive. Multilingual users
are the glue that joins the sub-communities together in one large loose
community, and if it weren't for them, we wouldn't have anything to discuss
now. Thus, we must care about the needs of multilingual users, and make sure
they can navigate and easily jump from an article to its alternative language
version, but also recognize that a large part of the work occurs independently
in the language sub-communities.

Alternative versions don't contain exactly the same content
------------------------------------------------------------

If page B is the French version of the English page A, do both pages contain
different language versions of exactly the same content? This is only true if
one of them is a translation of the other. But B may be a translation of an
old version of A; or A and B may have been written independently, for
example if the author of B decided to rewrite from scratch and only consulted
A, or did not know that A existed before writing B.
It seems that pages will have the same content only if there exists a formal
policy that dictates it to be so. For example, a company can have a
multilingual web site, and a policy that a page be translated into all
designated languages before being made public. However, such policies don't
scale, and
they cause publishing delays which are considered more harmful than
multilingual non-parity. As a result, most companies seem to abandon such
strict policies and only try, but not too hard, to make content available in
all languages of their web site.
Thus, identical content is the norm, at least in theory, only in such cases as
multilingual law, such as Swiss law or EU law. Laws are decisions with a
timestamp, that are finally approved and enforced only after they have been
translated into all designated languages. This need for accurate translation is
part of the formality that results in law making being extremely slow and
hardly comparable to a wiki. Incidentally, despite this slowness, quality of
translation is a major problem in EU law making.

Bijection, commutativity and transitivity
-----------------------------------------

If page F is the French version of the English page E, is E the English version
of F? If F is the French of E, and G is the German of F, is G the German of E?
Is it possible for two different French pages to map to the same English
page? Does one page have only one alternative version in a given language? In
other words, are language mappings bijective_, commutative_, and transitive_?
.. _commutative: http://en.wikipedia.org/wiki/Commutative_operation
.. _transitive: http://en.wikipedia.org/wiki/Transitive_relation
.. _bijective: http://en.wikipedia.org/wiki/Bijection
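
The three properties can be made concrete with a small sketch. It assumes, purely for illustration, that a page is a ``(language, title)`` pair and that interlanguage links form a set of ``(source, target)`` pairs; the function names are hypothetical and belong to no wiki engine. What this article calls commutativity is checked here as symmetry of the link relation.

```python
from itertools import product

# Assumption: a page is a (language, title) tuple and a link is a
# (source_page, target_page) tuple. All names here are illustrative.

def is_symmetric(links):
    """Every link A -> B is matched by a link B -> A ("commutativity")."""
    return all((b, a) in links for a, b in links)

def is_transitive(links):
    """If A -> B and B -> C, with A and C distinct, then A -> C exists."""
    return all(
        (a, d) in links
        for (a, b), (c, d) in product(links, repeat=2)
        if b == c and a != d
    )

def is_one_to_one(links):
    """A page has at most one version per language, in both directions."""
    outgoing = {}  # (source page, target language) -> target page
    incoming = {}  # (target page, source language) -> source page
    for source, target in links:
        if outgoing.setdefault((source, target[0]), target) != target:
            return False
        if incoming.setdefault((target, source[0]), source) != source:
            return False
    return True

# The Star Trek case: two English pages point at one German page,
# and only one of them is linked back.
races = ("en", "List of Star Trek races")
borg = ("en", "Borg")
voelker = ("de", "Völker im Star-Trek-Universum")
star_trek = {(races, voelker), (voelker, races), (borg, voelker)}

print(is_symmetric(star_trek))   # False
print(is_transitive(star_trek))  # False
print(is_one_to_one(star_trek))  # False
```

A link set in which every page of a group links to every other page of the group, with one page per language, passes all three checks.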
At first sight these properties seem to hold; but there are exceptions. A
hypothetical example is that in Eskimo there may be many different words for
the English word *snow*, each one having a slight difference in meaning, which
might be important for Eskimo culture. It can be argued that if there were many
different Eskimo snow articles, there could be a separate English version for
each one, whose title or WikiName could be a periphrasis. However, what is
important here is that it is no longer obvious that bijection
holds. Another example is when English users decide that page A (whose French
version is B) is too long, and break it in two different articles, A1 and A2.
A real demonstration of the problem is the `list of Star Trek races in
Wikipedia`_. The list contains short comments on each race, but for a number of
races it only refers the reader to a separate article about the race, such as
Borg_ or Vulcan_. The German Wikipedians, however, decided that a single
article is better; they have no separate article for Borgs or Vulcans, every
race being treated in the single `Völker im Star-Trek-Universum`_ article. The
English list of races and the German Völker link to each other, but what about
the English articles Borg_ and Vulcan_? The community is divided as to whether
they should link to Völker or not link at all, and the winning opinion is to
link. Thus, the English Borg_ links to the `Borg section in Völker im
Star-Trek-Universum`_; no reciprocal link exists, of course.
.. _list of Star Trek races in Wikipedia: http://en.wikipedia.org/wiki/List_of_Star_Trek_races
.. _Borg: http://en.wikipedia.org/wiki/Borg
.. _Vulcan: http://en.wikipedia.org/wiki/Vulcan_(Star_Trek)
.. _Völker im Star-Trek-Universum: http://de.wikipedia.org/wiki/Völker_im_Star-Trek-Universum
.. _Borg section in Völker im Star-Trek-Universum: http://de.wikipedia.org/wiki/Völker_im_Star-Trek-Universum#Borg
Another example is the English article on `Degree (angle)`_, which links to the
French Degré_, which links to the English `Degree (disambiguation)`_. The
French `Degré (homonymie)`_ also links to `Degree (disambiguation)`_, which,
surprisingly, does not link to *homonymie* but to Degré_. Although it looks like
an error, `a certain discussion`_ shows that there are some disagreements.
.. _Degree (angle): http://en.wikipedia.org/wiki/Degree_(angle)
.. _Degré: http://fr.wikipedia.org/wiki/Degré
.. _Degree (disambiguation): http://en.wikipedia.org/wiki/Degree_(disambiguation)
.. _Degré (homonymie): http://fr.wikipedia.org/wiki/Degré_(homonymie)
.. _certain discussion: http://en.wikipedia.org/wiki/User_talk:Robbot#Justify_removing_links
It is possible that if the knowledge collected in a wiki were finite and were
perfectly organised, then language mappings would be commutative, transitive,
and bijective, properties which we shall collectively call **perfect**. This
would mean that, as a wiki evolves, language mappings tend to become perfect
(and pages in different languages tend towards the same content). During the
process, however, they are not. It is impossible that English page A and French
page B are both broken up
in A1 and A2, and B1 and B2, at the same time; one of them will necessarily be
processed some time before the other; and articles will occasionally be messed
up, as seems to be the case with *Degré* above. Since a wiki does not have a
terminal state and is continuously in the process of evolution, language
mappings will not be perfect. Not that it matters much, but in addition, cases
similar to the above mentioned Star Trek could exist even in the ideal terminal
state, because of cultural differences.
Nevertheless, it is important to note that the vast majority of cases, and by
*vast majority* we mean something like 99% or more, are commutative,
transitive, and bijective. It might, therefore, be preferable to impose it, on
the grounds that it is conceptually simpler. In fact, it appears that both
*Völker im Star-Trek-Universum* and *Degré* have caused discussion and
disagreement, and it might be that everyone would be better off if users had
had no alternative option but the simplest and most intuitive.
There is an additional problem. With nonperfect interlanguage links like
Wikipedia's, each language community is free to do what they want with the
interlanguage links from their language to other languages. Perfect links,
instead, will force a certain degree of intercommunity co-operation: if, in an
ambiguous case, a French user decides that page E1 and not E2 is the English
equivalent of the French F, he imposes this opinion on English users as
well. However, whenever such disagreements exist, all disagreeing users will be
multilingual, which means that they should be able to work out a solution.

Different name spaces
---------------------

Whereas it is clear that "Battle of Normandy" is English and "Bataille de
Normandie" is French, many articles have the same title in many languages;
examples include "Ella Fitzgerald", "Smalltalk", and generally titles that are
names. As a result, different languages must be assigned different name
spaces. Name space here is meant in a general sense, not in the sense of a
One way of implementing language name spaces is with MoinMoin categories, where
all English pages would begin with "En/". An alternative that has been proposed
and used is to put the language in the name of the page, as in
BattleOfNormandyEn; this results in ugly names littered with meta-information.
Another alternative, used by MediaWiki, is to use one wiki per language.
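
The three alternatives can be compared side by side with a minimal sketch. The function name, the scheme labels, and the ``wiki.example.org`` host are assumptions made for illustration only; none of them is taken from MoinMoin or MediaWiki.

```python
def page_address(scheme, lang, title):
    """Return the page name (or URL) for a title under a naming scheme."""
    if scheme == "prefix":
        # Language as a leading path component, e.g. "En/".
        return f"{lang.capitalize()}/{title}"
    if scheme == "suffix":
        # Language appended to a CamelCase page name.
        camel = "".join(word[:1].upper() + word[1:] for word in title.split())
        return camel + lang.capitalize()
    if scheme == "farm":
        # One wiki per language; the host name is hypothetical.
        return f"http://{lang}.wiki.example.org/{title.replace(' ', '_')}"
    raise ValueError(f"unknown scheme: {scheme}")

for scheme in ("prefix", "suffix", "farm"):
    # The same title in two languages must not collide.
    print(page_address(scheme, "en", "Ella Fitzgerald"))
    print(page_address(scheme, "fr", "Ella Fitzgerald"))
```

In each scheme a title shared by several languages, such as "Ella Fitzgerald", ends up with a distinct address per language; the schemes differ only in where the language information is kept.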
Historically, wikis have been using CamelCase_ for hyperlinks. This practice,
however, causes several problems:
* CamelCased terms are recognized by search engines as single words, thus
ranking pages incorrectly.
* CamelCase reduces link readability.
* In several languages, such as Japanese_, Chinese_, Hebrew_, and Arabic_,
CamelCase is not possible.
* Valid CamelCase words have to be escaped in the wiki source.
People with a background in programming languages may find the use of CamelCase
natural, and they might even prefer it for the same reason for which CamelCase
is often used in naming conventions when programming: it indicates that the
entity belongs in a different class. In wikis, however, this is much less
important than in programming, as web browsers render hyperlinks in different
color. As wikis become more available to nontechnical users, CamelCase becomes
less appealing. Sometimes organisations use wikis as their formal web sites,
where unregistered users only have permission to view, whereas logged on users
can modify; in such wikis, CamelCase is clearly undesirable.
If CamelCase is harder for readers, it has some advantages in writing:
* Wikisource can be more readable and closer to the processed result with
CamelCase than with markup.
* CamelCase can be faster to type than markup.
The second advantage is, however, disputed_. It generally seems that the
disadvantages of CamelCase outweigh the advantages, and there is a tendency to
not use it any more. MediaWiki has dropped CamelCase support altogether.
In multilingual wikis, there isn't much to dispute about CamelCase. If the wiki
is really expected to only be used in certain languages in which there is a
distinction between lower and upper case letters, CamelCase can be used. In
other cases, it is better discouraged, for the benefit of uniformity of
hyperlinking habits in different languages.
.. _CamelCase: http://en.wikipedia.org/wiki/CamelCase
.. _Japanese: http://en.wikipedia.org/wiki/Japanese_writing_system
.. _Chinese: http://en.wikipedia.org/wiki/Chinese_written_language
.. _Hebrew: http://en.wikipedia.org/wiki/Hebrew_alphabet
.. _Arabic: http://en.wikipedia.org/wiki/Arabic_alphabet
.. _disputed: http://en.wikipedia.org/wiki/Wikipedia:CamelCase_and_Wikipedia
See also: http://c2.com/cgi/wiki?WikiWordsConsideredHarmful

Manual language selection
-------------------------

It is occasionally proposed that the wiki server look at the Accept-Language
HTTP header, or at user preferences stored in cookies, and automatically serve
the preferred language version of the requested page. Such automation is
unwelcome. First, users expect a given URL to point to a page with given
content; cookies may affect the skin, or other details, but not the main
content of the page. For this reason, the Accept-Language header is only ever
used in order to redirect the top-level page of a site, such as
http://www.foobar.com/, to the top-level page in a specific language, such as
Second, users will be confused by such automation. Multilingual users cannot be
expected to change their user preferences each time they want to view a page in
another language. Even if a means of manually selecting a language is provided,
but wiki links are generic, then multilingual users will be frustrated
whenever, by some error on their part, on the wiki engine's part, or on the
page author's part, the wrong language version of a requested page pops up.
There must thus be no such automation; the language must be clear from the URL,
and it must be clear which language version of a page each link points to.
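
The one acceptable use of the Accept-Language header, redirecting the top-level page only, can be sketched as follows. The function names and the ``/en/`` style of language path are assumptions for illustration, and the header parsing is deliberately simplified (primary language subtags only, wildcard ignored).

```python
def preferred_language(accept_language, available, default):
    """Pick the best available language from an Accept-Language value."""
    ranked = []
    for position, item in enumerate(accept_language.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # q defaults to 1 when absent
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        # Keep only the primary subtag: "fr-CH" is treated as "fr".
        ranked.append((-quality, position, tag.split("-")[0]))
    for negative_quality, _, lang in sorted(ranked):
        if negative_quality < 0 and lang in available:
            return lang
    return default

def redirect_target(path, accept_language, available, default="en"):
    """Redirect only the top-level page; every other URL is left alone."""
    if path != "/":
        return None
    return f"/{preferred_language(accept_language, available, default)}/"

print(redirect_target("/", "fr-CH, fr;q=0.9, en;q=0.8", ["en", "de"]))
print(redirect_target("/en/Water", "de", ["en", "de"]))
```

Any path other than the top-level one yields ``None``, so the language of every other page remains determined by its URL alone.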
Single language users want RecentChanges to show the changes only in their
language. Multilingual users will either want to view RecentChanges in a given
language each time, or to view the combined changes of a given set of
languages. Besides RecentChanges, there may be other indexes, such as a site
map, or a text search. If it is not possible to provide an option, or until
such functionality is developed, providing single-language indexes seems
preferable to providing indexes of all languages combined, given the general
independence of language sub-communities.
The most prominent multilingual wiki on the web is Wikipedia_, whose wiki
engine is MediaWiki_. Multilinguality is achieved by assigning one wiki per
language. The `English Wikipedia`_ is almost independent of the `French
Wikipedia`_, the two being connected only by manually specified links to
alternative languages. The article on the `Battle of Normandy`_ contains the
markup ``[[fr:Bataille de Normandie]]`` in the wikisource. This is not
rendered; it is only processed by the skin, which adds a link to the French
version of that page, `Bataille de Normandie`_. The French version,
accordingly, contains ``[[en:Battle of Normandy]]``.
.. _Wikipedia: http://www.wikipedia.org/
.. _MediaWiki: http://www.mediawiki.org/
.. _English Wikipedia: http://en.wikipedia.org/
.. _French Wikipedia: http://fr.wikipedia.org/
.. _Battle of Normandy: http://en.wikipedia.org/wiki/Battle_of_Normandy
.. _Bataille de Normandie: http://fr.wikipedia.org/wiki/Bataille_de_Normandie
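
As a rough illustration of how such markup can be turned into language links, the sketch below extracts ``[[xx:Title]]`` entries with a regular expression and builds the corresponding URLs. It is not MediaWiki's implementation: the assumption that a prefix is two or three lowercase letters is a simplification (MediaWiki relies on its configured list of language prefixes), and the function names are hypothetical.

```python
import re

# Simplifying assumption: a language prefix is 2 or 3 lowercase letters.
INTERLANGUAGE_LINK = re.compile(r"\[\[([a-z]{2,3}):([^\]|]+)\]\]")

def interlanguage_links(wikisource):
    """Map language code -> title for every [[xx:Title]] in the source."""
    return {
        lang: title.strip()
        for lang, title in INTERLANGUAGE_LINK.findall(wikisource)
    }

def language_url(lang, title):
    """Build a URL in the per-language wiki, following Wikipedia's layout."""
    return f"http://{lang}.wikipedia.org/wiki/{title.replace(' ', '_')}"

source = (
    "The '''Battle of Normandy''' was fought in 1944. "
    "See [[Operation Overlord]].\n"
    "[[fr:Bataille de Normandie]]\n"
)
for lang, title in interlanguage_links(source).items():
    print(lang, language_url(lang, title))
```

Ordinary links such as ``[[Operation Overlord]]`` carry no language prefix and are ignored by the pattern.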
This system has many advantages. It is simple to develop, easy for users to
understand, free of the questionable assumptions of bijection, transitivity and
commutativity, and provides manual selection of language through clearly
MediaWiki's main disadvantage is that it can be very tedious and error-prone to
manually manage Wikipedia's multilingual links. If an alternative language is
added, not only do links to all existing language versions have to be included
in the new version, but links to the new version also have to be added in all
existing
versions. In articles that exist in more than 50 languages, such as Water_ or
the `article on Wikipedia itself`_, this can be extremely hard. As a
workaround, Wikipedia has bots, like the German ZwoBot_, that periodically
visit all articles and fix the links. In fact, there is a separate bot for each
language; ZwoBot, for example, only fixes German pages: it takes a German page,
follows (recursively) all interlanguage links from it, and then fixes only the
links of the initial German page (it normally makes the unambiguous corrections
and notifies the operator about the ambiguous ones). Obviously this causes much
more traffic than if one bot fixed the pages of all languages at the same time,
but it appears that a more conservative approach has been taken, of letting
each language community decide independently how it wants its bot to operate on
the links of the articles of that language.
.. _Water: http://en.wikipedia.org/wiki/Water
.. _article on Wikipedia itself: http://en.wikipedia.org/wiki/Wikipedia
.. _ZwoBot: http://en.wikipedia.org/wiki/User:ZwoBot
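
The behaviour described above can be sketched in a few lines. This is not ZwoBot's code; it is a minimal illustration under the assumption that pages are ``(language, title)`` pairs and that the link graph is already available as a dictionary, with hypothetical function names throughout.

```python
from collections import deque

def collect_candidates(start, links):
    """Follow interlanguage links recursively; group reached titles by language."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    by_language = {}
    for lang, title in seen:
        if lang != start[0]:  # only links to other languages are of interest
            by_language.setdefault(lang, set()).add(title)
    return by_language

def propose_fixes(start, links):
    """Split candidates into unambiguous fixes and cases for the operator."""
    fixes, ambiguous = {}, {}
    for lang, titles in collect_candidates(start, links).items():
        if len(titles) == 1:
            fixes[lang] = next(iter(titles))
        else:
            ambiguous[lang] = sorted(titles)
    return fixes, ambiguous

# Hypothetical link graph: page -> pages it links to.
links = {
    ("de", "Wasser"): [("en", "Water")],
    ("en", "Water"): [("fr", "Eau"), ("de", "Wasser")],
    ("fr", "Eau"): [("en", "Water"), ("es", "Agua")],
    ("es", "Agua"): [("en", "Water (molecule)")],
}
fixes, ambiguous = propose_fixes(("de", "Wasser"), links)
print(fixes)      # languages with exactly one candidate
print(ambiguous)  # languages with several candidates
```

Only the links of the starting page would be rewritten from ``fixes``; a language with more than one candidate is exactly the kind of non-bijective case that has to be reported to a human.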
Some more disadvantages of MediaWiki are the consequence of the independence of
the wikis for different languages. The most inconvenient for multilingual users
is that different user accounts are required for different languages; another
problem is that the validity of the multilingual links is not checked, as they
are actually external links, created by the information found in the
There are at least two other wiki engines that offer support for multilingual content and its synchronization:

1. Tikiwiki: http://wiki-translation.com/CLWE+Demo+Screencast
2. oddmuse: http://socialsynergyweb.com/cgi-bin/wiki2/Multilingual/FrontPage

Either one wiki for all languages can be used, or a wiki farm (a set of wikis
operated by the same wiki engine installation). With a farm, it is more
difficult to provide one account and combined language indexes. With a single
wiki, it is more difficult to provide separate language indexes. Since
separate indexes take priority over combined indexes, starting the
implementation with a farm seems preferable and easier. Having to configure many wikis is
more an advantage than a disadvantage, as the subcommunities may want different
logos, different default skins, and generally different settings. In addition,
having many wikis served by the same account server, and creating meta-indexes
to show combined recent changes or search all wikis, seem to be cleaner
development solutions than trying to hack functionality into one wiki.
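
A meta-index over a farm can stay very small, because each wiki already keeps its own single-language list of changes. The sketch below assumes that every wiki exposes its recent changes as ``(timestamp, title)`` pairs sorted newest first, with ISO-formatted timestamps; the data layout and the function name are illustrative assumptions.

```python
import heapq
from itertools import islice

def combined_recent_changes(farm, languages, limit=10):
    """Merge the newest-first change lists of the selected language wikis."""
    streams = [
        [(timestamp, lang, title) for timestamp, title in farm[lang]]
        for lang in languages
        if lang in farm
    ]
    # Each stream is sorted newest first, so a reverse merge keeps that order.
    merged = heapq.merge(*streams, reverse=True)
    return list(islice(merged, limit))

# Hypothetical farm: language -> that wiki's own RecentChanges.
farm = {
    "en": [("2005-03-02T10:00", "Water"), ("2005-03-01T09:00", "Borg")],
    "fr": [("2005-03-02T12:00", "Eau")],
    "de": [("2005-03-01T08:00", "Wasser")],
}
for change in combined_recent_changes(farm, ["en", "fr"]):
    print(change)
```

Passing a single language reproduces that wiki's own index, so the combined view is an optional layer on top of the separate indexes rather than a replacement for them.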
The main problem, then, is to decide whether to force bijection, commutativity
and transitivity. The problem has been analyzed in detail above, and it is hard
to make a choice. Not forcing these properties results in an easier implementation
like MediaWiki's, which however causes problems that may need bots to
fix. Forcing them is harder, as it seems to require a subsystem independent of
all languages to keep the language correspondence of the articles, but may be
cleaner in the end.
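
If the properties were to be forced, the language-independent subsystem could be as simple as a registry that ties every page to a concept identifier and allows at most one page per language for each concept. The class below is a hypothetical sketch of that idea, not a design taken from any existing wiki engine.

```python
class LanguageRegistry:
    """Keeps at most one page per language for each language-independent concept."""

    def __init__(self):
        self._pages = {}    # concept id -> {language: title}
        self._concept = {}  # (language, title) -> concept id

    def register(self, concept, lang, title):
        """Attach a page to a concept, refusing anything that breaks bijection."""
        owner = self._concept.get((lang, title))
        if owner is not None and owner != concept:
            raise ValueError(f"{lang}:{title} already belongs to {owner!r}")
        versions = self._pages.setdefault(concept, {})
        if versions.get(lang, title) != title:
            raise ValueError(f"{concept!r} already has a {lang} page")
        versions[lang] = title
        self._concept[(lang, title)] = concept

    def alternatives(self, lang, title):
        """Return the other-language versions of a page as {language: title}."""
        concept = self._concept[(lang, title)]
        return {
            other: name
            for other, name in self._pages[concept].items()
            if other != lang
        }

registry = LanguageRegistry()
registry.register("water", "en", "Water")
registry.register("water", "fr", "Eau")
registry.register("water", "de", "Wasser")
print(registry.alternatives("fr", "Eau"))
```

Because all versions hang off one concept, symmetry and transitivity hold by construction, and adding a language means adding a single entry instead of editing every existing version.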
I think I'd go for bijection, commutativity and transitivity.
`MultilingualWiki in Meatball wiki`_ is a collection of ideas from which this
document has wildly and shamelessly stolen.
The `discussion on Wikipedia's interlanguage links`_ presents a number of
problems which developers should read before attempting to implement
multilinguality in a wiki engine.
`Multilingual communication`_, an article in Wikimedia Metawiki, is an idea
with which we obviously disagree, because it violates all assumptions presented
above, but is linked to from here for completeness.
The `discussion on ZwoBot`_, which took place during the revision of this article,
contains some more links.
.. _MultilingualWiki in Meatball wiki: http://www.usemod.com/cgi-bin/mb.pl?MultilingualWiki
.. _discussion on Wikipedia's interlanguage links: http://en.wikipedia.org/wiki/Wikipedia_talk:Interlanguage_links
.. _Multilingual communication: http://meta.wikimedia.org/wiki/Multilingual_communication
.. _discussion on ZwoBot: http://de.wikipedia.org/wiki/Benutzer_Diskussion:Zwobot#Questions_on_ZwoBot_assumptions
Thanks to MoinMoin developers ThomasWaldmann_, AlexanderSchremmer_ and
HeatherStern_ for discussing the subject with me, and to German Wikipedian "Head" for answering my
questions on Zwobot.
Copyright (C) 2005 Antonios Christofides
Permission is granted to copy, distribute and/or modify this document under the
terms of either:
* the Creative Commons Attribution-ShareAlike License 2.5; or
* the GNU Free Documentation License, Version 1.2 or any later version published
by the Free Software Foundation; with no Invariant Sections, no Front-Cover
Texts, and no Back-Cover Texts.
I consider the MediaWiki style an easy, but in fact TOO lightweight,
implementation. Using a bot digging through all pages fixing language links is
ugly. A bot makes lots of traffic for a problem that should be solved better.
A wiki farm is better than one wiki for all languages (own namespace, own RC).
The account problem can be solved in other ways, we already have some page
about that. --p54A1FCD7
One wiki means you will get tons of irrelevant search results. It's better to have multiple wikis. User accounts should be per site, not per wiki. Synchronization of pages in different languages can be done when a page is saved. -- NirSoffer_
I have implemented a functionality_ that enables an `Other languages` box in the rightsidebar theme. The way it currently works, the system is not bijective, commutative or transitive. That is, it doesn't reverse lookup the page name in language dictionaries. -- TheAnarcat_
The discussion seems to point into this direction: Separating languages by using different sites is too far while not separating them is too close. Some layer inbetween seems to be necessary: Something which allows the wiki engine to cut the different language spaces (think of layers) apart but which also keeps the stacks of pages (one stack per object, each page one language) together.
I'd suggest a solution which is based on a "default language". For each object, there must be a page in the default language (configurable; for this example, let's say this is in English). This page contains the processing instruction `lang en`. By configuring the default to `en`, this page automatically becomes the default page for this object. Let's take the page about Water::

    ##lang de Wasser
    ##lang fr Eau

    Water is absolutely necessary. It's often displayed in ["Fountain"]s.

Below the lang are the names of the pages for this object in the other languages. This way, the wiki knows which pages belong to the same stack (to allow the "Other languages" button). When users want to link to an object, they should use the name of the default page instead of the "real" name. So the Water page in German would look like this::

    Wasser ist lebenswichtig. Es wird häufig in ["Fountain"] präsentiert.

During page generation, the engine should consider the stacks and replace the links with the correct ones if a suitable page exists. For example, we have the page "Wasser" in German which would contain a link to "Fountain" (en) which contains the reference to "Brunnen" (de) in the header. When this page is rendered, the markup ["Fountain"] will be replaced with a link to Brunnen (with the title "Brunnen"). In the rendered version of the page, there is no hint to the default page anymore. When you translate a page, you will know which link to use because you can simply copy the text verbatim from the default language page.
If there is no page for Fountain in the desired language German, a copy of the default page for Fountain should be created. In this case, the wiki engine should try to replace all links in that page with the ones to the German content but with the English titles (so the titles match the text but when the user clicks on them, he'll be back in his preferred language).
On the Wasser page, the link to Fountain should read "Fountain (nur in Englisch)" in this case, that is English link title and German error message. The text after the link is supplied by the config.
This way, you can have untranslated "holes" in your wiki without disrupting navigation completely. -- Aaron Digulla
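
The link resolution in the proposal above can be sketched as follows. This is only an illustration of the idea, assuming the ``##lang`` header syntax shown in the example page; the function names and the ``notes`` configuration are hypothetical, and the copying of untranslated default pages is left out.

```python
def parse_stack(source):
    """Read '##lang <code> <title>' header lines from a default-language page."""
    stack = {}
    for line in source.splitlines():
        if line.startswith("##lang "):
            fields = line.split(None, 2)
            if len(fields) == 3:
                stack[fields[1]] = fields[2].strip()
    return stack

def resolve_link(default_title, lang, stacks, notes, default_lang="en"):
    """Return (target page, link text) for a link written with the default name."""
    if lang == default_lang:
        return default_title, default_title
    translated = stacks.get(default_title, {}).get(lang)
    if translated:
        return translated, translated
    # No translation: keep the default title and add the configured note.
    return default_title, f"{default_title} ({notes[lang]})"

stacks = {
    "Water": parse_stack("##lang de Wasser\n##lang fr Eau\nWater is necessary."),
    "Fountain": parse_stack("##lang de Brunnen\nA fountain displays water."),
}
notes = {"de": "nur in Englisch", "fr": "en anglais seulement"}

print(resolve_link("Fountain", "de", stacks, notes))
print(resolve_link("Fountain", "fr", stacks, notes))
```

The first call finds the German page in the stack; the second falls back to the default page and marks the link with the configured note, which is how untranslated "holes" stay navigable.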
On the `FSFE Wiki`_ the way translated pages are handled is to either find a built-in translation (such as for FrontPage) or to look for pages whose common names are suffixed with a language code. This permits a list of languages to be built and offers the user the opportunity to locate a suitable translation of a page. See the `guide to translated pages`_ for more information and the `advocacy FAQ`_ for an example, noting the languages in the left menu. -- PaulBoddie <<DateTime(2010-03-21T02:58:21+0200)>>
.. _functionality: FeatureRequests/ThisPageInOtherLanguages
.. _FSFE Wiki: http://wiki.fsfe.org/
.. _guide to translated pages: http://wiki.fsfe.org/UserGuide#Translated_pages
.. _advocacy FAQ: http://wiki.fsfe.org/Advocacy_faq_en
It is possible to use a farm, with `SisterSites`_ , to achieve a multilingual wiki.
The pages must have the same name in all languages, which is great for consistency,
but not appropriate for all wikis, especially not encyclopedias ("#redirect" pages
are still possible, but probably a pain to maintain). See
`HowTo/MultilingualWiki(sistersites)`_.
-- FranklinPiat <<Date(2010-03-21T13:48:48+0100)>>
.. _SisterSites: SisterSites
.. _HowTo/MultilingualWiki(sistersites): HowTo/MultilingualWiki(sistersites)
MoinMoin: Creating multilingual wikis and wiki engines (last edited 2012-02-23 13:12:36 by ThomasWaldmann)