Description
When you use the Anchor macro with an anchor name that contains an underscore, the underscore is removed from the resulting anchor name. As a result, a link such as [[#anchor_name]] (with the underscore) fails.
Steps to reproduce
- Edit a wiki page and add an anchor whose name contains an underscore: <<Anchor(anchor_name)>>
- Also add a link to that anchor: [[#anchor_name]]
- Save the page.
- The anchor link does not work, because the anchor is now named "anchorname". [[#anchorname]] works, however.
Example
underscore
...
colon
...
(see discussion below)
Component selection
Anchor macro, wikiutil.anchor_name_from_text (does anyone understand why it strips the "_"?)
Details
this wiki
MoinMoin Version | 1.7.x, 1.8.0 |
OS and Version | |
Python Version | |
Server Setup | |
Server Details | |
Language you are using the wiki in (set in the browser/UserPreferences) | |
Workaround
Don't use anchor names that contain an underscore, or omit the underscore when creating the link.
Discussion
Relevant Specs:
http://www.w3.org/TR/html4/types.html#h-6.2 - must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").
Other Specs:
http://dev.w3.org/html5/spec/Overview.html#the-id-attribute - at least one char, no spaces (no whitespace)
http://www.w3.org/TR/xhtml1/#guidelines - strings matching the pattern [A-Za-z][A-Za-z0-9:_.-]* should be used
Conclusion:
- [A-Za-z][A-Za-z0-9:_.-]* matches all specs and recommendations
http://hg.moinmo.in/moin/1.8/rev/a7dc3cc36362 was too restrictive
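The pattern from the conclusion can be checked mechanically. The following is only a minimal sketch: the helper name is_valid_anchor_id is hypothetical and not part of MoinMoin; it simply tests a candidate ID against [A-Za-z][A-Za-z0-9:_.-]*.

```python
import re

# Pattern taken from the conclusion above.
VALID_ID = re.compile(r'[A-Za-z][A-Za-z0-9:_.-]*')

def is_valid_anchor_id(name):
    """Hypothetical helper: True if the whole string matches the pattern."""
    # fullmatch() requires the entire string to match, not just a prefix.
    return VALID_ID.fullmatch(name) is not None

for candidate in ('anchor_name', 'a:b', '1st', 'two words'):
    print(candidate, is_valid_anchor_id(candidate))
```

Under this pattern "anchor_name" and "a:b" are accepted, while "1st" (leading digit) and "two words" (whitespace) are rejected.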
(22:56) < TheSheep> as long as _ are not used in default themes for stylable element IDs, I think it's fine
(23:04) < ThomasWal> we can avoid that by convention, I think
(23:05) < TheSheep> exactly
Proposed code
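    import urllib

    def anchor_name_from_text(text):
        '''
        Generate an anchor name from the given text.

        This function generates valid HTML IDs matching: [A-Za-z][A-Za-z0-9:_.-]*
        '''
        # UTF-7 yields ASCII-only bytes; quote_plus then escapes unsafe characters
        quoted = urllib.quote_plus(text.encode('utf-7'))
        # '%' and '+' are not allowed in IDs, so map them to '.' and '_'
        res = quoted.replace('%', '.').replace('+', '_')
        # IDs must begin with a letter
        if not res[:1].isalpha():
            return 'A%s' % res
        return res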
This code is a revert of changeset http://hg.moinmo.in/moin/1.8/rev/a7dc3cc36362.
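To try the proposed function outside the wiki, here is a sketch adapted to Python 3. The adaptation is an assumption on my part: the proposal targets the Python 2 urllib.quote_plus, while this version uses urllib.parse.quote_plus, which also accepts bytes. The logic is otherwise the same as in the proposal.

```python
from urllib.parse import quote_plus

def anchor_name_from_text(text):
    """Python 3 adaptation (assumption) of the proposed function."""
    # UTF-7 yields ASCII-only bytes; quote_plus escapes anything unsafe
    # as %XX and turns spaces into '+'.
    quoted = quote_plus(text.encode('utf-7'))
    # '%' and '+' are not valid in IDs: map them to '.' and '_'.
    res = quoted.replace('%', '.').replace('+', '_')
    # IDs must start with a letter, so prefix 'A' otherwise.
    if not res[:1].isalpha():
        return 'A%s' % res
    return res

for sample in ('anchor_name', 'two words', 'a:b', '1st'):
    print(sample, '->', anchor_name_from_text(sample))
# anchor_name -> anchor_name
# two words -> two_words
# a:b -> a.3Ab
# 1st -> A1st
```

The underscore survives, so [[#anchor_name]] and <<Anchor(anchor_name)>> resolve to the same ID. The colon is still rewritten to ".3A".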
Further ideas
If we run anchor IDs (link targets) as well as the fragment part of links through the same transformation, they will always match.
If the transformation is an identity transform for valid IDs, there should be no problem.
The current code is almost an identity for valid IDs, except for ":" which is transformed to ".3A":
> assert u'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789:_.-' == u'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789.3A_.-' (fails)
We could modify the function to keep ":" in IDs, then stuff like this could work:
[[#Übler:Dübel]] ... <<Anchor(Übler:Dübel)>>
Done by http://hg.moinmo.in/moin/1.8/rev/1f5b1e0423fb.
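The idea of keeping ":" can be sketched as follows. This is not the code from the changeset referenced above, which is not reproduced on this page. It assumes Python 3 (urllib.parse.quote_plus) and assumes that passing safe=':' is an acceptable way to leave colons unescaped. The function name anchor_name_keep_colon is hypothetical.

```python
import re
from urllib.parse import quote_plus

VALID_ID = re.compile(r'[A-Za-z][A-Za-z0-9:_.-]*')

def anchor_name_keep_colon(text):
    """Hypothetical variant: like the proposed function, but keeps ':'."""
    # safe=':' tells quote_plus not to escape colons (assumption, see above).
    quoted = quote_plus(text.encode('utf-7'), safe=':')
    res = quoted.replace('%', '.').replace('+', '_')
    if not res[:1].isalpha():
        return 'A%s' % res
    return res

# Every character allowed in a valid ID should pass through unchanged.
valid_chars = ('ABCDEFGHIJKLMNOPQRSTUVWXYZ'
               'abcdefghijklmnopqrstuvwxyz0123456789:_.-')
assert anchor_name_keep_colon(valid_chars) == valid_chars

# Non-ASCII names still produce an ID that matches the pattern.
name = anchor_name_keep_colon('Übler:Dübel')
assert VALID_ID.fullmatch(name) is not None
print(name)
```

Because the same function is applied to both the link fragment and the anchor, [[#Übler:Dübel]] and <<Anchor(Übler:Dübel)>> end up with identical IDs.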
Plan
- Priority:
- Assigned to:
Status: fixed by http://hg.moinmo.in/moin/1.8/rev/30ffb215bf6e