Description
Have Moin generate pages which are valid XHTML documents
Pages generated by Moin are (mostly) valid HTML 4.01, but not XHTML 1.0 Strict (nor later revisions of XHTML such as 1.1).
Note: Although this seems to be a well-known problem (it is mentioned as an issue in the changelog for moin version 1.2, dated 2004-02-20, and discussed in DeronMeranda/ChallengesWithXHTML), no feature request exists for it. That is why I'm adding this page.
I find this feature quite important for a couple of reasons:
- my primary interest is in writing moin macros that output various XML dialects (mainly MathML); as long as moin does not output valid XML documents, such macros are pointless
- having moin output valid XML documents will enable using moin not only as the final component of a web-based architecture, but also as an intermediate component whose output is post-processed by other applications able to process XML documents
In DeronMeranda/ChallengesWithXHTML it is argued that the proper solution for easily outputting XHTML would be to switch to an internal tree-based representation for pages. Although I agree with that, I found that XHTML output can be achieved even without that big architectural change, at least for the legacy parsers. It is just a matter of tuning them so that the tags they output are balanced.
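As a rough illustration of what "balanced" output means here, the following sketch tracks open tags on a stack and closes whatever is still open. The class name `BalancedWriter`, its methods, and the list of empty elements are hypothetical and chosen for illustration only; this is not Moin's formatter API.

```python
from xml.sax.saxutils import escape, quoteattr


class BalancedWriter:
    """Hypothetical helper (not part of Moin) that keeps emitted tags balanced."""

    # Elements that never have content; emitted in self-closing form.
    EMPTY = {"br", "hr", "img", "input", "meta", "link"}

    def __init__(self):
        self.parts = []   # fragments of markup emitted so far
        self.stack = []   # names of tags that are currently open

    def open(self, tag, **attrs):
        attr_text = "".join(
            " %s=%s" % (name, quoteattr(value))
            for name, value in sorted(attrs.items())
        )
        if tag in self.EMPTY:
            self.parts.append("<%s%s />" % (tag, attr_text))
        else:
            self.parts.append("<%s%s>" % (tag, attr_text))
            self.stack.append(tag)

    def text(self, data):
        # Escape &, < and > so character data cannot break the markup.
        self.parts.append(escape(data))

    def close(self, tag):
        if tag not in self.stack:
            return  # stray close request: ignore it rather than unbalance the output
        # Close inner tags first, then the requested one.
        while self.stack:
            top = self.stack.pop()
            self.parts.append("</%s>" % top)
            if top == tag:
                break

    def finish(self):
        # Close everything that is still open at the end of the page.
        while self.stack:
            self.parts.append("</%s>" % self.stack.pop())
        return "".join(self.parts)
```

For example, opening `p` and `b`, writing some text and a `br`, and then closing only `p` still yields properly nested markup, because the inner `b` is closed first.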
I'm (StefanoZacchiroli) willing to work on this and produce a patch. Still, before doing so, I would like to know whether such a patch would be of interest to the Moin developers and whether it would clash with any development directions for moin 1.6.
Patches
Here I (StefanoZacchiroli) will collect the work-in-progress patches I'm producing. I will not add patches under MoinMoinPatch until a decent result (such as valid XML output) has been reached. If the Moin developers want to review or commit my patches before that, it would of course be great!
Discussion
For development, you have to use the moin/1.6 branch and merge often, or your work will likely be outdated by the time you are finished. Of course we would like to have XHTML output, but this isn't just adding "/>" everywhere; it needs a major effort (including tests to ensure it stays that way once reached). We may need this internal tree stuff for other reasons, too, but as we currently have about 4 SOC projects running, work on it hasn't begun yet. If you want to join us and do long-term work on that topic, you're very welcome! BTW, there is absolutely NO point in 99.9% XHTML. -- ThomasWaldmann 2006-06-14 15:07:16
That's right. And the fact that most people nowadays think so has resulted in the HTML 5 specification as it is today. To get valid HTML documents, we should put all our effort into that. I made a feature request for it, FeatureRequests/SkipHTML4OrXHTMLValidationGoForHTML5, and want to give my -1 to this one.
Right now I don't have enough knowledge of moin internals (I have only written some parsers and formatters) to be willing to take on the duty of moving to an internal tree structure. Is that what you're asking for? If it is not, would it be useful to work on the dirty work of closing tags, changing the doctype, ... as recently added in DeronMeranda/ChallengesWithXHTML? Personally I think it would be useful, unless moving to the tree structure is the next change on the TODO list of you core developers. What do you think? -- StefanoZacchiroli 2006-06-14 15:27:12
I don't think current core developers have time to work on internal tree stuff before next autumn/winter. This summer, it is mainly SOC projects (backend/mimetype, sync, search stuff), maybe some integration work afterwards and working on releasable code of everything that has been done. -- ThomasWaldmann 2006-06-14 15:59:34
BTW, the current Moin is not even HTML 4.01 compliant, despite what it says at the bottom of the page! Anyway, I think there's a lot of simple yet tedious work that could be done if somebody has time, especially all the simple things which are low risk and won't break anything, such as fixing the "/>" self-closers, HTML-escaping all inline javascript fragments, etc. It may not have any immediate value, but it is tedious work that should eventually get done anyway, so doing it during a relatively calm period makes sense if somebody's willing to expend the effort. I think it really comes down to how much work you take on yourself (such as lots of testing and keeping up with the trunk branch), and whether any of the core developers would be willing to take the time to apply your patches. However, the more I look at this, the more I come to the realization that XHTML is going to be much harder without some new internal tree-like structures, and I expect that will be quite a large change if/when it occurs. I think this could proceed in phases though: first just output well-formed XML, then valid/standard HTML (getting javascript code to use the standard DOM interface, for example), and then tackle XHTML. Just the well-formed XML stage could be beneficial to some users, even if it's not yet good enough to allow mixed XHTML+MathML+SVG documents. Even with well-formed XML, it may finally be possible to write some new automated test hooks, like xmllint. -- DeronMeranda 2006-06-14 17:46:30
I started the grunt work toward the first milestone of outputting well-formed XML. The status of my work is represented by the patches above. Do you think I should also open a subpage of MoinMoinPatch (maybe with just a link to here)? -- StefanoZacchiroli 2006-06-16 13:31:59
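The well-formedness milestone discussed in this thread could be checked automatically along the following lines. This is only a sketch using Python's standard library parser; the function name `check_well_formed` is hypothetical, and nothing here is an existing Moin test hook. It checks well-formedness only, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def check_well_formed(markup):
    """Return (True, "") if markup parses as XML, else (False, parser message).

    Hypothetical helper for illustration. Named entities such as &nbsp;
    are reported as errors here because no DTD is loaded to define them.
    """
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return False, str(err)
    return True, ""
```

A test could feed rendered page output to this function and fail on the first page that does not parse. On the command line, `xmllint --noout page.html` performs a comparable check.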