Details
- Applies to
1.5.0 (MoinMoin/formatter/ sources)
- Purpose
- Improve formatter methods to allow arbitrary HTML/CSS attributes to be passed consistently
- Description
The main purpose of this patch is to clean up the internal APIs for formatters, while maintaining as much backwards compatibility as possible. The cleanup provides a more natural and consistent calling convention for all the different formatting methods. It also especially tries to get the HTML formatter up to a more standard-compliant output. This fix will also enable a lot of other CSS and styling features and enhancements.
See also the description on FeatureRequests/FormatterApiConsistencyForHtmlAttributes
Patch
formatter-patch-r4.diff (rev 4) (committed to moin--refactor--1.5--patch-3)
The bulk of the changes are specific to the text_html.py formatter. Changes in the other files mostly consist of just allowing optional keyword arguments to be passed to the methods (which are mostly ignored).
The HTML formatter is now capable of generating XHTML compliant, or at least well-formed XML, output---for those parts of the document that the formatter generates. However this is still not enough to make MoinMoin produce correct XHTML, but it's a step in the right direction. This patch allows all elements (tags) to have almost any standard attribute added to them, and enables this with consistent method calling conventions.
The docstrings to many of the methods have been improved.
This also incorporates a fix for MoinMoinBugs/StrikeElementNotStrict
Older patches. These are for historic reference only, do not use.
Rev 2: formatter-patch.diff
Rev 1: (as separate diffs) base.py.diff dom_xml.py.diff text_gedit.py.diff text_html.py.diff text_plain.py.diff text_xml.py.diff xml_docbook.py.diff
For macro and extension writers
If you write or maintain any macros or other extensions which make calls to the formatter directly, you should familiarize yourself with the new calling conventions. Most formatter methods now take and number of additional arbitrary keyword-arguments. This can be useful for setting CSS styles or other things.
You may want to start by reading the docstring for the function rewrite_attribute_name() in the file MoinMoin/formatter/text_html.py.
Also note that this patch has some preliminary support for properly indenting HTML code according to its structure, which can make debugging easier. But this is not perfect yet, primarily because of complexities with how the page caching works.
Discussion
Thanks for putting lots of effort in improving the formatter. I am considering applying your patches after some changes:
base.py.diff has bugs in image function. Either do something working or raise NotImplementedError.
dom_xml.py.diff: def number_list(self, on, type=None, start=None, **k2) and image func looks strange, too.
- text_html.py.diff:
please remove all XMLisms like " />" "xml:" etc. - we are still at html 4.01 and even that isn't completely valid. We currently have no chance of producing valid xml so it shouldn't look like either.
- formatAttributes needs refactoring.
if len(kw)>0: short: if kw: ?
I uploaded a new version of this patch (rev 2). This includes fixing the small bugs, most of the major refactoring work, lots of coding convention PEP8 fixes, added even more complete docstrings and comments. It does not address yet all your issues. I thought it was important to get the big changes done first. I've tested with IE and Firefox, Windows and Linux, and I think these changes are pretty sound now.
The refactoring of the formatAttributes method should make the code much clearer, but you can judge that. There are two new helper functions at the top of the file which offload a lot of the special-cases.
Also I didn't react to your XML-isms complaint yet. I think it needs more thought; we're actually much closer to clean HTML than you may think. I at least want to allow the formatter to produce well-formed XML (as well as allow plugin/macro writers to be able to produce clean XHTML even if moin does not). Check what I've done with this revision, and we can argue the other bit later. -- DeronMeranda 2006-01-16 20:30:31
It's true that ultimately XHTML has many advantages over HTML. But remmebr that: 1) XHTML is not valid HTML, the fact that some browsers render it almost properly as "tag soup" means nothing, 2) to exploit the benefits of XML you have to serve the files with proper MIME type, which will make "modern" browser just ask where to save it, 3) to do it right you should generate it as a tree first and then convert it into XML. I think that we just need a completely new formatter application_xhtml_xml.py. -- RadomirDopieralski 2006-01-16 22:26:13
- This patch is not trying to make moin output XHTML. But the little bit of XML-isms it contains: 1) no longer blatently break XHTML for little reason, 2) gives extension/macro authors the power now to output correct well-formed html if they want to, 3) made it easier to eliminate some previous tag mis-nestings. Furthermore the small bits of xml syntax I added are browser safe, even for really old ones. Also, from my experience having correct xhtml goes a long way to solving a lot of otherwise-weird CSS problems that can happen with quirks-mode html. I should mention that I am, in fact, already running a customized moinmoin which is producing correct strict XHTML output, except for the gui editor part which I haven't fixed yet. That includes serving up with "application/xhtml+xml" mime type (which Firefox handles natively, but not IE yet); using CDATA-sections to hold javascript code, etc. The patches here are a big and important part of that; although obviously I had to make several patches to other files (mostly theme stuff) which are outside the scope of this patch. My point is that moinmoin isn't as far away as you may think.
As far as this patch, I kept the XML-isms very small and safe; but took advantage of this restructuring to at least do it right. Yes, producing XHTML would perhaps be easier going through an intermediate DOM structure; but that's not part of 1.5 and would involve major architectural changes. I believe that's being investigated for 2.0+ Anyway this patch is probably not quite complete yet--I'm sure to get more feedback and I still want to test it harder. -- DeronMeranda 2006-01-17 06:43:44
- This patch is not trying to make moin output XHTML. But the little bit of XML-isms it contains: 1) no longer blatently break XHTML for little reason, 2) gives extension/macro authors the power now to output correct well-formed html if they want to, 3) made it easier to eliminate some previous tag mis-nestings. Furthermore the small bits of xml syntax I added are browser safe, even for really old ones. Also, from my experience having correct xhtml goes a long way to solving a lot of otherwise-weird CSS problems that can happen with quirks-mode html. I should mention that I am, in fact, already running a customized moinmoin which is producing correct strict XHTML output, except for the gui editor part which I haven't fixed yet. That includes serving up with "application/xhtml+xml" mime type (which Firefox handles natively, but not IE yet); using CDATA-sections to hold javascript code, etc. The patches here are a big and important part of that; although obviously I had to make several patches to other files (mostly theme stuff) which are outside the scope of this patch. My point is that moinmoin isn't as far away as you may think.
OK, that sounds quite promising. I was a bit conservative because the last 2-3 tries for xhtml compliance only solved the small things (like " />" everywhere), but not the bigger problems (like correct list nesting). And 95% correct xhtml is practically as good as no xhtml at all. You did try some nested lists? See also that open converter bug MoinMoinBugs/IllegalListElementp relating to <p> directly within <li>. Maybe meet us on the MoinMoinChat for discussion inclusion of your patch before/after 1.5.1. -- ThomasWaldmann 2006-01-17 08:26:31
Yes, I tested lots of nested lists, and header levels, and so on, and it's okay. The correct nesting of tags is one of the primary side-effects of this patch. I also looked at the broken example pages on the MoinMoinBugs/IllegalListElementp as you suggested, and the page is getting rendered as correctly nested tags now (passes xmllint), unlike before this patch when it was a mess. But the GUI editor still fails on it, so there's more to that bug (this patch doesn't even touch the GUI editor portion). I want to get this patch very solid and safe first (I probably want at least another day for my own testing). BTW, does the new patch's refactoring look better to you? -- DeronMeranda 2006-01-17 16:23:10
Some more comments while looking at the diff:
- shouldn't that **kw get passed on to lower level functions instead of get thrown away mostly?
repair_attribute -> unquote_attribute (repair sounds negative somehow )
key = (ns, name) - you don't NEED those brackets (same for return x, y). If you used intentionally, they can stay, just be aware of it.
When you are at it anyway: please fix MoinMoinBugs/StrikeElementNotStrict also (and close it), so we don't have conflicts.
Okay. For now I'll probably just convert to <span>s with an inline style so this patch doesn't have to touch any CSS files (unless you want me to go ahead and do that now). But I'll also put a class on it too, so a stylesheet could override. I should have a final revision ready today or tomorrow. I think we already use span class="u" somewhere, so maybe do it the same way for strike.
- Yes, for underline. I'll go ahead an add new class defs to the stylesheets then and use classes only instead of inline styles. I'll use class name "strike" rather than just "s", so there's less a chance for conflict with existing wikis.
I've addressed most of your recent issues. Except when you said to pass **kw through to lower-level functions, I thought I was doing that everywhere it made sense, and I'm just not seeing the omission...please give me a hint.
One other positive thing is that I may be making accidental progress on the GUI editing bug with nested lists. Even though my previous revision produced correctly nested tags, my working-copy updated patch now also fixes a lot of the indenting and extraneous newlines in the html source. Structurally its the same document, but it seems to make the GUI editor much happier. But even if this works the GUI editor code must be unnecessarily fragile. -- DeronMeranda 2006-01-18 21:04:56
Spoke too soon. This doesn't fix the GUI editing bug. -- DeronMeranda 2006-01-19 20:34:43
New patch revision is out (rev 4). I'm hoping this is near final. If you can test, please do so and give feedback. -- DeronMeranda 2006-01-19 20:34:43
Please read arch@arch.thinkmo.de--2003-archives/moin--refactor--1.5--patch-4 for the PEP8 changes I did.
Plan
- Priority: medium
Assigned to: ThomasWaldmann
- Status: committed to moin--main (ca. 1.5.x)