Current State : WIP
DocBook to DOM Conversion
Equivalences
See the page DOM DocBook and HTML 2010/DocBook-DOM Equivalences.
About this converter
The idea is to display all the information, with some basic styling that makes it possible to tell the different elements apart. We will naturally lose information when converting from DocBook to the DOM tree, since the latter cannot carry any semantic meaning.
Exceptions
At this time the converter correctly handles two kinds of errors in your DocBook document:
- Invalid XML Document
If the XML parser cannot parse your document, you will see this error. This usually happens when you forget to close a tag or make a typo in a tag name.
- NameSpaceError
If your DocBook document lacks the correct namespace declaration (xmlns='http://docbook.org/ns/docbook'), you will get this error. This usually happens when you forget to put the declaration on the root element, or when the namespace address contains a mistake.
At this time the DocBook_IN converter is not very strict about the DocBook specification: you get no indication of whether your document is valid DocBook, even if the converter manages to convert it into our internal DOM tree. For best results, however, the converter expects a valid DocBook 5 document.
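The two checks described above can be illustrated with a minimal sketch. It uses the standard library's xml.etree.ElementTree rather than the converter's own code, and the function name check_docbook and its return values are hypothetical, chosen only to mirror the two error names.

```python
import xml.etree.ElementTree as ET

DOCBOOK_NS = "http://docbook.org/ns/docbook"


def check_docbook(source):
    """Return None if both checks pass, else the name of the error.

    Hypothetical helper for illustration; not the converter's API.
    """
    try:
        root = ET.fromstring(source)
    except ET.ParseError:
        # Unclosed or mistyped tags end up here.
        return "Invalid XML Document"
    # ElementTree stores a namespaced tag as "{uri}localname".
    if not root.tag.startswith("{" + DOCBOOK_NS + "}"):
        return "NameSpaceError"
    return None
```

Note that passing both checks says nothing about DocBook validity; it only means the document is well-formed XML whose root element is in the DocBook namespace.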
Status
Current State : WIP (Early stage)
Page Structure
DocBook element | Moin_page equivalence | Test | Conversion | Comments
<article> | <page> or <div> | Done | Done | Need to check if <page> can contain other <page>
<simpara> | <p> | Done | Done |
<para> | <p> | Done | Done |
<formalpara> | <p title=> | Done | Done |
Sections
Sections in DocBook define the parts between headings. There are two systems for sections: recursive and numbered. The first uses only the <section> tag and lets the processor work out the level. The second explicitly defines a level from 1 to 5 with tags of the form <sectN>.
The two systems are mutually exclusive, so you cannot have a numbered section inside a recursive one, and vice versa. However, both systems can be used in the same document as long as this rule is respected.
DocBook element | Moin_page equivalence | Test | Conversion | Comments
<sectX> | <h outline-level=X> if <title> | Done | Done |
<section> | <h outline-level=X> if <title> | Done | Done |
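As an illustration of how an outline level could be derived for both section systems, here is a minimal sketch. It assumes the level of a recursive <section> is its nesting depth and the level of a <sectN> is N; the function name headings is hypothetical and the code is not taken from the converter.

```python
import re
import xml.etree.ElementTree as ET

DB = "{http://docbook.org/ns/docbook}"


def headings(elem, depth=0):
    """Yield (outline_level, title) for every titled section below elem."""
    for child in elem:
        name = child.tag.replace(DB, "")
        numbered = re.fullmatch(r"sect([1-5])", name)
        if name == "section":
            # Recursive system: the level comes from the nesting depth.
            level = depth + 1
        elif numbered:
            # Numbered system: the level is written in the tag name.
            level = int(numbered.group(1))
        else:
            continue
        title = child.find(DB + "title")
        if title is not None:
            # A heading is only produced when the section has a <title>.
            yield level, title.text
        yield from headings(child, level)
```

Calling list(headings(root)) on a parsed document gives the headings in document order, each paired with the value that would go into outline-level.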
Lists
Here is a DocBook file with an example of each list: List.xml
And here is the PDF produced by a conversion using two classic tools, xslt and fop: List.pdf
DocBook element | Moin_page equivalence | Test | Conversion | Comments
<itemizedlist> | <list item-label-generate='unordered'> | Done | Done |
<orderedlist> | <list item-label-generate='ordered'> | Done | Done |
<variablelist> | <list> | Done | Done |
<varlistentry> | <list-item> | Done | Done |
<term> | <list-item-label> | Done | Done |
<glosslist> | <list> | Done | Done |
<glossentry> | <list-item> | Done | Done |
<glossterm> | <list-item-label> | Done | Done |
<glossdef> | <list-item-body> | Done | Done |
<procedure> | <list item-label-generate='ordered'> | Done | Done | The official XSL stylesheet adds a Procedure title, but we don't
<step> | <list-item><list-item-body> | Done | Done |
<stepalternatives> | <list-item><list-item-body> | Done | Done |
<substep> | <list item-label-generate='ordered'> | Done | Done |
<qandaset> | <list> | Done | Done | See QandA set.
<question> | quite complex, need translation | Done | Done | See QandA set.
<answer> | quite complex, need translation | Done | Done | See QandA set.
<segmentedlist> | <list> | Done | Done | Create variable list with pre-defined label.
<segtitle> | Just save it | Done | Done |
<seglistitem> | <list-item> | Done | Done |
<seg> | Saved label+<list-item-body> | Done | Done |
<simplelist> | <list item-label-generate='unordered'> | WIP | WIP | Does not support the type attribute yet. It also uses bullets for rendering.
<member> | <list-item><list-item-body> | Done | Done |
<listitem> | <list-item><list-item-body> | WIP | WIP | Conversion differs depending on the parent list.
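To make the simplest rows of the list equivalences concrete, here is a rough sketch covering only <itemizedlist>, <orderedlist> and <procedure>. It is a simplification under stated assumptions: the output uses plain tag names without the moin_page namespace, item content is flattened to text, and the names LIST_KINDS and convert_list are hypothetical.

```python
import xml.etree.ElementTree as ET

DB = "{http://docbook.org/ns/docbook}"

# DocBook list element -> value of item-label-generate (subset of the table).
LIST_KINDS = {
    "itemizedlist": "unordered",
    "orderedlist": "ordered",
    "procedure": "ordered",
}


def convert_list(elem):
    """Build a simplified <list> tree from a DocBook list element."""
    name = elem.tag.replace(DB, "")
    out = ET.Element("list", {"item-label-generate": LIST_KINDS[name]})
    for item in elem:
        # Only <listitem> and <step> children become list items here.
        if item.tag.replace(DB, "") not in ("listitem", "step"):
            continue
        list_item = ET.SubElement(out, "list-item")
        body = ET.SubElement(list_item, "list-item-body")
        # Flatten the item's content to text to keep the sketch short.
        body.text = "".join(item.itertext()).strip()
    return out
```

A real conversion would recurse into the children of each item instead of flattening them, and would handle the other list types from the table.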
QandA set
DocBook element | Moin_page equivalence | Test | Conversion | Comments
<qandadiv> | <list> | Nothing | Nothing |
defaultlabel='number' | <list item-label-generate='unordered'> | Done | Done |
<question><answer> | <list-item><list-item-body><p>Q</p><p>A</p></></> | Done | Done |
defaultlabel='qanda' | <list> | Done | Done |
<question> | <list-item><list-item-label>Q:</><list-item-body>Q Body</></> | Done | Done |
<answer> | <list-item><list-item-label>A:</><list-item-body>A Body</></> | Done | Done |
Tables
There are two kinds of tables in DocBook: db.html.table and db.cals.table.
db.html.table
Since db.html.table is the same as the usual HTML table, the code, tests and equivalences are broadly similar to those of the HTML_IN converter.
DocBook element | Moin_page equivalence | Test | Conversion | Comments
<informaltable> | <table> | Done | WIP | see title
<table> | <table> | Done | WIP | see title
<thead> | <table-header> | Done | Done |
<tfoot> | <table-footer> | Done | Done |
<tbody> | <table-body> | Done | Done |
<tr> | <table-row> | Done | Done |
<td> | <table-cell> | Done | Done |
<th> | <table-cell> | Done | Done |
<col> | Save attribute to put it on the col | Nothing | Nothing |
<colspec> | Save attribute to put it on the col | Nothing | Nothing |
db.cals.table
The converter does not support rowspan and colspan with db.cals.table.
DocBook element | Moin_page equivalence | Test | Conversion | Comments
<informaltable> | <table> | Done | WIP | see title
<table> | <table> | Done | WIP | see title
<row> | <table-row> | Done | Done |
<entry> | <table-cell> | Done | Done |
<entrytbl> | <table-cell><table> | Done | Done |
<colgroup> | Save attribute to put it on the col | Nothing | Nothing |
Misc
DocBook element | Moin_page equivalence | Test | Conversion | Comments
<footnote> | <note note-class="footnote"> | Done | Done |
<quote> | <quote> | Done | Done |
<blockquote> | <blockquote> | Done | Done | attribution element converted to source.
<attribution> | source attribute of blockquote | Done | Done | See blockquote
<trademark> | <span element=trademark> | Done | Done | add a trademark at the end
<sbr> | <line-break> | Done | Done |
<email> | Add the corresponding macro in the DOM | Nothing | Nothing |
<tag namespace class> | <span class="db-tag-class">{namespace}tag | Done | Done |
Links
Currently there are three different kinds of links in the DocBook reference. All of them use the xlink namespace for their attributes, except for the linkend attribute, which defines a link within the document.
So it is fairly simple to convert these links into the DOM tree, since it also uses the xlink namespace for links.
However, I also propose handling the old link style from DocBook 4.X, even though the converter does not directly support DocBook 4.X. Many applications still use <ulink url=url> for links; in particular, the Moin 1.X DocBook formatter outputs links this way, so it is worth keeping.
The endterm attribute is not supported for any of these elements.
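The three sources of a link target mentioned above (xlink:href, linkend, and the DocBook 4.X url attribute on <ulink>) could be resolved along these lines. This is a sketch, not the converter's code: the function name link_target is hypothetical, and turning linkend into a "#id" fragment is an assumption about how an in-document link would be represented.

```python
import xml.etree.ElementTree as ET

DB = "{http://docbook.org/ns/docbook}"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def link_target(elem):
    """Return the target of a DocBook link element, or None."""
    href = elem.get(XLINK_HREF)
    if href is not None:
        # DocBook 5 style: the attribute is already in the xlink namespace.
        return href
    linkend = elem.get("linkend")
    if linkend is not None:
        # Link within the document (assumed here to map to a fragment).
        return "#" + linkend
    if elem.tag == DB + "ulink":
        # DocBook 4.X style, e.g. as written by the Moin 1.X formatter.
        return elem.get("url")
    return None
```

The endterm attribute is deliberately not looked at, matching the limitation stated above.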
Object
DocBook element | Moin_page equivalence | Test | Conversion | Comments
<inlinemediadata> | <span element="inlinemediadata"> | Done | Done |
<mediadata> | <div html:class="mediadata"> | Done | Done |
<audioobject> | See *Object conversion | Done | Done |
<imageobject> | See *Object conversion | Done | Done |
<textobject> | See *Object conversion | Done | Done |
<videoobject> | See *Object conversion | Done | Done |
<imagedata> | <object type='image/'> | Done | Done |
<audiodata> | <object type='audio/'> | Done | Done |
<videodata> | <object type='video/'> | Done | Done |
Code
DocBook element | Moin_page equivalence | Test | Conversion | Comments
<screen> | <blockcode> | Done | Done | Need to check whether the DOM tree supports linenumbering and language
<programlisting> | <blockcode> | Done | Done | Need to check whether the DOM tree supports linenumbering and language
<literal> | <code> | Done | Done |
<literallayout> | <blockcode html:class="db-literallayout"> | Done | Done |
<code> | <code> | Done | Done | Check language attribute
<computeroutput> | <code> | Done | Done |
<markup> | <code> | Done | Done |
Style Elements
DocBook element | Moin_page equivalence | Test | Conversion | Comments
<emphasis> | <emphasis> | Done | Done |
<emphasis role="strong"> | <strong> | Done | Done |
<phrase> | <span> | Done | Done |
<subscript> | <span baseline-shift> | Done | Done |
<superscript> | <span baseline-shift> | Done | Done |
Admonitions
DocBook element | Moin_page equivalence | Test | Conversion | Comments
<caution> | <admonition type='caution'> | Done | Done |
<important> | <admonition type='important'> | Done | Done |
<note> | <admonition type='note'> | Done | Done |
<tip> | <admonition type='tip'> | Done | Done |
<warning> | <admonition type='warning'> | Done | Done |
Standard Attribute
- xml:base
Ignored tags
The following tags are completely ignored by the converter, so the children of these elements are not handled either.
['abstract', 'artpagenums', 'annotation', 'author', 'authorgroup', 'authorinitials', 'bibliocoverage', 'biblioid', 'bibliomisc', 'bibliomset', 'bibliorelation', 'biblioset', 'bibliosource', 'collab', 'confdates', 'confgroup', 'confnum', 'confsponsor', 'conftitle', 'contractnum', 'contractsponsor', 'contrib', 'copyright', 'cover', 'edition', 'editor', 'extendedlink', 'issuenum', 'itermset', 'keyword', 'keywordset', 'legalnotice', 'org', 'orgname', 'orgdiv', 'otheraddr', 'othercredit', 'pagenums', 'personblurb', 'printhistory', 'productname', 'productnumber', 'pubdate', 'publisher', 'publishername', 'releaseinfo', 'titleabbrev', 'revhistory', 'seriesvolnums', 'subjectset', 'volumenum', 'bibliodiv', 'biblioentry', 'bibliography', 'bibliolist', 'bibliomixed', 'biblioref', 'citation', 'callout', 'calloutlist', 'co', 'imageobjectco', 'area', 'areaset', 'areaspec', 'classname', 'classsynopsis', 'classsynopsisinfo', 'constructorsynopsis', 'fieldsynopsis', 'funcdef', 'funcparams', 'funcprototype', 'funcsynopsis', 'funcsynopsisinfo', 'function', 'group', 'initializer', 'interfacename', 'methodname', 'methodparam', 'methodsynopsis', 'ooclass', 'ooexception', 'oointerface', 'varargs', 'void', 'guibutton', 'guiicon', 'guilabel', 'guimenu', 'guimenuitem', 'guisubmenu', 'info', 'bridgehead', 'constraint', 'constraintdef', 'lhs', 'nonterminal', 'rhs', 'msg', 'msgaud', 'msgentry', 'msgexplan', 'msginfo', 'msglevel', 'msgmain', 'msgorig', 'msgrel', 'msgset', 'msgsub', 'msgtext', 'refclass', 'refdescriptor', 'refentry', 'refentrytitle', 'reference', 'refmeta', 'refmiscinfo', 'refname', 'refnamediv', 'refpurpose', 'refsect1', 'refsect2', 'refsect3', 'refsection', 'refsynopsisdiv', 'toc', 'tocdiv', 'tocentry', 'arc', 'spanspec', 'xref', 'index', 'indexdiv', 'indexentry', 'indexterm', 'primary', 'primaryie', 'secondary', 'secondaryie', 'see', 'seealso', 'tertiary', 'tertiaryie']
The ignored tags are mainly the "info" and bibliography elements. For the info elements, the DocBook documentation states the following:
Processing expectations Suppressed. Many of the elements in this wrapper may be used in presentation, but they are not generally printed as part of the formatting of the wrapper. The wrapper merely serves to identify where they occur.
So we do not process the "info" elements. Later, we can imagine a metadata processor for MoinMoin that would extract such data.
We do not support the bibliography either, since it would only be useful with a full environment for handling bibliographies. Bibliography support could also be added to MoinMoin later.
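The effect of ignoring a tag together with all of its children can be shown with a small sketch that collects the text a reader would still see. The set IGNORED below is only a three-element excerpt of the full list, and the function name visible_text is hypothetical.

```python
import xml.etree.ElementTree as ET

DB = "{http://docbook.org/ns/docbook}"

# Small excerpt of the ignored tags, for illustration only.
IGNORED = {"info", "bibliography", "indexterm"}


def visible_text(elem):
    """Collect text, skipping ignored elements and everything inside them."""
    if elem.tag.replace(DB, "") in IGNORED:
        # The whole subtree is dropped, not just the element itself.
        return ""
    parts = [elem.text or ""]
    for child in elem:
        parts.append(visible_text(child))
        # Text following an ignored child still belongs to the parent.
        parts.append(child.tail or "")
    return "".join(parts)
```

Text that follows an ignored element is kept, because it belongs to the parent rather than to the ignored element.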
Inline Elements not handled
WIP
The following elements are simply handled using a <span element="element-name">.
- abbrev
- address
- accel
- acronym
- affiliation
- alt
- anchor
- city
- command
- constant
- country
- database
- date
- errorcode
- errorname
- errortext
- errortype
- exceptionname
- fax
- filename
- firstname
- firstterm
- foreignphrase
- hardware
- holder
- honorific
- jobtitle
- keycap
- keycode
- keycombo
- keysym
- lineannotation
- manvolnum
- mousebutton
- option
- optional
- package
- person
- personname
- phone
- pob
- postcode
- prompt
- remark
- replaceable
- returnvalue
- shortaffil
- shortcut
- state
- street
- surname
- symbol
- systemitem
- termdef
- type
- uri
- userinput
- varname
- wordasword
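The generic fallback for these inline elements might look like the following sketch. It assumes only the element's direct text is carried over and uses a plain, un-namespaced <span>; the function name inline_fallback is hypothetical.

```python
import xml.etree.ElementTree as ET

DB = "{http://docbook.org/ns/docbook}"


def inline_fallback(elem):
    """Wrap an unhandled inline element in <span element="name">."""
    name = elem.tag.replace(DB, "")
    span = ET.Element("span", {"element": name})
    # Only the direct text is copied; nested children are left out here.
    span.text = elem.text
    return span
```

Keeping the original tag name in the element attribute preserves a trace of the DocBook semantics, so a later stage could still style or process these spans.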
Block Element not handled in DOM Tree
Some elements do not have a direct equivalent in our DOM tree, but to keep the meaning we convert the following tags using <div class="db_tag.name">
The converter also checks whether the first child element is a title; if so, it is added to the html:title attribute of the <div> element.
- acknowledgements
- appendix
- caption
- chapter
- cmdsynopsis
- colophon
- dedication
- epigraph
- example
- figure
- equation
- part
- partintro
- screenshot
- set
- setindex
- sidebar
- simplesect
- subtitle
- synopsis
- synopfragment
- task
- taskprerequisites
- taskrelated
- tasksummary
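The block fallback, including the title check, can be sketched as below. Several details are assumptions made for the example: the class value is built as "db-" plus the tag name, the title is stored in a plain title attribute instead of html:title, and the function name block_fallback is hypothetical.

```python
import xml.etree.ElementTree as ET

DB = "{http://docbook.org/ns/docbook}"


def block_fallback(elem):
    """Wrap an unhandled block element in a <div> carrying its tag name."""
    name = elem.tag.replace(DB, "")
    div = ET.Element("div", {"class": "db-" + name})
    children = list(elem)
    # Only a <title> that is the first child is used as the div's title.
    if children and children[0].tag == DB + "title":
        div.set("title", children[0].text or "")
    return div
```

The sketch only builds the wrapper; converting the remaining children and appending them to the <div> is left out.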
title or <h> if child of a section.
Informal*