{i} Current State : WIP

DocBook to DOM Conversion

Equivalences

See DOM DocBook and HTML 2010/DocBook-DOM Equivalences

About this converter

The idea is to display all the information, with some basic styling so that the different elements can be told apart. We will inevitably lose information when converting from DocBook to the DOM tree, however, since the latter cannot carry any semantic meaning.

Exceptions

At this time the converter correctly handles two kinds of error in your DocBook document:

Invalid XML Document
If the XML parser cannot parse your document, you will see this error. This usually happens when you forget to close a tag or mistype the name of a tag.
NameSpaceError

If your DocBook document does not carry the correct namespace declaration (xmlns='http://docbook.org/ns/docbook'), you will get this error. This usually happens when you forget to put xmlns='http://docbook.org/ns/docbook' in the attributes of the root element, or when the namespace address is mistyped.

At this time the DocBook_IN converter is not strict about the DocBook specification: even if it manages to convert your document into our internal DOM tree, it gives no indication of whether the document is valid DocBook. For best results, however, the converter expects a valid DocBook 5 document.
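The two error cases above can be sketched as follows. This is only an illustration built on the standard library, not the converter's actual code: the exception classes and the parse_docbook function are hypothetical names chosen to mirror the error names used on this page.

```python
import xml.etree.ElementTree as ET

DOCBOOK_NS = "http://docbook.org/ns/docbook"


class InvalidXMLDocument(Exception):
    """Hypothetical error: the input is not well-formed XML."""


class NameSpaceError(Exception):
    """Hypothetical error: the root element is not in the DocBook namespace."""


def parse_docbook(text):
    """Parse text and return the root element, or raise one of the two errors."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError as err:
        # Typical causes: an unclosed tag or a mistyped closing tag name.
        raise InvalidXMLDocument(str(err)) from err
    # ElementTree spells namespaced tags as "{namespace-uri}localname".
    if not root.tag.startswith("{%s}" % DOCBOOK_NS):
        raise NameSpaceError("root element is not in %s" % DOCBOOK_NS)
    return root
```

Note that this sketch, like the converter, only checks well-formedness and the root namespace; it does not validate the document against the DocBook schema.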

Status

{i} Current State : WIP (Early stage)

Page Structure

Sections

In DocBook, a section is the part of the document between two headings. There are two systems for sections: recursive and numbered. The first uses only the <section> tag and lets the processor work out the level. The second explicitly defines a level from 1 to 5 with tags of the form <sectN>.

The two systems are mutually exclusive, so you cannot have a numbered section inside a recursive one, or vice versa. However, both systems can be used in the same document as long as this rule is respected.
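As a rough illustration of how a processor could derive heading levels for both systems, here is a minimal sketch. The function name section_levels and its return format are assumptions made for this example; it does not enforce the mutual-exclusion rule.

```python
import re
import xml.etree.ElementTree as ET

NS = "{http://docbook.org/ns/docbook}"
NUMBERED = re.compile(r"^sect([1-5])$")


def section_levels(root):
    """Return (title, level) pairs for every section, in document order."""
    found = []

    def walk(element, depth):
        for child in element:
            name = child.tag[len(NS):] if child.tag.startswith(NS) else child.tag
            numbered = NUMBERED.match(name)
            if name == "section":
                # Recursive system: the level comes from the nesting depth.
                level = depth + 1
            elif numbered:
                # Numbered system: the level is written in the tag name.
                level = int(numbered.group(1))
            else:
                walk(child, depth)
                continue
            found.append((child.findtext(NS + "title", default=""), level))
            walk(child, level)

    walk(root, 0)
    return found
```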

Lists

Here is a DocBook file with an example of each kind of list: List.xml

And here is the PDF produced by converting it with two classic tools, XSLT and FOP: List.pdf

QandA set

Tables

There are two kinds of table in DocBook: db.html.table and db.cals.table.

Html.Table

Since db.html.table is the same as the usual HTML table, the code, tests and equivalences are broadly similar to those of the HTML_IN converter.

db.cals.table

The converter does not support rowspan and colspan with db.cals.table.

Misc

There are three different kinds of link in the DocBook reference. They all use the xlink namespace for their attributes, except for the linkend attribute, which defines a link within the document.

So it is pretty simple to convert these links into the DOM tree, since it also uses the xlink namespace for links.

However, I also propose to handle the old DocBook 4.X way of writing links, even though the converter does not directly support DocBook 4.X. Many applications still use <ulink url=url> for links; in particular, the Moin1.X DocBook formatter outputs links like this. So it is worth keeping.

/!\ The endterm attribute is not supported for any of these elements.
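The lookup order described above could be sketched like this. The function link_target is a hypothetical helper, and rendering a linkend as "#id" is an assumption made for the example, not something this page specifies.

```python
import xml.etree.ElementTree as ET

DB = "{http://docbook.org/ns/docbook}"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def link_target(element):
    """Return the target of a link-like element, or None if it has none."""
    # DocBook 5 style: the target is carried by xlink:href.
    href = element.get(XLINK_HREF)
    if href is not None:
        return href
    # Link within the document: linkend holds the id of the target element.
    linkend = element.get("linkend")
    if linkend is not None:
        return "#" + linkend  # assumption: internal targets shown as fragments
    # DocBook 4.X style: <ulink url="...">, with or without the namespace.
    if element.tag in (DB + "ulink", "ulink"):
        return element.get("url")
    return None
```

The endterm attribute is deliberately not read here, in line with the warning above.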

Object

Code

Style Elements

Admonitions

Standard Attribute

Ignored tags

The following tags are completely ignored by the converter, so even the children of these elements will not be handled.

['abstract', 'annotation', 'artpagenums', 'author', 'authorgroup',
'authorinitials', 'bibliocoverage', 'biblioid','bibliomisc', 'bibliomset', 'bibliorelation',
'biblioset', 'bibliosource', 'collab', 'confdates', 'confgroup', 'confnum', 'confsponsor',
'conftitle', 'contractnum', 'contractsponsor', 'contrib', 'copyright',
'cover', 'edition', 'editor','extendedlink', 'issuenum', 'itermset', 'keyword',
'keywordset', 'legalnotice', 'org', 'orgname', 'orgdiv', 'otheraddr', 'othercredit', 'pagenums', 'personblurb', 'printhistory',
'productname', 'productnumber', 'pubdate','publisher', 'publishername', 'releaseinfo', 'titleabbrev',
'revhistory', 'seriesvolnums','subjectset', 'volumenum', 'bibliodiv', 'biblioentry', 'bibliography',
'bibliolist', 'bibliomixed', 'biblioref', 'citation', 'callout', 'calloutlist', 'co', 'imageobjectco', 'area',
'areaset','areaspec', 'classname', 'classsynopsis', 'classsynopsisinfo', 'constructorsynopsis', 'fieldsynopsis',
'funcdef', 'funcparams', 'funcprototype', 'funcsynopsis', 'funcsynopsisinfo', 'function', 'group', 'initializer',
'interfacename', 'methodname', 'methodparam', 'methodsynopsis', 'ooclass', 'ooexception', 'oointerface', 'varargs', 'void', 
'guibutton', 'guiicon', 'guilabel', 'guimenu', 'guimenuitem', 'guisubmenu',
'info', 'bridgehead', 'constraint', 'constraintdef', 'lhs', 'nonterminal', 'rhs',
'msg', 'msgaud', 'msgentry', 'msgexplan', 'msginfo', 'msglevel', 'msgmain', 'msgorig', 'msgrel', 'msgset', 'msgsub', 'msgtext',
'refclass', 'refdescriptor', 'refentry', 'refentrytitle', 'reference', 'refmeta', 'refmiscinfo', 'refname', 'refnamediv',
'refpurpose', 'refsect1', 'refsect2', 'refsect3', 'refsection', 'refsynopsisdiv',
'toc', 'tocdiv', 'tocentry', 'arc', 'spanspec', 'xref',
'index', 'indexdiv', 'indexentry', 'indexterm',
'primary', 'primaryie', 'secondary', 'secondaryie', 'see', 'seealso',
'tertiary', 'tertiaryie' ]

The ignored tags are mainly the "info" and bibliography elements. For the info elements, the DocBook documentation indicates the following:

Processing expectations

Suppressed. Many of the elements in this wrapper may be used in presentation, but they are not generally printed as part of the formatting of the wrapper. The wrapper merely serves to identify where they occur.

So we do not process the "info" elements. Later, we can imagine a metadata processor for MoinMoin which would extract such data.

We do not support the bibliography either, since it would only be useful if we supported a full environment for handling bibliographies. Bibliography support could also be added to MoinMoin later.
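To make the "children are not handled either" behaviour of this section concrete, here is a minimal sketch that collects the text a converter would still see. The IGNORED set is only a short excerpt of the list above, and visible_text is a hypothetical helper, not part of the converter.

```python
import xml.etree.ElementTree as ET

NS = "{http://docbook.org/ns/docbook}"
# Short excerpt of the ignored tags, for illustration only.
IGNORED = {"info", "bibliography", "indexterm"}


def visible_text(element):
    """Return the text of element, skipping ignored tags and their subtrees."""
    name = element.tag[len(NS):] if element.tag.startswith(NS) else element.tag
    if name in IGNORED:
        # The whole subtree is dropped: children are never visited.
        return ""
    parts = [element.text or ""]
    for child in element:
        parts.append(visible_text(child))
        # Text after an ignored child belongs to the parent, so it is kept.
        parts.append(child.tail or "")
    return "".join(parts)
```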

Inline Elements not handled

/!\ WIP /!\

The elements in the following list are simply handled using a <span element="element-name">.

Block Element not handled in DOM Tree

Some elements do not have a direct equivalent in our DOM tree, but to keep their meaning we convert the following tags using <div class="db_tag.name">

The converter also checks whether the first child element is a title; if so, it adds the title to the html:title attribute of the <div> element.
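A possible shape for this fallback is sketched below. It makes several assumptions: the class value is built as "db_" plus the tag name (one reading of db_tag.name), the title goes into a plain title attribute rather than a namespaced html:title, and the output is an ElementTree element rather than our internal DOM tree. The function name to_div is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "{http://docbook.org/ns/docbook}"


def to_div(element):
    """Wrap an otherwise unsupported block element in a <div>."""
    name = element.tag[len(NS):] if element.tag.startswith(NS) else element.tag
    div = ET.Element("div", {"class": "db_" + name})  # assumed class format
    children = list(element)
    # A title that comes first becomes an attribute instead of a child.
    if children and children[0].tag in (NS + "title", "title"):
        div.set("title", children[0].text or "")
        children = children[1:]
    div.text = element.text
    div.extend(children)
    return div
```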

/!\ Informal*

/!\ InlineEquation

ToDo

MoinMoin: DOM DocBook and HTML 2010/DocBook-DOM (last edited 2010-08-10 19:25:46 by ValentinJaniaut)