Equivalences
Meaning of the colors
Tags ignored by our converter; their child elements are not displayed either |
Deprecated or discouraged tags |
Tags with question about the implementation |
Clear implementation description, nothing else to comment |
Not really converted, just use <span html-element="tag.name"> |
I did not check carefully which color I should give |
Table of equivalences
HTML Tag |
DOM Tree |
Comments |
<a href="uri"> |
<a xlink:href="uri"> |
We should not care about the URI; we cannot check whether it is correct |
<abbr> |
<span html-element="tag.name"> |
It normally defines an abbreviation. |
<acronym> |
<span html-element="tag.name"> |
It normally defines an acronym. |
<address> |
<span html-element="tag.name"> |
|
<applet> |
Ignored |
|
<area> |
Ignored |
Unused |
<b> |
<strong> |
Discouraged |
<base /> |
Not converted |
Needs to be kept in memory so that relative URLs can be resolved |
<basefont /> |
Ignored |
Deprecated |
<bdo /> |
Ignored |
Normal BiDi algorithm should be enough. |
<big> |
<span font-size="120%"> |
Discouraged |
<blockquote> |
??? |
|
<body> |
<body> |
|
<br /> |
<line-break> |
|
<button> |
Ignored |
Forms related. |
<caption> |
Ignored |
Table related. |
<center> |
Ignored |
|
<cite> |
??? |
Quotation |
<code> |
<code> |
|
<col /> |
Define attributes for the table |
We should keep these values in memory to handle the attributes correctly |
<colgroup> |
Similar to col |
We should keep these values in memory to handle the attributes correctly |
<dd> |
<list-item-label> |
|
<del> |
<span text-decoration="line-through"> |
Strikethrough |
<dfn> |
<span html-element="tag.name"> |
Defines a definition term |
<dir> |
<list> |
HelpOnLanguages states that we have an equivalent macro for a similar purpose: <<en>> and so on. -- EugeneSyromyatnikov 2010-05-20 13:24:02 Reply: No, dir makes a directory list. It is quite different. |
<div> |
|
Should it be ignored? |
<dl> |
<list> |
The definition list is the default list type. |
<dt> |
<list-item> |
|
<em> |
<emphasis> |
|
<fieldset> |
Ignored |
Forms related. |
<font> |
??? |
Maybe generate the same thing as a span with appropriate style definitions? -- EugeneSyromyatnikov 2010-05-20 13:24:02 |
<form> |
Ignored |
Forms related. |
<frame /> |
Ignored |
We should refuse to accept a page with a frame tag! We need to decide whether we convert all the frames or just the main one. The frame tag exists only in the HTML 4.0 Frameset DTD and is already deprecated, so this tag can be treated like any other unknown tag. |
<frameset> |
Ignored |
|
<hX> |
<h outline-level="X"> |
The level should be defined with page:outline-level. Should we check that the heading level is correct, and fix it if possible? |
<head> |
Ignored |
|
<hr /> |
<separator> |
|
<html> |
|
We should treat all the tags inside this one as HTML, so we should convert them. |
<i> |
<emphasis> |
Discouraged |
<iframe> |
Ignored |
Same as frame |
<img /> |
<object> |
Need to check the MIME type |
<input /> |
Ignored |
Forms related. |
<ins> |
<span text-decoration="underline"> |
|
<isindex> |
Ignored |
|
<kbd> |
<span html-element="tag.name"> |
|
<label> |
Ignored |
Forms related. |
<legend> |
Ignored |
Forms related. |
<li> |
<list-item> |
|
<link /> |
Ignored |
|
<map> |
Ignored. |
|
<menu> |
Ignored. |
|
<meta /> |
METADATA |
Should be converted into metadata. |
<noframes> |
Ignored |
|
<noscript> |
Ignored. |
|
<object> |
<object> |
We should check what kind of object we have. If we can support it, we should add an item with the specific MIME type. |
<ol> |
<list item-label-generate="ordered"> |
This is an ordered list. We should also support the deprecated type attribute: A --> upper-alpha, I --> upper-roman |
<optgroup> |
Ignored |
Forms related. |
<option> |
Ignored |
Forms related. |
<p> |
<p> |
Symmetric tag |
<param /> |
Ignored |
|
<pre> |
<blockcode> |
Use <blockcode> if we want a symmetric DOM-HTML converter, although it is probably not the best choice. |
<q> |
|
Short quote, see <cite> |
<s> |
<span text-decoration="line-through"> |
Deprecated |
<samp> |
<code> |
|
<script> |
Ignored. |
Not allowed in the body. |
<select> |
Ignored |
Forms related. |
<small> |
<span font-size="80%"> |
Deprecated |
<span> |
<span> |
|
<strike> |
<span text-decoration="line-through"> |
|
<strong> |
<strong> |
|
<style> |
Ignored. |
Not allowed in the body. |
<sub> |
<span base-line-shift="sub"> |
|
<sup> |
<span base-line-shift="super"> |
|
<table> |
<table> |
Attributes to consider: align, bgcolor, border, cellpadding, cellspacing, and the equivalent CSS properties. |
<tbody> |
<table-body> |
Attributes to consider: align, valign, and the equivalent CSS properties. |
<td> |
<table-cell> |
Attributes to consider: align, bgcolor, colspan, rowspan, valign, and the equivalent CSS properties. |
<textarea> |
Ignored |
Forms related. |
<tfoot> |
<table-footer> |
Attributes to consider: align, valign, and the equivalent CSS properties. |
<th> |
<table-cell> |
th is a table cell, but a header cell; we should look at recording this as an attribute. |
<thead> |
<table-header> |
Attributes to consider: align, valign, and the equivalent CSS properties. |
<title> |
Ignored |
See metadata, not allowed in the body. |
<tr> |
<table-row> |
Attributes to consider: align, bgcolor, valign, and the equivalent CSS properties. |
<tt> |
<code> |
Not so correct, but we need it to preserve the round-trip conversion |
<u> |
<span text-decoration="underline"> |
Deprecated |
<ul> |
<list item-label-generate="unordered"> |
|
<var> |
Ignored |
|
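The table above can be read as a dispatch on the tag name. The following is only a minimal sketch of that idea, not the converter's actual code: the names IGNORED, SIMPLE and convert are hypothetical, the lookup tables contain just a few rows of the table, and falling back to <span html-element="tag.name"> for every tag not listed is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup tables holding a few rows of the table above.
IGNORED = {"applet", "form", "head", "script", "style"}
SIMPLE = {
    "p": "p", "em": "emphasis", "i": "emphasis",
    "b": "strong", "strong": "strong", "code": "code",
    "br": "line-break", "hr": "separator", "pre": "blockcode",
}


def convert(node):
    """Return the converted element, or None when the tag is ignored."""
    if node.tag in IGNORED:
        # Ignored tags drop their whole subtree, children included.
        return None
    if node.tag in SIMPLE:
        out = ET.Element(SIMPLE[node.tag])
    else:
        # Assumption: anything else becomes the generic span,
        # as for <abbr>, <acronym>, <dfn>, <kbd> in the table.
        out = ET.Element("span", {"html-element": node.tag})
    out.text = node.text
    out.tail = node.tail
    for child in node:
        converted = convert(child)
        if converted is not None:
            out.append(converted)
        elif child.tail:
            # Keep the text that follows an ignored child.
            if len(out):
                out[-1].tail = (out[-1].tail or "") + child.tail
            else:
                out.text = (out.text or "") + child.tail
    return out
```

For example, converting `<p>Hello <b>bold</b><script>x()</script> and <abbr>HTML</abbr></p>` with this sketch drops the script element, keeps the text after it, and wraps the abbreviation in a generic span.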
List of tags with questions
<big> : Is there <span font-size="120%"> ?
- Discouraged in HTML 4.01
<blockquote> : Is there any <quote> tag in the DOM Tree ?
<br /> : Is there <line-break /> ?
<cite> : Is there any <quote> tag in the DOM Tree ?
<font> : Can we define font attribute in span tag ?
- Deprecated in HTML 4.01. Ignore.
<del>
See <ins>
<s>, <strike> : Is there <span text-decoration="line-through"> ?
- Discouraged in HTML 4.01
<small> : Is there <span font-size="85%"> ?
- Discouraged in HTML 4.01
Maybe we should convert big/small/u/s and so on to divs/spans with additional classes? For example, <big> → <span class="tag-big">. The only pitfall is that without appropriate CSS such text cannot be visually distinguished. On the other hand, this approach is rather generic and interoperable (e.g., it can be used in wiki converters and so on). -- EugeneSyromyatnikov 2010-06-07 21:34:52
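The class-based proposal above could look roughly like the sketch below. It is an illustration only: the function name to_class_span and the PRESENTATIONAL set are hypothetical, and the set simply collects the tags named in the comment plus <strike>.

```python
import xml.etree.ElementTree as ET

# Assumed set of presentational tags to rewrite.
PRESENTATIONAL = {"big", "small", "u", "s", "strike"}


def to_class_span(root):
    """Rewrite presentational tags in place as <span class="tag-NAME">."""
    for el in root.iter():
        if el.tag in PRESENTATIONAL:
            # Preserve any class the element already carries.
            classes = el.get("class", "").split()
            classes.append("tag-" + el.tag)
            el.set("class", " ".join(classes))
            el.tag = "span"
    return root
```

With this approach `<big>b</big>` becomes `<span class="tag-big">b</span>`; as noted in the comment, a stylesheet is still needed for the text to look different.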
Some questions
- How should we handle the form-related tags? Should we just ignore them, or put the text from the caption in the tree?
- How should we handle external objects, such as images, but also video, music and so on? Should we copy the resource to the server? Should we just keep the URI and add it as a new item in our tree? Maybe the best option would be to let the administrator choose.
The converter is for converting DocBook documents that are located in a MoinMoin wiki. So if there are external URLs, they likely should get handled as is. If someone wants to import existing DocBook documents and convert the external resource references to refer to local items (and create those items), that would rather be a task for a DocBook document importer. But then, it might be a bit difficult to decide what should get converted to local items and what not. -- ThomasWaldmann 2010-05-15 13:03:44
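Handling external URLs "as is" still leaves relative links, which the table says depend on the remembered <base /> value. A minimal sketch of that rule, assuming a hypothetical helper resolve_uri and using only the standard urllib.parse module:

```python
from urllib.parse import urljoin, urlparse


def resolve_uri(uri, base=None):
    """Keep absolute URIs as is; resolve relative ones against the base, if any."""
    if base is None or urlparse(uri).scheme:
        # Absolute URI, or no <base href> was seen: leave untouched.
        return uri
    return urljoin(base, uri)
```

Here `resolve_uri("img/a.png", "http://wiki.example/docs/")` yields `http://wiki.example/docs/img/a.png`, while an absolute URI is returned unchanged. Whether a resource is then copied locally is a separate decision that this sketch does not cover.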