Implementing the DOM->Moinwiki converter
The first version will not have an attributes controller or footnote support.
The DFS algorithm will use two stacks: one for opened nodes and one for their children.
There are two types of actions:
- when visiting a node for the first time: open_<namespace>_<name>(node)
- when all of its children have been visited: close_<namespace>_<name>(node)
Example:
    class moinwiki:
        # Moinwiki markup for emphasised (italic) text
        emphasis = "''"


    class Converter:
        ...

        def open_moinpage_emphasis(self, node):
            # Childless node: emit the opening and closing markup together
            if not node.children:
                return moinwiki.emphasis + self.close_moinpage_emphasis(node)
            # Otherwise push the node and its children onto the two DFS stacks
            self.children.append(list(node.children))
            self.opened_nodes.append(node)
            return moinwiki.emphasis

        def close_moinpage_emphasis(self, node):
            return moinwiki.emphasis
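To make the two-stack traversal concrete, here is a minimal sketch of how the main loop might drive the open_/close_ handlers. Only the handler naming scheme and the two stacks come from the notes above; the Node class, the convert() loop and the _open() helper are assumptions made for illustration, not the actual converter code.

```python
class Node:
    """Hypothetical stand-in for a DOM element; a child may also be a plain str."""

    def __init__(self, namespace, name, children=()):
        self.namespace = namespace
        self.name = name
        self.children = list(children)


class Converter:
    EMPHASIS = "''"
    STRONG = "'''"

    def __init__(self):
        self.opened_nodes = []  # stack of nodes whose children are still being visited
        self.children = []      # stack of lists with the children not visited yet

    def convert(self, root):
        out = []
        self.children.append([root])
        while self.children:
            pending = self.children[-1]
            if pending:
                child = pending.pop(0)
                # Text is copied as-is, elements go through their open_ handler
                out.append(child if isinstance(child, str) else self.open(child))
            else:
                # All children visited: drop the empty list and close the node
                self.children.pop()
                if self.opened_nodes:
                    out.append(self.close(self.opened_nodes.pop()))
        return "".join(out)

    def open(self, node):
        return getattr(self, "open_%s_%s" % (node.namespace, node.name))(node)

    def close(self, node):
        return getattr(self, "close_%s_%s" % (node.namespace, node.name))(node)

    def _open(self, node, markup):
        # Hypothetical helper shared by all open_ handlers
        if not node.children:
            return markup + self.close(node)
        self.children.append(list(node.children))
        self.opened_nodes.append(node)
        return markup

    def open_moinpage_emphasis(self, node):
        return self._open(node, self.EMPHASIS)

    def close_moinpage_emphasis(self, node):
        return self.EMPHASIS

    def open_moinpage_strong(self, node):
        return self._open(node, self.STRONG)

    def close_moinpage_strong(self, node):
        return self.STRONG


tree = Node("moinpage", "strong", ["a ", Node("moinpage", "emphasis", ["b"]), " c"])
result = Converter().convert(tree)
```

The point of the design is that no Python recursion is involved, so deeply nested pages cannot hit the recursion limit.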
I think I can finish this tonight.
If you see any limitations in this approach, you are welcome to leave a message.
Sorry for just leaving a nitpicking note, but please remember PEP8.
Still working on DOM->Moinwiki
- added conversion of list
- TODO definition list_type
- TODO list level, for the correct shift ' '*level before an item
- 'lower-alpha'|'lower-roman' list_style_type not found in moinwiki_in
Next step:
- conversion of table
DOM->Moinwiki converter: tables
Done:
- conversion of tables with attributes
table class and style attributes <tableclass="..." tablestyle="..." ...> in first table's cell
row class and style attributes <rowclass="..." rowstyle="..." ...> in first row's cell
cell class and style attributes <class="..." style="..." ...>
colspan: || * number_columns_spanned, (alternative way is <colspan=%number_columns_spanned%>)
rowspan: <|%number_rows_spanned%>, (alternative way is <rowspan=%number_rows_spanned%>)
- TODO: create separate class for conversion of tables (and for lists too)
- nodes with only text children don't need a "close" function
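As an illustration of the span rules listed above, the following sketch builds the Moinwiki markup for a table row. The function names and parameters are hypothetical, and class/style attributes are left out; it only shows the "||" repetition for colspan and the <|N> prefix for rowspan.

```python
def cell_markup(text, columns_spanned=1, rows_spanned=1):
    """Return the markup of one cell, without the closing '||' of the row."""
    # colspan: repeat the cell separator once per spanned column
    prefix = "||" * columns_spanned
    # rowspan: <|N> directly in front of the cell text
    span = "<|%d>" % rows_spanned if rows_spanned > 1 else ""
    return prefix + span + text


def row_markup(cells):
    """Join already rendered cells and close the row."""
    return "".join(cells) + "||"


first_row = row_markup([
    cell_markup("A"),
    cell_markup("B"),
    cell_markup("D", rows_spanned=2),
])
second_row = row_markup([cell_markup("C", columns_spanned=2)])
```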
Next Steps:
- new node types:
- underline
- superscript
- subscript
- smaller
- larger
- stroke
- moin_page.object
Span and first test
Done with moin_page.span; it covers the following elements of moinwiki syntax:
- underline
- superscript
- subscript
- smaller
- larger
- stroke
But when I started testing, I found some errors, so I'm fixing them.
We have passed the first barrier.
Hmm, maybe some changes to the '\n' handling will come later.
Text test
Input tree:
<page:page> <page:body> <page:h page:outline-level="3">Text:</page:h> \n <page:strong>strong</page:strong> \n <page:emphasis>emphasis</page:emphasis> \n <page:blockcode>blockcode</page:blockcode> \n <page:code>monospace</page:code> </page:body> </page:page>
Output:
=== Text: ===\n'''strong'''\n''emphasis''\n{{{blockcode}}}\n`monospace`
Text:
strong emphasis blockcode monospace
Table test
Input tree:
<page:page> <page:body> <page:h page:outline-level="3">Table:</page:h> \n <page:table> <page:table-body> <page:table-row> <page:table-cell>A</page:table-cell> <page:table-cell>B</page:table-cell> <page:table-cell page:number-rows-spanned="2">D</page:table-cell> </page:table-row> <page:table-row> <page:table-cell page:number-columns-spanned="2">C</page:table-cell> </page:table-row> </page:table-body> </page:table> </page:body> </page:page>
Output:
=== Table: ===\n||A||B||<|2>D||\n||||C||\n\n
Table:
A | B | D (spans two rows)
C (spans two columns) |
List test
Input tree:
<page:page> <page:body> <page:h page:outline-level="3">List:</page:h> \n <page:list page:item-label-generate="unordered"> <page:list-item> <page:list-item-body>A</page:list-item-body> </page:list-item> <page:list-item> <page:list-item-body>B</page:list-item-body> </page:list-item> </page:list> </page:body> </page:page>
Output:
=== List: ===\n * A\n * B\n
List:
- A
- B
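The list output above depends on the nesting level (the "' '*level before item" TODO from the earlier entry). A minimal sketch of that idea, assuming a hypothetical input format where each item is a (text, sublist) pair and a sublist is either None or an (ordered, items) pair:

```python
def render_list(items, ordered=False, level=1):
    """Render nested list items as Moinwiki text, one space of indent per level."""
    marker = "1." if ordered else "*"
    out = []
    for text, sublist in items:
        out.append(" " * level + marker + " " + text + "\n")
        if sublist is not None:
            sub_ordered, sub_items = sublist
            # Nested lists are shifted one more space to the right
            out.append(render_list(sub_items, sub_ordered, level + 1))
    return "".join(out)


flat = render_list([("A", None), ("B", None)])
nested = render_list([("A", (True, [("C", None), ("D", None)]))])
```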
Sunday moinpage_span tests
Another working item: span
stroke
underline
larger
smaller
superscript
subscript
Span test
Input tree:
<page:page> <page:body> <page:h page:outline-level="3">Span:</page:h> \n <page:span page:text-decoration="line-through">stroke</page:span> \n <page:span page:text-decoration="underline">underline</page:span> \n <page:span page:font-size="120%">larger</page:span> \n <page:span page:font-size="85%">smaller</page:span> \n <page:span page:baseline-shift="super">super</page:span>script \n <page:span page:baseline-shift="sub">sub</page:span>script \n </page:body> </page:page>
Output:
=== Span: ===\n--(stroke)--\n__underline__\n~+larger+~\n~-smaller-~\n^super^script\n,,sub,,script\n
Span:
stroke underline larger smaller superscript subscript
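The attribute-to-markup mapping exercised by this test can be sketched as a small lookup. The table and function names are hypothetical, and treating every font-size above 100% as "larger" and below 100% as "smaller" is an assumption; the test only shows 120% and 85%.

```python
# (attribute, value, opening markup, closing markup)
SPAN_MARKUP = [
    ("text-decoration", "line-through", "--(", ")--"),
    ("text-decoration", "underline", "__", "__"),
    ("baseline-shift", "super", "^", "^"),
    ("baseline-shift", "sub", ",,", ",,"),
]


def span_to_moinwiki(attrs, text):
    """Wrap text in the Moinwiki markup matching the span attributes."""
    for key, value, opening, closing in SPAN_MARKUP:
        if attrs.get(key) == value:
            return opening + text + closing
    size = attrs.get("font-size")
    if size:
        percent = float(size.rstrip("%"))
        # Assumption: 100% is the threshold between smaller and larger
        if percent > 100:
            return "~+" + text + "+~"
        if percent < 100:
            return "~-" + text + "-~"
    # No known attribute: leave the text untouched
    return text
```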
Moinwiki->DOM->Moinwiki tests
Tests:
1. "=== Text: ===\n'''strong'''\n''emphasis''\n`monospace`\n"
2. "=== Table: ===\n||A||B||<|2>D||\n||||C||\n"
3. "=== List: ===\n * A\n 1. C\n 1. D\n"
4. "=== Span: ===\n--(stroke)--\n__underline__\n~+larger+~\n~-smaller-~\n^super^script\n,,sub,,script\n"
5. " * A\n * B\n * C\n * D\n * E\n * F\n"
6. " * A\n * B\n i. C\n i. D\n 1. E\n 1. F\n i. G\n 1. H\n"
7. "=== A ===\n dsfs:: dsf\n :: rdf\n :: sdfsdf\n :: dsfsf\n"
8. "=== A ===\n css::\n :: rdf\n :: sdfsdf\n :: dsfsf\n"
9. "=== A ===\n css:: \n :: rdf\n :: sdfsdf\n :: dsfsf\n"
Problem with {{{blockcode}}}: moinwiki_in converts it to moin_page:code, the same as `monospace`.
Test 8 fails: moinwiki_in does not recognise this input as a definition list.
In test 9 the first list item is ' '. We need some changes in moinwiki_in.
For now, moinwiki_out supports definition lists, but only one output format:
this::
 :: A
 :: B
not_this:: def
Moinwiki->DOM->Moinwiki: problems
Problems:
In the moinmoin_in converter, with this input
[[http://static.moinmo.in/logos/moinmoin.png|{{attachment:samplegraphic.png}}]]
{{attachment:samplegraphic.png}} is text in the output tree, not an object.
[[http://moinmo.in/|MoinMoin Wiki|class=green dotted,accesskey=1]]
there is no class=green dotted,accesskey=1 after moinwiki_in
But in other cases it seems that Moinwiki->DOM->Moinwiki conversion of links and objects is working.
And I wasn't right :)
MoinMoin:MoinMoinWiki|MoinMoin Wiki|&action=diff,&rev1=1,&rev2=2 was not working, but it works now.
Another problem in moinwiki_in: <page:separator> doesn't have any attributes, so there is no difference between:
etc
"A::\n :: B\n :: C\n :: D\n" this format of definition list doesn't work in moinwiki_in.
"A:: B\n :: C\n :: D\n" this works.
All these problems come only from the moinwiki_in converter.
It's time to prepare for the last exam
I have my graduation exam next Friday, so you will not see any changes in my GSoC project for the next 3 days.
New logic for newline in moinpage_p, {{{#!wiki ... }}} support
New variables:
- status = list of ['text','table','list']
- last_closed - last closed DOM element.
In text <p> -> "\n" (if not at the beginning of the page)
In tables and lists <p> inside cells and list items -> <<BR>>
Added support of {{{#!wiki ... }}} (moinpage_page inside moinpage_page).
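The newline rules for <p> can be sketched as one small function. The function name is hypothetical, and two details are assumptions: last_closed is taken to be None at the beginning of the page, and inside tables and lists <<BR>> is only emitted between two consecutive paragraphs.

```python
def paragraph_prefix(status, last_closed):
    """Return what to emit before the content of a <p> element.

    status      -- stack of contexts, e.g. ['text', 'table']
    last_closed -- name of the last closed DOM element, None at page start
    """
    if status[-1] == "text":
        # Plain text: separate paragraphs with a newline, except at page start
        return "" if last_closed is None else "\n"
    # Inside a cell or list item a real newline would end the row or item,
    # so a <<BR>> macro separates paragraphs instead (assumption: only
    # between two paragraphs, not before the first one)
    return "<<BR>>" if last_closed == "p" else ""
```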
Macros and more bugs in moinmoin_in
First quick-and-dirty implementation of <<SomeMacro(args)>>
and more bugs in moinwiki_in
Test day
Wrote a lot of tests for moinwiki_out, and fixed some bugs.
Last elements support
DOM->Moinwiki
Added support of <note> and <table-of-content>
Rewrote implementation of <part> (macros)
Merge with the main 2.0-dev repo
Small bugfixes after the merge and a new implementation of the conversion of parsers (<page:part page:content-type="x-moin/format;name=XXX">...</page:part>).
48 different tests of moinwiki_out passed
Conversion of HelpOnMoinWikiSyntax and subpages|subblockcodes
- First conversion of a real page
Bugfixes and support of subpages|subblockcodes:
long long {{{{{{{{{{{{ and }}}}}}}}}}}} with nested blocks
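One way to keep nested blocks intact is to make the outer delimiter longer than any run of braces inside the content. The sketch below illustrates that idea only; the function name and the exact length rule are assumptions, not necessarily what moinwiki_out does.

```python
import re


def wrap_block(content):
    """Wrap content in a brace block that nested blocks cannot close early."""
    runs = re.findall(r"\{+|\}+", content)
    longest = max((len(run) for run in runs), default=0)
    # At least three braces, and one more than the longest run inside
    width = max(3, longest + 1)
    return "{" * width + "\n" + content + "\n" + "}" * width


plain = wrap_block("plain")
nested = wrap_block("{{{\ninner\n}}}")
```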
reST converter
I've started working on the reST converter.
Done with the first quick & dirty implementation: emphasis, strong, literals (monospace), blockcode, table, list
DOM->reStructuredText and problem with unicode in converter tests
Added conversion of <part> (macros), <note> (footnotes), <line-break>.
When I try to run some converter tests with unicode input, I get:
> self._parser.Parse(data, 0)
E UnicodeEncodeError: 'ascii' codec can't encode characters in position 138-142: ordinal not in range(128)
MoinMoin/support/emeraldtree/tree.py:1146: UnicodeEncodeError
and the answer is to encode the input first --> ET.XML(i.encode("utf-8"))
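A minimal sketch of the same fix, using the standard library xml.etree.ElementTree as a stand-in for MoinMoin's bundled emeraldtree (an assumption, so that the example runs on its own). The helper name parse_fragment is hypothetical, and the sketch is written for Python 3, where the str/bytes split differs from the Python 2 code in the diary.

```python
import xml.etree.ElementTree as ET  # stand-in for MoinMoin's emeraldtree


def parse_fragment(markup):
    """Parse an XML fragment, encoding text input to UTF-8 bytes first."""
    if isinstance(markup, str):
        # Hand the parser bytes, so no implicit ASCII encoding can happen
        markup = markup.encode("utf-8")
    return ET.XML(markup)


element = parse_fragment("<p>\u0442\u0435\u0441\u0442</p>")
```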
Exam
I've passed my first entrance exam for graduate school (PhD).
DOM->ReST: objects and links
Added basic support of objects and links to rst_out converter
ReStructuredText -> DOM
I've started to think about how to implement the rst_in converter. The Moin parser is based on docutils rst2html. With the docutils parser I can create a Write(docutils.writers.Writer) class that will output a MoinDOM tree after the docutils parser has run, but maybe it would be easier to write a DocutilsDOM->MoinDOM converter.
ReStructuredText->DOM
Implemented basic structure of rst_in converter
Rst->DOM
I'm working on the conversion of the docutils DOM to the moin DOM. I need more time before the hg push because there are a lot of node types and I want to push a working version.
Rst->DOM
I've added an implementation for the basic nodes of the docutils tree, but rst_in still doesn't work as a converter.
I have to read documentation on basic functions in docutils.core.
ReST->DOM
Docutils part of converter works.
I need more tests to finish the docutils tree -> moin tree conversion.
ReStructuredText->DOM
Added Moin directives to docutils parser in converter.
Implemented basic tests.
Exam, ReStructuredText->DOM
I've passed my second entrance exam (English) for graduate school (PhD).
New nodes support in rst_in: table, link, footnote. (with tests)
Page test for ReStructuredText roundtrip test
Now I have it: DmitryAndreev/Diary/RstPrimerConversion
It's awful, but it will help me fix errors.
ReStructuredText conversion
Fixed table_of_content, blockcode and shift in lists.
See updates of DmitryAndreev/Diary/RstPrimerConversion
ReStructuredText conversion
Fixed the problem with equal names of references in ReStructuredText output.
Added conversion of docinfo part to a table.
Added conversion of blockquote to a list.
RstPrimerConversion looks very good
TODO:
- Create more unit tests for ReStructuredText->DOM and DOM->ReStructuredText
- Fix pep8 in rst_in and rst_out converters
- Copy the part of code related to shifts in lists to moinwiki_out converter.
- Write a lot of docstrings for all my converters
DOM->ReStructuredText and DOM->Moinwiki: Fix indents in lists, now they are perfect
PEP8 fixes and merge with main 2.0 branch
no project work
I've passed my last entrance exam, no project progress this day.
ReStructuredText->DOM bugfixes
Added directive for moinwiki parsers
More tests and various bugfixes
DOM->ReStructuredText
Various bugfixes and more tests
DOM->ReStructuredText
More tests and various bugfixes
Coverage of the tests:
- moinwiki_out: 89%
- rst_in: 90%
- rst_out: 94%
Docstrings for ReStructuredText converters
midterm evals
Added recursive version of DOM->Moinwiki converter.
Added recursive version of DOM->ReStructuredText converter
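For comparison with the stack-based traversal, a recursive converter can be sketched in a few lines. The Node class, the MARKUP table and the method names are hypothetical and only illustrate the recursive shape, not the actual converter code.

```python
class Node:
    """Hypothetical stand-in for a DOM element; a child may also be a plain str."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


class RecursiveConverter:
    # Opening and closing Moinwiki markup per element name
    MARKUP = {
        "emphasis": ("''", "''"),
        "strong": ("'''", "'''"),
        "code": ("`", "`"),
    }

    def convert(self, node):
        if isinstance(node, str):
            return node
        opening, closing = self.MARKUP.get(node.name, ("", ""))
        # Recursion replaces the explicit stacks: the call stack remembers
        # which nodes are still open
        return opening + self.convert_children(node) + closing

    def convert_children(self, node):
        return "".join(self.convert(child) for child in node.children)


tree = Node("strong", ["a ", Node("emphasis", ["b"]), " c"])
result = RecursiveConverter().convert(tree)
```

The recursive form is shorter and easier to read, at the cost of being bounded by Python's recursion limit on very deep trees.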
Created the basic structure of a Mediawiki->DOM converter based on the parser from mwlib
Sorry moin, for the last two weeks I had a brain f*ck with my yearly science report and bureaucracy; most of the people who were supposed to help me with this just waved their hands and left for their vacations.
What I've done this time:
I've deleted the Mediawiki->DOM converter based on mwlib; while testing it I found that the mwlib parser results don't correspond to the mwlib internal tree specification.
I've written a basic Mediawiki->DOM conversion using regexps, like in Moinwiki->DOM.
- Conversion of tt/code/pre tags
- Table attributes
- Multiline text in table cell
- Conversion of line_break
- Conversion of external links, images
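A regexp-based inline pass of this kind might look like the sketch below. It uses the standard library ElementTree and plain element names (p, strong, emphasis, code) as stand-ins for the moin_page tree, and the INLINE pattern covers only bold, italic and tt/code; all of these names are assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET

# One alternation per inline construct; the order matters (''' before '')
INLINE = re.compile(
    r"'''(?P<strong>.+?)'''"
    r"|''(?P<emphasis>.+?)''"
    r"|<(?P<tag>tt|code)>(?P<code>.*?)</(?P=tag)>"
)


def _add_text(parent, chunk):
    """Append plain text after the last child, or as the parent's own text."""
    if not chunk:
        return
    if len(parent):
        last = parent[-1]
        last.tail = (last.tail or "") + chunk
    else:
        parent.text = (parent.text or "") + chunk


def parse_inline(text):
    """Convert one line of Mediawiki inline markup to an element tree."""
    parent = ET.Element("p")
    pos = 0
    for match in INLINE.finditer(text):
        _add_text(parent, text[pos:match.start()])
        # Find which alternative matched and create the element for it
        for name in ("strong", "emphasis", "code"):
            if match.group(name) is not None:
                child = ET.SubElement(parent, name)
                child.text = match.group(name)
                break
        pos = match.end()
    _add_text(parent, text[pos:])
    return parent


paragraph = parse_inline("a '''b''' and ''c'' <tt>d</tt>")
serialized = ET.tostring(paragraph, encoding="unicode")
```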
More tests of Mediawiki->DOM converter
- fix conversion of images