There are two approaches to migrating from PmWiki to MoinMoin:
The more recent one was used to migrate all of http://www.gnewsense.org/ from PmWiki 2.1.17 to MoinMoin. It can migrate all pages including the revision history, as well as the user accounts.
The other approach relies on a custom regular expression-based parser for PmWiki and converts only the most recent version of each page.
Migrating PmWiki to MoinMoin (as used for gNewSense)
When gNewSense GNU/Linux migrated their web site from PmWiki to MoinMoin, I volunteered to write a script to get the job done. These were the requirements:
- Convert ~1500 pages with attachments without losing any data or formatting
- Migrate 4 years' worth of revision history (>10000 edits)
- Import ~2500 user accounts
The versions that we used on the gNewSense web site were PmWiki 2.1.27 and MoinMoin 1.7.3 (also tested with 1.9.3).
You can find the script and additional information on my web site: http://realmike.org/blog/2010/10/25/migrating-from-pmwiki-to-moinmoin/
-- MichaelFoetsch 2010-10-25 15:03:08
Older PmWiki to MoinMoin converter
In our department we have switched from the nice and easy, but not very feature-rich, PmWiki to MoinMoin. Migration was a major concern, so, inspired by the PhpWikiMoinMoinConverter by TheAnarcat, I have created a migration script.
All of the regular expression magic has been rewritten and, since both PmWiki and MoinMoin store their pages as plain files, database access was not an issue.
The script is not ready for upload yet, but will be soon.
Usage
Call this script directly from the wiki directory. You must have write access to ./data and subdirectories to create pages. You will need to change the CONFIGURATION DIRECTIVES section.
Special Considerations
This script will happily destroy existing wikis if it feels like it, so backups are of course advised before performing a conversion. Note that this script simply creates the new pages and overwrites any existing ones, so you had better back up or die.
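A backup along these lines could be taken before running the converter. This is only a minimal sketch, written for Python 3 (the converter itself is a Python 2 script); the function name `backup_wiki_data` and both path arguments are hypothetical and not part of the script.

```python
import shutil
import time
from pathlib import Path


def backup_wiki_data(data_dir, backup_root):
    """Copy data_dir into a new timestamped directory below backup_root.

    Both arguments are hypothetical paths chosen by the caller, e.g. the
    MoinMoin ./data directory and some location outside the wiki.
    """
    data_dir = Path(data_dir)
    backup_root = Path(backup_root)
    if not data_dir.is_dir():
        raise FileNotFoundError("no such data directory: %s" % data_dir)

    backup_root.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / ("%s-%s" % (data_dir.name, stamp))

    # copytree raises an error if the target already exists, so an
    # earlier backup is never silently overwritten.
    shutil.copytree(str(data_dir), str(target))
    return target
```

Calling `backup_wiki_data('./data', '/some/backup/location')` before the conversion leaves an untouched copy to restore from if the converter overwrites pages that should have been kept.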
Limitations
This script is also crucially incomplete, and most definitely lacks support for several PmWiki and MoinMoin features. Some gaps are due to limitations of the wikis, others are due to the inherent ambiguity of the wiki syntax, and still others exist simply because the converters have not been implemented yet.
Additionally, this script was created to convert a wiki site running PmWiki 0.5.27; current versions are around 2.0.x.
A summary of some limitations:
- linking
  - not all link styles are supported (e. g. "free links" with suppressions)
  - "free links" may be flaky due to name conversion changes in the two systems
  - wiki pages containing "special characters" (e. g. "-") will need to be renamed manually (in the file system)
  - link anchors are not implemented
  - WikiWords prepended by a number, underscore, etc. will still be interpreted as a WikiWord (though without the "prefix")
  - on PmWiki's "HomePage", wiki links do not point relatively to the WikiGroup's corresponding pages, as PmWiki's "HomePage" is a sub page to the WikiGroup, whereas MoinMoin can attach content to the group itself
  - simple WikiWords that produce links in PmWiki may not produce wiki links in MoinMoin (words with double caps, numbers, or words ending in caps)
- wiki page includes
- table syntax is messy, and will not be converted, but warned about
- quite a few macros might not work
- for existing wiki pages, the script does not add a new revision; it replaces the content of revision ID 1 and sets the current ID to 1
- no automatic migration of attachments and images
- fancy formatting in term/definition lists will screw up the list
- MoinMoin doesn't support wiki syntax everywhere, e. g. not in headlines
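The WikiWord limitation can be checked on a list of page names before converting. The sketch below is written for Python 3 and only encodes the three risky cases named in the list (a trailing capital, consecutive capitals, digits). It does not reproduce MoinMoin's real link detection, and the names `RISKY_PATTERNS`, `risky_reasons` and `as_moin_link` are made up for this illustration. The quoted `["..."]` form is the free-link markup the script itself emits.

```python
import re

# Heuristics taken from the limitation list; they are NOT MoinMoin's
# actual link rules, only the cases the converter treats as risky.
RISKY_PATTERNS = [
    ("ends in a capital letter", re.compile(r'[A-Z]$')),
    ("contains consecutive capitals", re.compile(r'[A-Z]{2,}')),
    ("contains a digit", re.compile(r'[0-9]')),
]


def risky_reasons(word):
    """Return the reasons why `word` may not auto-link in MoinMoin."""
    return [reason for reason, pattern in RISKY_PATTERNS
            if pattern.search(word)]


def as_moin_link(word):
    """Quote risky words as an explicit free link, leave the rest alone."""
    if risky_reasons(word):
        return '["%s"]' % word
    return word


for name in ("WikiWord", "LaTeX", "WikiWWord", "Wiki2wiki"):
    print(name, "->", as_moin_link(name), risky_reasons(name))
```

Running this over the page names found in PmWiki's data directory gives a quick list of links that deserve a manual look after the conversion.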
Resources
MoinMoin Wiki syntax: http://moinmoin.wikiwikiweb.de/HelpOnEditing
PmWiki syntax, from the documentation pages that come with the PmWiki setup: PmWiki/DocumentationIndex
How to create the MoinMoin pages
The MoinMoin pages are created the "raw way", as I had problems using the MoinMoin modules for these tasks the way PhpWikiMoinMoinConverter does (due to my lack of knowledge of MoinMoin internals). The script works by:
- reading PmWiki's data directory for all wiki pages, leaving out some internally used pages
- checking the body for some troublesome wiki tags, to notify the user afterwards in a summary
- processing the body with a whole bunch of regular expressions
- creating the pages, dramatically inelegantly, by brute force in the file system with revision ID 1 (no history, recent changes, etc. available)
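The first step, collecting the pages, might look roughly like this in Python 3. It is a sketch that follows the script's own filtering: it skips file names containing a comma and a fixed list of internal names. The function name `list_pmwiki_pages` is hypothetical, and unlike the script it also skips file names without a dot.

```python
import os

# Internal names the script leaves out; copied from its skip list.
SKIPPED_NAMES = {
    'htaccess', 'flock', 'mailposts',
    'RecentChanges', 'RecentUploads',
    'AllRecentChanges', 'AllRecentUploads',
    'SearchWiki', 'WebMenu',
}


def list_pmwiki_pages(wiki_d):
    """Return [WikiGroup, WikiWord] pairs for the files found in wiki_d."""
    pages = []
    for filename in sorted(os.listdir(wiki_d)):
        # The script ignores every file name that contains a comma.
        if ',' in filename:
            continue
        group, dot, name = filename.partition('.')
        # No dot at all means this is not a "Group.Page" file.
        if not dot or name in SKIPPED_NAMES:
            continue
        pages.append([group, name])
    return pages
```

With a directory containing `Main.HomePage`, `Main.RecentChanges` and `.htaccess`, only `['Main', 'HomePage']` is returned.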
Make sure to study the file's top section for some configuration paths, PmWiki host name setup, etc.
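The brute-force page creation boils down to writing two files per page: `current`, holding the revision ID, and `revisions/00000001`, holding the page text. The Python 3 sketch below follows the layout the script writes. The function name `write_moin_page` and its parameters are hypothetical, and unlike the script it only prepends the ACL line when one is given.

```python
import os


def write_moin_page(pages_dir, page_parts, text, acl_line=''):
    """Write `text` as revision 1 of a page below pages_dir.

    page_parts is a list such as ['Group', 'Page']; the script joins the
    parts with '(2f)' to build the page directory name.
    """
    revision = '00000001'
    page_dir = os.path.join(pages_dir, '(2f)'.join(page_parts))
    revisions_dir = os.path.join(page_dir, 'revisions')
    os.makedirs(revisions_dir, exist_ok=True)

    body = acl_line + '\n' + text if acl_line else text

    # The content of revision 1 ...
    with open(os.path.join(revisions_dir, revision), 'w',
              encoding='utf-8') as handle:
        handle.write(body)
    # ... and the pointer that names revision 1 as the current one.
    with open(os.path.join(page_dir, 'current'), 'w',
              encoding='ascii') as handle:
        handle.write(revision)
    return page_dir
```

Because only these two files are written, an existing page loses whatever `current` pointed to before, which is why the result has no history or recent-changes entry.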
Implementation
#!/usr/bin/env python

"""
Copyright (c) 2005 Guy K. Kloss <guy.kloss@dlr.de>

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

See also http://www.fsf.org

------------------------------------------------------------------------
PmWikiMoinMoinConverter

Inspired by the PhpWikiMoinMoinConverter by "The Anarcat".

= Usage =

Call this script directly from the wiki directory. You must have write
access to ./data and subdirectories to create pages. You will need to
change the CONFIGURATION DIRECTIVES, below.

= Special Considerations =

This script will happily destroy existing wikis if it feels like it,
so backups are of course advised before performing a conversion. Note
that this script will just create the new pages, and overwrite any
existing pages, so you better backup or die.

= Limitations =

This script is also crucially incomplete, and most definitely lacks
several PmWiki and MoinMoin features. Some are due to limitations of
the wiki, others are due to the inherent ambiguity of the wiki syntax,
and still others are just due to the fact that converters have not been
implemented, yet.

Additionally, this script has been created for the sake of converting a
wiki site running PmWiki 0.5.27, current versions are around 2.0.x ...

A summary of some limitations:

  * linking
    * not all link styles are supported (e. g. "free links" with suppressions)
    * "free links" may be flaky due to name conversion changes in the two systems
    * wiki pages containing "special characters" (e. g. "-") will need to be
      renamed manually (in the file system)
    * link anchors are not implemented
    * WikiWords prepended by a number, underscore, etc. will still be
      interpreted as a WikiWord (though without the "prefix")
    * on PmWiki's "HomePage" wiki links do not point relatively to the
      WikiGroup's corresponding pages, as PmWiki's "HomePage" is a sub
      page to the WikiGroup, whereas MoinMoin can attach content to the
      group itself
    * simple WikiWords that produce links in PmWiki may not produce wiki
      links in MoinMoin (words with double caps, numbers, or words ending
      in caps)
  * wiki page includes
  * table syntax is messy, and will not be converted, but warned about
  * quite some macros might not work
  * in case of existing wiki pages, it does not edit the content, but it
    actually replaces the content with ID 1 and sets the ID to 1
  * no automatic migration of attachments and images
  * fancy formattings in term/definition lists will screw up the list
  * MoinMoin doesn't support wiki syntax e. g. in head lines

= Resources =

  * MoinMoin Wiki syntax: http://moinmoin.wikiwikiweb.de/HelpOnEditing
  * PmWiki syntax from setup site in PmWiki setup: PmWiki.DocumentationIndex
"""

# CONFIGURATION DIRECTIVES
#
# The path to the MoinMoin wiki, leave empty if you know what you're
# doing
moinWikiPath = '/tmp/moin-desktop'

# The path to the PmWiki directory:
pmWikiPath = '/tmp/pmwiki'

# PmWiki server:
pmWikiHost = 'thor.sistec.kp.dlr.de'
pmWikiBase = '/wiki/pmwiki.php'

moinAcl = ''  # '#acl VkAdminGroup:read,write,delete,revert,admin VkUserGroup:read,write,delete,revert All:'

# Set this to False to skip wiki pages that already exist, to avoid
# conflicts.
#
# With True, existing pages are edited as well. Note that no new
# revision is created: revision 1 of the page is overwritten, so make
# a backup first.
editExistingPages = True

#
# END OF CONFIGURATION DIRECTIVES

import re
import os
import os.path
import sys
import httplib

class PmConverter:
    # Warning list.
    # About these things the user should be warned after conversion to manually
    # fix certain things.
    warnings = [['\|\|', # tables: would be a conversion mess
                 'Tables in PmWiki, look at http://moinmoin.wikiwikiweb.de/HelpOnTables'],
                ['\[\[table', # advanced tables, more messy
                 'Advanced tables in PmWiki, look at http://moinmoin.wikiwikiweb.de/HelpOnTables'],
                ['{{(.+?)}}',
                 '"Free links" may have different names on the system.'],
                ['HomePage',
                 'No checks for HomePage -> FrontPage conversion.']
               ]

    # Markup conversion list.
    # (Note: The order within this list, and thus the conversion order,
    # may be critical on some conversions, e. g. on head lines.).
    converters = [### --- block conversions --------------------------------
                  {'lable': 'vb', # verbatim block
                   'converter':
                   [{'search': '^ +\[=(.*?)=\]',
                     'replace': r'%%vb<\1>vb%%'}]},
                  {'lable': 'nb', # remove line break preventions with new line
                   'converter':
                   [{'search': r'\\$\n',
                     'replace': r''}]},
                  ### --- complete line conversions ------------------------
                  {'lable': 'vl', # verbatim line
                   'converter':
                   [{'search': '^ (.*?)$',
                     'replace': r'%%vl<\1>vl%%'},
                    {'search': '%%vl<(.*?)>vl%%',
                     'replace': r'`\1`'}]},
                  {'lable': 'h4', # heading 4
                   'converter':
                   [{'search': '^!!!! *?(.*?)$',
                     'replace': r'%%h4<\1>h4%%'},
                    {'search': '%%h4<(.*?)>h4%%',
                     'replace': r'==== \1 ===='}]},
                  {'lable': 'h3', # heading 3
                   'converter':
                   [{'search': '^!!! *?(.*?)$',
                     'replace': r'%%h3<\1>h3%%'},
                    {'search': '%%h3<(.*?)>h3%%',
                     'replace': r'=== \1 ==='}]},
                  {'lable': 'h2', # heading 2
                   'converter':
                   [{'search': '^!! *?(.*?)$',
                     'replace': r'%%h2<\1>h2%%'},
                    {'search': '%%h2<(.*?)>h2%%',
                     'replace': r'== \1 =='}]},
                  {'lable': 'h1', # heading 1
                   'converter':
                   [{'search': '^! *?(.*?)$',
                     'replace': r'%%h1<\1>h1%%'},
                    {'search': '%%h1<(.*?)>h1%%',
                     'replace': r'= \1 ='}]},
                  {'lable': 'b4', # bullet 4
                   'converter':
                   [{'search': '^\*\*\*\* *?(.*?)$',
                     'replace': r'%%b4<\1>b4%%'},
                    {'search': '%%b4<(.*?)>b4%%',
                     'replace': r' * \1'}]},
                  {'lable': 'b3', # bullet 3
                   'converter':
                   [{'search': '^\*\*\* *?(.*?)$',
                     'replace': r'%%b3<\1>b3%%'},
                    {'search': '%%b3<(.*?)>b3%%',
                     'replace': r' * \1'}]},
                  {'lable': 'b2', # bullet 2
                   'converter':
                   [{'search': '^\*\* *?(.*?)$',
                     'replace': r'%%b2<\1>b2%%'},
                    {'search': '%%b2<(.*?)>b2%%',
                     'replace': r' * \1'}]},
                  {'lable': 'b1', # bullet 1
                   'converter':
                   [{'search': '^\* *?(.*?)$',
                     'replace': r'%%b1<\1>b1%%'},
                    {'search': '%%b1<(.*?)>b1%%',
                     'replace': r' * \1'}]},
                  {'lable': 'n4', # numbering 4
                   'converter':
                   [{'search': '^#### *?(.*?)$',
                     'replace': r'%%n4<\1>n4%%'},
                    {'search': '%%n4<(.*?)>n4%%',
                     'replace': r' 1. \1'}]},
                  {'lable': 'n3', # numbering 3
                   'converter':
                   [{'search': '^### *?(.*?)$',
                     'replace': r'%%n3<\1>n3%%'},
                    {'search': '%%n3<(.*?)>n3%%',
                     'replace': r' 1. \1'}]},
                  {'lable': 'n2', # numbering 2
                   'converter':
                   [{'search': '^## *?(.*?)$',
                     'replace': r'%%n2<\1>n2%%'},
                    {'search': '%%n2<(.*?)>n2%%',
                     'replace': r' 1. \1'}]},
                  {'lable': 'n1', # numbering 1
                   'converter':
                   [{'search': '^# *?(.*?)$',
                     'replace': r'%%n1<\1>n1%%'},
                    {'search': '%%n1<(.*?)>n1%%',
                     'replace': r' 1. \1'}]},
                  {'lable': 'i4', # indention 4
                   'converter':
                   [{'search': '^:::: :(.*?)$',
                     'replace': r'%%i4<\1>i4%%'},
                    {'search': '%%i4<(.*?)>i4%%',
                     'replace': r' \1'}]},
                  {'lable': 'i3', # indention 3
                   'converter':
                   [{'search': '^::: :(.*?)$',
                     'replace': r'%%i3<\1>i3%%'},
                    {'search': '%%i3<(.*?)>i3%%',
                     'replace': r' \1'}]},
                  {'lable': 'i2', # indention 2
                   'converter':
                   [{'search': '^:: :(.*?)$',
                     'replace': r'%%i2<\1>i2%%'},
                    {'search': '%%i2<(.*?)>i2%%',
                     'replace': r' \1'}]},
                  {'lable': 'i1', # indention 1
                   'converter':
                   [{'search': '^: :(.*?)$',
                     'replace': r'%%i1<\1>i1%%'},
                    {'search': '%%i1<(.*?)>i1%%',
                     'replace': r' \1'}]},
                  {'lable': 'td', # term/definition
                   'converter':
                   [{'search': '^::([^: ]+[^:]*?):(.*?)$',
                     'replace': r'%%td< \1::\2>td%%'},
                    {'search': '%%td<(.*?)>td%%',
                     'replace': r'\1'}]},
                  {'lable': 'hr', # horizontal rule
                   'converter':
                   [{'search': '^-----*?(.*?)',
                     'replace': r'----\1'}]},

                  ### --- in-line conversions ------------------------------
                  ## --- formatting ---
                  {'lable': 'br', # line break
                   'converter':
                   [{'search': '\[\[<<\]\]',
                     'replace': r'%%ma<BR>ma%%'}]},
                  {'lable': 'cd', # monospaced, no wiki highlighting
                   'converter':
                   [{'search': '@@\[=(.*?)=\]@@',
                     'replace': r'%%cd<\1>cd%%'},
                    {'search': '%%cd<(.*?)>cd%%',
                     'replace': r'{{{\1}}}'}]},
                  {'lable': 'tt', # monospaced (also no wiki highlighting)
                   'converter':
                   [{'search': '@@(.*?)@@',
                     'replace': r'%%tt<\1>tt%%'},
                    {'search': '%%tt<(.*?)>tt%%',
                     'replace': r'`\1`'}]},
                  {'lable': 'nh', # don't know how to suppress wiki syntax interpretation any better, not nice, though ...
                   'converter':
                   [{'search': '\[=(.*?)=\]',
                     'replace': r'%%nh<\1>nh%%'},
                    {'search': '%%nh<(.*?)>nh%%',
                     'replace': r'`\1`'}]},
                  # bold is equivalent "'''" (the same)
                  # italics is equivalent "''" (the same)
                  # underline: not present in PmWiki
                  {'lable': 'fl', # larger
                   'converter':
                   [{'search': '\[\+(.*?)\+\]',
                     'replace': r'%%fl<\1>fl%%'},
                    {'search': '%%fl<(.*?)>fl%%',
                     'replace': r'~+\1+~'}]},
                  {'lable': 'fs', # smaller
                   'converter':
                   [{'search': '\[-(.*?)-\]',
                     'replace': r'%%fs<\1>fs%%'},
                    {'search': '%%fs<(.*?)>fs%%',
                     'replace': r'~-\1-~'}]},
                  {'lable': 'su', # superscript
                   'converter':
                   [{'search': '\^\^(.*?)\^\^',
                     'replace': r'%%su<\1>su%%'},
                    {'search': '%%su<(.*?)>su%%',
                     'replace': r'^\1^'}]},
                  {'lable': 'sb', # subscript
                   'converter':
                   [{'search': '__(.*?)__',
                     'replace': r'%%sb<\1>sb%%'},
                    {'search': '%%sb<(.*?)>sb%%',
                     'replace': r',,\1,,'}]},

                  ## --- linking ---
                  # WikiWord is the same, a good regex for them in PmWiki: '(([A-Z][a-z0-9]*){2,})'
                  {'lable': 'ld1', # [[WikiGroup/WikiWord descriptive text]] or [[WikiWord descriptive text]]
                   'converter':
                   [{'search': '\[\[(?P<link>([A-Za-z0-9]*)(/[A-Za-z0-9]*)?) (?P<desc>.+?)\]\]',
                     'replace': r'%%ld<:../\g<link>:\g<desc>>ld%%'}]},
                  {'lable': 'ld2', # [[WikiGroup.WikiWord descriptive text]]
                   'converter':
                   [{'search': '\[\[(?P<group>[A-Za-z0-9]*)\.(?P<word>[A-Za-z0-9]*) (?P<desc>.+?)\]\]',
                     'replace': r'%%ld<:../\g<group>/\g<word>:\g<desc>>ld%%'}]},
                  {'lable': 'lf1', # free links w/ WikiGroups
                   'converter':
                   [{'search': '(?P<group>([A-Z][a-z0-9]*){2,})[./]{{(?P<word>.+?)}}',
                     'replace': r'%%lf</\g<group>/\g<word>>lf%%'}]},
                  {'lable': 'lf2', # free links
                   'converter':
                   [{'search': '([^{./]){{(.+?)}}([^}])',
                     'replace': r'\1%%lf<../\2>lf%%\3'}]},
                  # URL includes (images, too) work out of the box
                  {'lable': 'lu', # URL w/ alternative link text
                   'converter':
                   [{'search': '\[\[((http|ftp|mailto)\S+) (.+?)\]\]',
                     'replace': r'%%ld<\1 \3>ld%%'}]},
                  {'lable': 'afd', # attached file w/ alternative link text
                   'converter':
                   [{'search': '\[\[Attach:(\S+?) (.+?)\]\]',
                     'replace': r'%%ld<attachment:../\1 \2>ld%%'}]},
                  {'lable': 'ai', # inline attached images
                   'converter':
                   [{'search': 'Attach:(\S+?\.(jpg|jpeg|png|gif)) ',
                     'replace': r'inline:../\1 '}]},
                  {'lable': 'af', # attached files
                   'converter':
                   [{'search': 'Attach:(\S+?) ',
                     'replace': r'attachment:../\1 '}]},
                  {'lable': 'hp', # HomePage (wild removal without checks for any exceptions)
                   'converter':
                   [{'search': '[./]HomePage',
                     'replace': r''}]},
                  {'lable': 'ls', # simple WikiWord link
                   'converter':
                   [{'search': '([^/\w.:<["])([A-Z][a-z]([A-Z][a-z0-9]*)+)',
                     'replace': r'\1%%li<../\2>li%%'}]},
                  ## --- macros ---
                  {'lable': 'ma', # list of attachments for current page
                   'converter':
                   [{'search': '\[\[\$AttachList\]\]',
                     'replace': r'%%ma<AttachList>ma%%'}]},
                  {'lable': 'ms', # search field
                   'converter':
                   [{'search': '\[\[\$Search\]\]',
                     'replace': r'%%ma<FullSearch>ma%%'}]},

                  ## --- cleanups, fixes, and finalizations ---
                  # possible link correction for certain wiki links
                  {'lable': 'x1', # WikiWord ending in cap (e. g. "LaTeX")
                   'converter':
                   [{'search': '%%li<(.+?[A-Z])>li%%',
                     'replace': r'%%lf<\1>lf%%'}]},
                  {'lable': 'x2', # WikiGroup ending in cap (e. g. "LaTeX")
                   'converter':
                   [{'search': '%%li<(.+?[A-Z]/.+?)>li%%',
                     'replace': r'%%lf<\1>lf%%'}]},
                  {'lable': 'x3', # WikiWord w/ double caps may not link (e. g. "WikiWWord")
                   'converter':
                   [{'search': '%%li<(.*?[A-Z]{2,}.*?)>li%%',
                     'replace': r'%%lf<\1>lf%%'}]},
                  {'lable': 'x4', # WikiWord w/ numbers may not link (e. g. "Wiki2wiki")
                   'converter':
                   [{'search': '%%li<(.*?[0-9].*?)>li%%',
                     'replace': r'%%lf<\1>lf%%'}]},

                  # other finalizations
                  {'lable': 'ym', # put double brackets around macros
                   'converter':
                   [{'search': '', # intentionally blank, should trigger in second pass
                     'replace': r''},
                    {'search': '%%ma<(.*?)>ma%%',
                     'replace': r'[[\1]]'}]},
                  {'lable': 'yv', # put "{{{" and "}}}" around verbatim blocks
                   'converter':
                   [{'search': '%%vb<(.*?)>vb%%',
                     'replace': r'{{{\1}}}'}]},

                  # link finalization
                  {'lable': 'z0', # Wiki link change "." to "/"
                   'converter':
                   [{'search': '%%li<(.*?)\.(.*?)>li%%',
                     'replace': r'%%li<\1/\2>li%%'}]},
                  {'lable': 'z1', # Wiki link finalization
                   'converter':
                   [{'search': '', # intentionally blank, should trigger in second pass
                     'replace': r''},
                    {'search': '%%li<(.*?)>li%%',
                     'replace': r'\1'}]},
                  {'lable': 'z2', # Wiki free link finalization
                   'converter':
                   [{'search': '', # intentionally blank, should trigger in second pass
                     'replace': r''},
                    {'search': '%%lf<(.*?)>lf%%',
                     'replace': r'["\1"]'}]},
                  {'lable': 'z3', # Wiki link w/ description finalization
                   'converter':
                   [{'search': '', # intentionally blank, should trigger in second pass
                     'replace': r''},
                    {'search': '%%ld<(.*?)>ld%%',
                     'replace': r'[\1]'}]},
                  {'lable': 'z4', # WikiGroup.WikiWord (brutally do this "Word.Word" -> "["/Word/Word"]")
                   'converter':
                   [{'search': '', # intentionally blank, should trigger in second pass
                     'replace': r''},
                    {'search': '([A-Z][a-zA-Z0-9]*)\.([A-Z][a-zA-Z0-9]*)',
                     'replace': r'["/\1/\2"]'}]}
                 ]

    # Fix for straying wiki syntax within a verbatim block
    verbatimBlockFix = {'find': re.compile('^%%vb<(.*?)>vb%%', re.MULTILINE | re.DOTALL),
                        'kill': re.compile('%%[a-z][a-z0-9]<|>[a-z][a-z0-9]%%', re.MULTILINE | re.DOTALL)}


    def __init__(self):
        self.warningMessages = []
        self.moinWikiPages = os.path.join(moinWikiPath, 'wiki', 'data', 'pages')

        self.actions = []

        # Generate action list and compile regular expressions:
        for converter in self.converters:
            for i in range(len(converter['converter'])):
                item = converter['converter'][i]
                item['search'] = re.compile(item['search'], re.MULTILINE | re.DOTALL)
                action = {}
                action.update(item)
                action['lable'] = converter['lable']
                if (i+1) > len(self.actions):
                    self.actions.append([])
                self.actions[i].append(action)

        for item in self.warnings:
            item[0] = re.compile(item[0], re.MULTILINE | re.DOTALL)

        # Some variables to ease the processing:
        self.currentPage = []
        self.currentPageName = ''


    def convertPages(self):
        """
        The whole conversion magic.
        """
        pages = self.getWikiPages()

        print 'Processing ...'
        for page in pages:
            print ' ', '/'.join(page)

            self.currentPage = []
            self.currentPage.extend(page)
            if self.currentPage[1] == 'HomePage':
                self.currentPage.pop(1)
                self.warningMessages.append(self.currentPage
                    + ['Wiki links to and from converted WikiGroup.HomePage might be broken.'])
            self.currentPageName = '(2f)'.join(self.currentPage)

            # Do the actual parsing of this page and save it.
            text = self.blockParser(self.getSource(page))

            # Now put the converted page into MoinMoin.
            self.makePage(text)

        print '\n\nWarnings: '
        self.warningMessages.sort()
        for item in self.warningMessages:
            print '%s: %s' % (item[0], item[1])


    def getWikiPages(self):
        """
        Retrieves a list of all still relevant pages, each in the form
        [WikiGroup, WikiWord].
        """
        allPages = os.listdir(os.path.join(pmWikiPath, 'wiki.d'))
        relevantPages = []

        for page in allPages:
            fragments = page.split(',')
            if len(fragments) == 1:
                fragments = fragments[0].split('.')
                # Skip files that are not named "WikiGroup.WikiWord" as well
                # as PmWiki's internal files and pages.
                if len(fragments) == 2 and fragments[1] not in (
                        'htaccess', 'flock', 'mailposts',
                        'RecentChanges', 'RecentUploads',
                        'AllRecentChanges', 'AllRecentUploads',
                        'SearchWiki', 'WebMenu'):
                    relevantPages.append(fragments)

        return relevantPages


    def getSource(self, page):
        """
        Uses a HTTP request to the installed PmWiki to retrieve the
        wiki source of the individual page.
        """
        wikiGroup, wikiPage = page
        url = '/'.join((pmWikiBase, wikiGroup, wikiPage)) + '?action=source'
        # Use the host from the configuration section instead of a
        # hard-coded name.
        connection = httplib.HTTPConnection(pmWikiHost)
        connection.request('GET', url)
        response = connection.getresponse()
        if response.status != 200:
            # Warnings are [page name, message] pairs.
            self.warningMessages.append(['/'.join(page),
                                         'Could not retrieve the original wiki page.'])
            print 'HTTP error:', response.status, response.reason,
        source = response.read()
        connection.close()

        return source.strip()


    def blockParser(self, text):
        """
        The block parser deals with the whole text to be converted.
        It applies all conversion passes to the text as a whole.
        """
        # Check for things to be aware of and warn about later on.
        for warning in self.warnings:
            if warning[0].search(text):
                self.warningMessages.append(['/'.join(self.currentPage), warning[1]])

        # Get some block conversion done before converting on line level.
        for conversionPass in self.actions:
            for conversion in conversionPass:
                # Repair errors in verbatim block conversion
                if conversion['lable'] == 'yv' and conversion['replace']:
                    text = self.verbatimBlockFix['find'].sub(self._removeCurlyBrackets, text)

                text = conversion['search'].sub(conversion['replace'], text)

        return text


    def _removeCurlyBrackets(self, match):
        """
        This one removes the unwanted straying "{{{" and "}}}" from verbatim blocks.
        """
        return '%%%%vb<%s>vb%%%%' % self.verbatimBlockFix['kill'].sub('', match.group())


    def makePage(self, text):
        currentFilePath = os.path.join(self.currentPageName, 'current')
        revisionsPath = os.path.join(self.currentPageName, 'revisions')
        revision = '00000001'
        contentFilePath = os.path.join(revisionsPath, revision)

        # Overwriting pages if selecting only some.
        if not os.path.exists(self.currentPageName) or editExistingPages:
            if not os.path.exists(self.currentPageName):
                try:
                    os.mkdir(self.currentPageName)
                    os.mkdir(revisionsPath)
                except OSError, err:
                    # Warnings are [page name, message] pairs.
                    self.warningMessages.append(['/'.join(self.currentPage),
                        'Could not create a directory needed for the wiki page.'])
                    print err,

            text = unicode(text, "latin1").encode("utf8")
            try:
                # This will be the page revision ID.
                fileHandler = open(currentFilePath, 'w')
                fileHandler.write(revision)
                fileHandler.close()
                # This will be the content for that ID.
                fileHandler = open(contentFilePath, 'w')
                fileHandler.write(moinAcl + '\n' + text)
                fileHandler.close()
            except Exception, err:
                self.warningMessages.append(['/'.join(self.currentPage),
                    'Could not write the content of the wiki page.'])
                print err,
        else:
            print '(*** already exists, skipping ***)',

if __name__ == '__main__':
    converter = PmConverter()
    converter.convertPages()
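The converter list relies on a two-pass placeholder scheme: the first pass wraps each match in a marker such as `%%h2<...>h2%%`, and only the second pass turns the markers into MoinMoin markup. The Python 3 sketch below illustrates why this helps, using four rules modelled on the script's heading, numbering and verbatim-line converters. The rule order, the `RULES` table and both function names are chosen for this illustration only and do not match the script's list.

```python
import re

# Each rule: (first-pass pattern, placeholder,
#             second-pass pattern, final MoinMoin markup).
RULES = [
    (r'^!! *(.*?)$', r'%%h2<\1>h2%%', r'%%h2<(.*?)>h2%%', r'== \1 =='),
    (r'^! *(.*?)$',  r'%%h1<\1>h1%%', r'%%h1<(.*?)>h1%%', r'= \1 ='),
    (r'^# *(.*?)$',  r'%%n1<\1>n1%%', r'%%n1<(.*?)>n1%%', r' 1. \1'),
    (r'^ (.*?)$',    r'%%vl<\1>vl%%', r'%%vl<(.*?)>vl%%', r'`\1`'),
]


def convert_two_pass(text):
    """First replace PmWiki markup by placeholders, then finalize them."""
    for search, placeholder, _, _ in RULES:
        text = re.sub(search, placeholder, text, flags=re.MULTILINE)
    for _, _, search, final in RULES:
        text = re.sub(search, final, text)
    return text


def convert_one_pass(text):
    """Naive variant: write the final markup immediately."""
    for search, _, _, final in RULES:
        text = re.sub(search, final, text, flags=re.MULTILINE)
    return text


source = "!! Title\n# one\n code"
print(convert_two_pass(source))  # "== Title ==", " 1. one", "`code`"
print(convert_one_pass(source))  # the list item ends up as "`1. one`"
```

In the one-pass variant the numbered item is first rewritten to ` 1. one`, and the leading space then makes the later verbatim-line rule match it again. With placeholders, converted text no longer looks like PmWiki markup during the first pass, so later rules leave it alone.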