sakana2moin.py
Abstract
Author: DanSandler
Usage: sakana2moin.py <sakana-files-dir> <moin-files-dir>
Convert pages from the Sakana wiki syntax to MoinMoin markup. Requires shell access to the Moin installation, because it actually creates a pile of files which need to be moved into <moin>/data/text.
For Sakana pages with comments, the output MoinMoin page will have the comments appended to its body, separated by horizontal rules and each signed "--SomeName at TIME". Special (user) pages have the (user) prefix stripped from their names.
Limitations/Bugs
Doesn't deal well with UTF-8 pagenames -- you end up with _XX_XX_XX.
- You can use or copy i18n.recode to recode from the original name to utf8
The {mailto} macro is not detected, so it is never converted to MoinMoin's [[MailTo]] macro.
Totally gives up on fancy (meta) and (topic) pages.
Should use MoinMoin's date/time macros in comment conversion.
Has a bunch of crazy macro stuff in it specific to our Sakana and Moin installations; it won't hurt you if you use it on your stock Sakana installation, but chances are you don't have the {change} or {quote} macros...
About Sakana
Sakana (魚 -- Japanese for "fish") is a Python wikisystem written (as a "hello world", no less!) by Brian Swetland. It has some really interesting features, including:
- a powerful comment system that automatically creates new pages for comments and includes them inline in the master page
- ownership of pages
- ACL and user "classes" (editor, read-only, etc.)
Of course, Brian's not maintaining it anymore, and the installation is pretty fragile (as seems to be the case with all wikis that run their own HTTPd).
Implementation
1 #!/usr/bin/env python
2 # sakana2moin.py
3 # by dan sandler <dan.sandler /at/ gmail DOTCOM>
4 # converts sakana wikipages into moin wikipage files
5 # usage: sakana2moin.py <srcdir> <destdir>
6 # version: 0.1
7
8 import sys, os, re, time
9
10 from MoinMoin import wikiutil
11
class Translator:
    """Stream translator from Sakana wiki markup to MoinMoin markup.

    Reads lines from an input file object (translate) and writes the
    converted markup to the output file object given at construction.
    State carried across lines:
      _inlist       -- nesting depth of {list} blocks
      _indent       -- current indent prefix written at each line start
      _newline      -- flag: the next output() call starts a fresh line
      _preformatted -- nesting depth of {code}/{text}/{tt} blocks
    """
    # One MoinMoin indent step; outdent() strips exactly this much.
    indentstring = ' '
    indentlength = len(indentstring)

    def __init__(self, OUT):
        # OUT: writable file object receiving the MoinMoin markup.
        self._out = OUT
        self._inlist = 0
        self._indent = ""
        self._newline = 1
        self._preformatted = 0

    def indent(self):
        # Push one list/quote level: grow the indent prefix by one step.
        self._indent = self._indent + Translator.indentstring

    def outdent(self):
        # Pop one list/quote level: drop one indent step from the prefix.
        self._indent = self._indent[Translator.indentlength:]

    def output(self, s):
        """Write s, emitting the current indent prefix at line starts."""
        if self._newline:
            self._out.write(self._indent)
            self._newline = 0
        if re.search('\n', s):
            # s contains a newline, so the next write begins a new line.
            self._newline = 1
        self._out.write(s)

    def translate(self, IN):
        """Translate the whole Sakana stream IN into MoinMoin markup.

        IN: readable file object containing Sakana markup.  Output goes
        to the file object passed to __init__.
        """
        self._in = IN
        while 1:
            line = self._in.readline()
            if line == '': break
            if self._preformatted:
                # Inside {code}/{text}: keep leading whitespace, only
                # trim trailing newlines.
                line = re.sub(r'\n+$', '', line)
            else:
                line = line.strip()

            # full-line macros
            m = re.search(r'{list}', line)
            if m:
                self._inlist += 1
                self.indent() ; continue

            m = re.search(r'{/list}', line)
            if m:
                self._inlist -= 1
                self.outdent() ; continue

            # full-line macros with possible internal text

            postfix = '\n'
            # '@' prefix: Sakana H1 -> MoinMoin "= ... ="
            m = re.search(r'^@\s*(.*)$', line)
            if m:
                line = m.group(1)
                postfix = ' =\n'
                self.output("= ")

            # '-' prefix: list bullet inside {list}, heading otherwise.
            m = re.search(r'^-\s*(.*)$', line)
            if m:
                line = m.group(1)
                if self._inlist:
                    self.output("* ")
                else:
                    # I feel like sakana's - operator is more like a H3 than a
                    # H2 ... is that just me?
                    postfix = " ===\n"
                    self.output("=== ")

            # inline formatters
            # Split on {tag} macros; the capturing group keeps the
            # delimiters in the result so tags and text interleave.
            chunked = re.split(r'({[^}]*})', line)
            for chunk in chunked:
                if len(chunk) > 0 \
                   and chunk[0] == '{' and chunk[-1] == '}':
                    tag = chunk[1:-1]

                    # preformatted blocks: {code}/{text} -> {{{ ... }}}
                    if tag in ('code','text'):
                        self.output("{{{\n")
                        self._preformatted += 1
                        continue

                    if tag in ('/code','/text'):
                        self.output("}}}\n")
                        self._preformatted -= 1
                        continue

                    # {part} / {quote|PageName}: indented block, with an
                    # attribution line for quotes.
                    m = re.search(r'^(part|quote)(.*)$', tag)
                    if m:
                        if m.group(1) == "quote" and m.group(2).startswith("|"):
                            pagename = m.group(2)[1:]
                            if pagename.startswith("(user)"):
                                pagename = pagename[len("(user)"):]
                            self.output("""''Quote from ["%s"]:''"""
                                        % pagename)
                        self.indent()
                        self.output("\n") #start the indent
                        continue

                    if tag in ('/part','/quote'):
                        self.outdent()
                        continue

                    # bold / italic toggles (same marker opens and closes)
                    if tag == 'b' or tag == 'strong' \
                       or tag == '/b' or tag == '/strong':
                        self.output("'''")
                        continue

                    if tag == 'i' or tag == 'em' \
                       or tag == '/i' or tag == '/em':
                        self.output("''")
                        continue

                    # {tt}: monospace; also suppresses strip() on lines
                    # while open, via the _preformatted counter.
                    if tag == 'tt' or tag == '/tt':
                        self.output("`")
                        if tag == 'tt':
                            self._preformatted += 1
                        else: self._preformatted -= 1
                        continue

                    if tag == 'hr':
                        self.output("----")
                        continue

                    if tag == 'br':
                        self.output("[[BR]]")
                        continue

                    # {verb|raw html} -> MoinMoin HTML macro
                    m = re.search(r'^verb\|(.*)$', tag)
                    if m:
                        self.output("[[HTML(%s)]]" % m.group(1))
                        continue

                    # site-specific macros: {eroom|..}, {change|..}, {bug|..}
                    m = re.search(r'^(eroom|change|bug)\|(.*)$', tag)
                    if m:
                        if m.group(1) == "eroom":
                            func = "ERoom"
                        elif m.group(1) == "change":
                            func = "Change"
                        elif m.group(1) == "bug":
                            func="Bug"

                        # optional "|display name" after the target
                        m2 = re.search(r'^(.*)\|(.*)', m.group(2))
                        if m2:
                            page = m2.group(1)
                            name = ',' + m2.group(2)
                        else:
                            page = m.group(2)
                            name = ''
                        self.output("[[%s(%s%s)]]" % (func, page, name))
                        continue

                    # {page|Target|Label} -> MoinMoin [:Target:Label]
                    m = re.search(r'^page\|(.*)$', tag)
                    if m:
                        m2 = re.search(r'^(.*)\|(.*)', m.group(1))
                        if m2:
                            page = m2.group(1)
                            name = m2.group(2)
                        else:
                            page = m.group(1)
                            name = ''
                        self.output("[:%s:%s]" % (page, name))
                        continue

                    # {link|url|Label} -> external link [url Label]
                    m = re.search(r'^link\|(.*)$', tag)
                    if m:
                        m2 = re.search(r'^(.*)\|(.*)', m.group(1))
                        if m2:
                            page = m2.group(1)
                            name = ' ' + m2.group(2)
                        else:
                            page = m.group(1)
                            name = ''
                        self.output("[%s%s]" % (page, name))
                        continue

                    # anything else: keep it visible in the output
                    self.output('`[Unknown macro: %s]`' % chunk)

                else:
                    # blocks of text
                    # must substitute [OtherPage]
                    pagelink_chunks = re.split(r'(\[[^\]]*\])', chunk)
                    for subchunk in pagelink_chunks:
                        if len(subchunk) > 2 \
                           and subchunk[0] == '[' and subchunk[-1] == ']':
                            # OK, special pages!
                            page = subchunk[1:-1]
                            m = re.match(r'\(user\)(.*)$', page)
                            if m: page = m.group(1) #users are regular pages
                            self.output('["%s"]'% page)
                        else:
                            # finally, text with no tags or ANYTHING.
                            self.output(re.sub(r'\\', '', subchunk))

            self.output(postfix)
def dehexify(s):
    """Decode a Sakana filename: consecutive 2-digit hex pairs -> chars."""
    return ''.join(chr(int(s[pos:pos + 2], 16))
                   for pos in range(0, len(s), 2))
205
def hexify(name):
    """Encode a page name as Sakana's on-disk form: each character's
    ordinal as unpadded lowercase hex, concatenated."""
    return ''.join('%x' % ord(ch) for ch in name)
211
class Snip:
    """One Sakana page ("snip"): metadata, raw markup, and its comments.

    A snip named N is stored in srcdir as two files: hexify(N)+':meta'
    (a Python-literal dict with at least a "name" key, optionally
    "notes", "created_by", "created_at") and hexify(N)+':data' (the
    Sakana markup itself).
    """
    def __init__(self, srcdir, name):
        self.srcdir = srcdir
        self.name = name
        filepath = os.path.join(srcdir, hexify(name))
        # SECURITY: eval() of the :meta file executes arbitrary Python;
        # only run this converter on a Sakana data dir you trust.
        metafile = open(filepath + ':meta')
        try:
            self.info = eval(metafile.read())
        finally:
            metafile.close()
        if not self.name == self.info["name"]:
            # was `raise "names not consistent..."` -- string exceptions
            # are a TypeError in modern Python; raise a real exception.
            raise ValueError("names not consistent: '%s', '%s'"
                             % (self.name, self.info['name']))
        self.datafile = filepath + ':data'

    def appendToMoinStream(self, outfile):
        """Translate this snip's markup onto outfile, then append its
        comment snips (recursively) under a "== Comments ==" header."""
        t = Translator(outfile)
        text = open(self.datafile)
        try:
            t.translate(text)
        finally:
            text.close()

        # now, append any notes; a snip without a "notes" key simply
        # has no comment section (hence the KeyError swallow).
        try:
            emitted_hdr = 0
            for note in self.info["notes"]:
                noteSnip = Snip(self.srcdir, note)
                print(" note: %s <%s>" % (self.name, self.datafile))
                if not emitted_hdr:
                    outfile.write("== Comments ==\n")
                    emitted_hdr = 1
                else:
                    outfile.write("----\n")
                noteSnip.appendToMoinStream(outfile)
                # attribution line: strip "(user)" from the author name
                outfile.write("""'' -- ["%s"] at %s''\n""" % (
                    noteSnip.info["created_by"][len("(user)"):],
                    time.strftime("%X %x",
                                  time.localtime(noteSnip.info['created_at']))
                ))
        except KeyError: pass

    def toMoin(self, destdir):
        """Write this snip as a MoinMoin page file under destdir."""
        moin_name = self.name
        if moin_name.startswith("(user)"):
            # user pages become ordinary pages in MoinMoin
            moin_name = moin_name[len("(user)"):]

        # MoinMoin has its own bizarre ideas about how to encode file names
        moin_name = wikiutil.quoteFilename(moin_name)

        print("page: %s <%s>" % (self.name, self.datafile))

        out = open(os.path.join(destdir, moin_name), 'w')
        try:
            self.appendToMoinStream(out)
        finally:
            out.close()
259
if __name__ == '__main__':
    if len(sys.argv) < 3:
        print("usage: sakana2moin.py <srcdir> <destdir>")
        sys.exit(1)

    srcdir = sys.argv[1]
    destdir = sys.argv[2]

    # Each page is a pair of files, <hexname>:meta and <hexname>:data;
    # drive the conversion off the :data files.
    srcfiles = os.listdir(srcdir)
    for fn in srcfiles:
        if not fn.endswith(":data"):
            continue

        pagename = dehexify(fn[:-len(":data")])

        if pagename.startswith("(note)"):
            # comment snips are emitted along with their parent page
            continue
        elif pagename.startswith("(meta)") \
                or pagename.startswith("(topic)"):
            print("warning: skipping '%s' (can't handle meta/topic snips)" % pagename)
            continue

        Snip(srcdir, pagename).toMoin(destdir)
292
293 # vim: ft=python ts=4 sts=4 sw=4 expandtab: