Unify parsers and macros to simplify both the user interface and the code
See also UnifyMarkup
Unify Parsers and Macros
After having finished with UnifyParsersAndProcessors, it's time to take a look at parsers and macros.
A macro is an extension plugin that can be called from wiki markup to produce some output depending on the arguments given to it. In contrast, a parser is an extension plugin that produces some output depending on some parameters and a block of lines given to it.
Conclusion: A macro is a parser that gets an empty block of lines (or _lines=None).
So if we could unify parsers and macros, we get a nice class based api for both types of extension plugins: Macro and Parser.
We are dealing with two different problems: the code handling plugins and the wiki markup. Unifying the code that handles macros, parsers and plugins in general is good, but it is not related to the markup. Macro and parser are not the same: a parser has content.
- Who says that a macro couldn't have content? If we can handle both with the same code, there is not much point in making a different user interface for each and speaking of two different things, "macros" and "parsers". Fewer items are easier for users to learn, so instead of two things (parsers, macros) we will have only one item (no matter what we decide to call that thing).
We can use one namespace for macros and parsers, e.g. PythonParser, CPPParser, FullSearch. This makes sense as all are page extensions and do the same thing: "process some text and output some page content". OTOH actions, formatters and themes are wiki engine extensions.
I am not sure that actions wouldn't be page extensions in some cases, because we could write user interfaces which interact with the page body. An action could be called from a macro or parser too, so you could write a macro which starts a user interface, e.g. NewPage. We have two kinds of action macros now: one is of the type of DeletePage and the other is like the new SlideShow. -- ReimarBauer 2005-04-29 21:18:21
The name of the thing
Ongoing GoogleSoc2008 work of BastianBlank implements a MoinMoin.converter2 package. converter2 because we already have a converter package for the gui editor that follows a different api and goal. Nevertheless, if we speak of a converter below, we mean converter2.
Discussion
What will the converter markup look like?
Current 1.6/1.7 markup
<<ExtensionName(arguments)>>     # current 1.6/1.7 macro syntax,
                                 # arguments may be processed by argument parser magic

{{{#!ExtensionName arguments     # current 1.6/1.7 parser syntax,
line                             # not much argument processing
line
line
}}}

Note: includes some magic so that more than 3 { also work (for nesting).
We could just keep that markup, parse it and map it to a converter call:
Macro:  converter(arguments, _lines=None)
Parser: converter(arguments, _lines=[line, line, ...])
We should process both argument lists with the same argument parser.
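The mapping described above can be sketched in a few lines of standalone Python. This is only an illustration under simplifying assumptions: the two regular expressions handle exactly three braces and no nesting, and `converter` and `dispatch` are hypothetical names that are not part of MoinMoin.

```python
import re

# Simplified patterns for the two existing 1.6/1.7 notations
# (assumption: exactly three braces, no nesting).
MACRO_RE = re.compile(r"<<(?P<name>\w+)(?:\((?P<args>.*?)\))?>>")
PARSER_RE = re.compile(
    r"\{\{\{#!(?P<name>\w+)[ \t]*(?P<args>[^\n]*)\n(?P<body>.*?)\n?\}\}\}",
    re.DOTALL,
)


def converter(name, arguments, _lines=None):
    """Hypothetical unified entry point; it only reports what it received."""
    return (name, arguments, _lines)


def dispatch(markup):
    """Map macro or parser markup to one and the same converter() call."""
    match = MACRO_RE.fullmatch(markup)
    if match:
        # A macro is a converter call without a block of lines.
        return converter(match.group("name"), match.group("args") or "",
                         _lines=None)
    match = PARSER_RE.fullmatch(markup)
    if match:
        # A parser is the same call, plus the lines of its block.
        return converter(match.group("name"), match.group("args").strip(),
                         _lines=match.group("body").split("\n"))
    raise ValueError("not an extension call: %r" % markup)
```

Both notations end up in the same call; the only difference is whether `_lines` is `None` or a list, and the argument string is handed over unparsed so that a single argument parser can deal with it.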
PluginBase Class
class PluginBase:
    # file extensions this plugin may apply to...
    _extensions = None
    # mimetypes this extension may apply to...
    _mimetypes = None
    # they are either None (== no parser like extension)
    # or a list of strings (extensions or mimetype globs)

    # what dependencies (caching) does this plugin have?
    _dependencies = ["time"]

    # how to parse the arguments for this plugin?
    _parameters = None

    def __init__(self, _request, _lines, **kw):
        # underscore, so we won't collide with kwargs
        self._request = _request
        self._lines = _lines
        # just save the kwargs as self attributes
        for key in kw:
            setattr(self, key, kw[key])

    def help(self):
        ''' Describe what your plugin can do.

        The base class can have code to produce automatic help from
        available instance attributes.
        '''
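As a usage illustration, here is a minimal sketch of one plugin class serving both as a "macro" (no block of lines) and as a "parser" (with a block of lines). The `PluginBase` below is a reduced stand-in for the draft above, and the `Shout` plugin, its `text` argument and its `render` method are hypothetical names chosen only for this example.

```python
class PluginBase:
    """Reduced stand-in for the draft base class (assumption: only __init__)."""

    def __init__(self, _request, _lines, **kw):
        self._request = _request
        self._lines = _lines
        # keyword arguments simply become instance attributes
        for key, value in kw.items():
            setattr(self, key, value)


class Shout(PluginBase):
    """Hypothetical plugin that upper-cases whatever it is given."""

    text = ""  # default for the (hypothetical) 'text' argument

    def render(self):
        if self._lines is None:
            # macro-style call: no block of lines, work from arguments only
            return self.text.upper()
        # parser-style call: process the block of lines
        return "\n".join(line.upper() for line in self._lines)


as_macro = Shout(None, None, text="hello")
as_parser = Shout(None, ["hello", "world"])
```

The same class covers both cases; the call site only decides whether `_lines` is `None` or a list.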
Plugin Protocol
Instead of defining calls that raise or return nothing, we can define an informal protocol. Plugins that want to answer these calls from the parser will implement them. We could achieve the same by putting these methods in the base class, but that makes things harder when the base class returns something that you don't want. If we define raise NotImplementedError for each such method, the poor developer who subclasses this will have to override ALL the methods she does not care about, just to remove that raise.
That was an example. Whether a raise or a return value is best has yet to be determined. But defining a base class which is not a base class but only documentation is kind of stupid. Why do we use an object-oriented language at all, if we don't use its features? -- OliverGraf 2004-11-05 09:44:44
- This was only an example. You could define the protocol without a class; it's just a list of methods. ObjC has a formal protocol syntax, but many protocols are not formal and are just described as an interface. Formal protocols are used when time can be saved by testing whether an object conforms to a protocol, instead of checking for each method separately. Informal protocols are used when an object might implement some of the methods, and we don't care which. Python lacks the concept of "protocol" or "interface", but it's versatile enough to use the concept.
In the case of help(), it is useful if the base class can produce some automatic help for you, based on your methods' docstrings and names, so it's better to have this in a base class. The same goes for automatic argument parsing code etc.
By using a protocol, any object can be a plugin. Parser and plugins just have to use the same protocol; they don't care what the object type is, just which protocol methods it chooses to implement.
""" Draft for Informal protocol for plugins.

When you write your plugin, copy and paste the method
prototypes from here.

This protocol might be extended in future versions. You don't have
to implement any of these methods - just what makes sense for your
plugin.
"""

def executeInParagraph(self, formatter):
    """ What the plugin should do in a paragraph context

    Example:
        This is text [[ThisMacro]] more text

    In this context, the parser will call you with
    executeInParagraph(). If you don't implement this, the
    parser will ignore you. You can return an error message
    if it does not make sense to put the macro in this
    context.

    @rtype: unicode
    @return: formatted content
    """

def executeInPage(self, formatter):
    """ What the plugin should do in a page context

    Example:
        This is a wiki paragraph

        [[ThisMacro]]

        Another paragraph...

    In this context, the parser will call you with
    executeInPage(). If you don't implement this, the
    parser will ignore you. You can return an error message
    if it does not make sense to put the macro in this
    context.

    @rtype: unicode
    @return: formatted content
    """
How can the parser use such an object?
def _repl_plugin(self, text):
    # Get plugin instance (using oliver/fabi parser and import plugin)
    plugin = self._getPlugin(text)
    # Assumption: the formatter tracks whether a paragraph is currently open
    if self.formatter.in_p:
        execute = getattr(plugin, 'executeInParagraph', None)
    else:
        execute = getattr(plugin, 'executeInPage', None)

    if execute:
        # the parser does not have to open a paragraph or div or
        # whatever for a plugin. The plugin writer is responsible for
        # correct html element in either a page context or paragraph
        # context.
        output = execute(self.formatter)
        self.request.write(output)
    else:
        # Maybe error message, or plugin markup in red, or just ignore
        pass
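The dispatch above can be tried out in isolation. The following is a minimal standalone sketch: `PlainFormatter` is a stand-in for the real formatter, and the two plugin classes and `run_plugin` are hypothetical. It shows that unrelated classes can take part in the protocol simply by implementing the methods they care about.

```python
class InlineOnly:
    """Hypothetical plugin that only makes sense inside a paragraph."""

    def executeInParagraph(self, formatter):
        return formatter.text("inline output")


class Both:
    """Hypothetical plugin that answers in both contexts."""

    def executeInParagraph(self, formatter):
        return formatter.text("short form")

    def executeInPage(self, formatter):
        return formatter.text("long form")


class PlainFormatter:
    """Stand-in formatter; the real formatter API is much richer."""

    def text(self, value):
        return value


def run_plugin(plugin, formatter, in_paragraph):
    """Call the protocol method for the current context, if implemented."""
    name = "executeInParagraph" if in_paragraph else "executeInPage"
    execute = getattr(plugin, name, None)
    if execute is None:
        return None  # the plugin does not take part in this context
    return execute(formatter)
```

Neither plugin inherits from a common base class; the caller only asks whether the method for the current context exists.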
markup conversion
We should have some markup conversion rules ready for a migration script:
old syntax (parser, processor or macro) ---- new syntax (PPM)
{{{ plaintext }}} ---- `plaintext` # hmm, does this line show we need 2 quoting methods anyway? # needs backtick_meta = 1 hardcoded
{{{ ... pre section ... }}} ---- {{{ # no specification = default to Parser(text/plain, addpre=1) ... pre section ... }}}
{{{#!python ... py src }}} ---- {{{#!python # or PythonParser !? - shortcut for Parser(python) / Parser(text/x-python) ... py src ... }}}
<<BR>> ---- <<BR>> # implicitly this is BR()
<<OtherMacro(arg1,arg2)>> ---- <<OtherMacro(arg1,arg2)>>
Syntactic sugar
Some parsers/processors take their own arguments, for example bibtex in ParserMarket and SimpleTable in ProcessorMarket.
{{{!#SimpleTable 4 ... contents ... }}} ---- {{{SimpleTable 4 ... contents ... }}}
Or we can allow another syntax for arguments.
{{{SimpleTable(4) ... contents ... }}}
We can extend this idea to macros.
<<OtherMacro(arg1,arg2)>> ---- <<OtherMacro arg1 arg2>> or <<OtherMacro(arg1,arg2)>>
And for arguments, we can design an escape mechanism.
{{{Macro1(arg1,second arg with space)}}}
{{{Macro2 arg1 second,arg,with,comma}}}
{{{Macro3("arg 1", "arg 2, with comma")}}}
I think we should always go for quoted argument lists. My new OliverGraf/ArgumentParser already supports python like argument lists, with some extensions for bools. -- OliverGraf 2005-03-11 06:55:27
If we do quoted argument lists, then let's use SGML-like attributes, which are easier to use than positional arguments, better known than Python conventions, and will allow us to merge css attributes naturally, e.g. let us set a css class on a wiki element.
{{{TableOfContents level="2" class="sidebar"}}}
{{{ParserName line-numbers="on" class="folding"
Text...
}}}
That would make argument lists longer, but it is still well defined, so I have no problem with this. Making just the OP(,) in the parser optional would accept SGML-like attribute lists as well as Python-style args including positional arguments, but I think allowing this mix would result in more confusion. So we need to settle on one syntax. -- OliverGraf 2005-03-11 11:32:50
In any case the syntax should not forbid using " or ' in the arguments like the search engine currently does. -- AlexanderSchremmer 2005-03-11 16:27:36
With the converter implementation I currently have the problem that I'm unable to parse the arguments without context. This would make the implementation a lot easier. -- BastianBlank 2009-02-19 12:45:20
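To make the two candidate argument syntaxes discussed above concrete, here is a small standalone sketch using only the Python 3 standard library. It is not the OliverGraf/ArgumentParser code and not a MoinMoin API; the function names are hypothetical, and treating bare words such as red as strings is an assumption of this example.

```python
import ast
import shlex


def parse_python_style(text):
    """Parse 'a, "b c", key=value' into (positional list, keyword dict)."""
    call = ast.parse("f(%s)" % text, mode="eval").body
    if not (isinstance(call, ast.Call) and isinstance(call.func, ast.Name)):
        raise ValueError("not a plain argument list: %r" % text)

    def value(node):
        if isinstance(node, ast.Name):
            return node.id  # assumption: a bare word is taken as a string
        return ast.literal_eval(node)  # quoted strings, numbers, True/False

    positional = [value(arg) for arg in call.args]
    keywords = {kw.arg: value(kw.value) for kw in call.keywords}
    return positional, keywords


def parse_sgml_style(text):
    """Parse 'Name key="value" other="x y"' into (name, attribute dict)."""
    tokens = shlex.split(text)  # honours both " and ' quoting
    name, attributes = tokens[0], {}
    for token in tokens[1:]:
        key, sep, val = token.partition("=")
        if not sep:
            raise ValueError('expected key="value", got %r' % token)
        attributes[key] = val
    return name, attributes
```

Both functions work on the argument string alone, without any surrounding page context, and both accept quotes and commas inside quoted values.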
{{ }} for "include"?
While looking for new ideas on another wiki, I found that they use {{image.png}} for showing ("including") images.
I thought that {{ItemName or URL}} could in general mean "include the content from that place here", while [[ItemName or URL]] means "link to that content".
That conflicts with the usage of {{text}} for plain / pre / verbatim text (see section above), but maybe we could simply make "include" processing the default and choose something more verbose for verbatim text.
On first look it also isn't quite intuitive why "include" is similar to PPM (or why it is even the default of it), but on the second look one notices, that this just means "include the output of that PPM here" or "include the output we get when doing default processing on the mimetype we find there", so this is not that far-fetched.
It would also be easy to think of {{ as taking single-character "parameters", so that {{{, possibly used for extensions, simply means to the human wikizen "an included thing, that had code do some of the work". Other {{ or [[ thingies are easily imagined then, e.g. [[*icon* link]] mark the link special somehow, {{| big box around my inclusion |}} {{[lang] run parser on my attachment for display }} ... ok, so these are not consistent about where the "end mark" of the inner parameter is, but the human sees the usability much better than with the cumbersome {{Include(pagename)}} {{Parser(language,linkname)}} -- why should wikizens need to know what we call these things? They should be able to concentrate on the effects.
The markup must be clear and handle all the hard cases, like page or file names with spaces. Therefore, {{:pagename label}} does not work. {{ThisName}} is also not clear; does it run ThisName macro, or include ThisName page content, or include ThisName image?
Remember that part of this is how code behaves while part of this is how humans will behave when they see it! To code, {{: and {{= are not the same thing. They might use the same base class, perhaps.
- Exactly, people should use clear and consistent markup. The stuff they use many times, like page names, should be short and easy to type, the stuff they use less can be more verbose.
All markup can use the same Smalltalk-like syntax:
{{object:value variable:value}}
Which happens to be like our acl syntax:
#acl: Name:rightlist AnotherName:rightlist
And like our search syntax:
title:This regex:That AnotherTerm
Here, object can be http, page, image, or plugin, each of which is a Renderer that renders the contents of the resource in the page. All of them can be implemented as macros which can be overridden by private wiki macros. So, if someone wants to add custom icons for http links, they can simply use their own dancing and singing http renderer.
Making wandering wikizens know all those things about how it works under the hood is crazy. We should want them to type less and concentrate on their real topic, not on how moin works! -- HeatherStern
Why do you think the users have to know about that? They will continue to use http://example.com just like they did before; they will not have any idea that http is just a plugin like UserPreferences. This stuff is important only to the people who want to customize moin to their needs, without any effort on our side.
Here is what common markup will look like:
{{http://example.com/image.png title:Example com amazing image}} - an image with tooltip
{{page:Page Name}} - include all page content
[[page:Page Name]] - link to local page, equivalent to [[Page Name]]
[[page:Page Name title:Different Title]] - link with title
[[page:Page Name image:Image.png]] - link with an image
[[page:Page Name title:Different Title image:Image.png]] - link with an image and text
{{page:Page Name from:"^= Here =$" to:"^= There =$"}} - include part of a page
{{wiki:WikiName/PageName}} - include a page from another wiki
[[wiki:WikiName/PageName]] - link to a page from another wiki (current markup)
- both confusable with Page/Subpage syntax
or [[wiki:WikiName page:PageName]] - consistent, little more typing
or [[wiki:WikiName:PageName]] - without page: key, little strange.
or just use a URL like scheme: wiki://WikiName/Page/SubPage
- Advantages: easy to parse, a well known syntax, no confusion anywhere
- But then why do page: and plugin: not use //? It would be very strange to use plugin://name...
{{plugin:UserPreferences}} - macro call
{{plugin:Python lineNumbers:off \n code...}} - code area using python parser
We can use one namespace as the default, maybe the pages namespace, as it's the most used namespace in a wiki. Variable names can be omitted to save typing: [[:Page Name :Different Title]] - this syntax already exists: [:Page Name :Different Title] -> Different Title.
If an attached image is stored just like a regular page, for example as SomePage/ImageName.png, then including an image is just {{SomePage/ImageName.png}}, and a link to that image is [[SomePage/ImageName.png]]
This will simplify the parser, as there will be no more magic words to parse. Only text, ''text style'', = heading =, * list, #. ordered list, [[link]] and {{include}}. Both include and link will use the same parser for the content.
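A rough idea of how the shared content parser for [[link]] and {{include}} could split the Smalltalk-like `key:value` pairs is sketched below. This is an assumption-laden illustration, not proposed MoinMoin code: `PAIR_RE` and `parse_pairs` are hypothetical names, a value runs until the next ` key:` or the end of the content, double quotes protect values, and the shorthand with omitted variable names is not handled.

```python
import re

# key, then either a double-quoted value or the shortest text that is
# followed by the next " key:" or by the end of the content
PAIR_RE = re.compile(r'(\w+):("[^"]*"|.*?)(?=\s+\w+:|\s*$)')


def parse_pairs(content):
    """Split 'key:value key:value' into an ordered list of (key, value)."""
    pairs = []
    for match in PAIR_RE.finditer(content):
        key, value = match.group(1), match.group(2)
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]  # drop the protecting quotes
        pairs.append((key, value))
    return pairs
```

The first pair names the renderer (page, http, plugin, ...) and the remaining pairs are its parameters, so page names containing spaces need no quoting as long as they contain no ` word:` sequence.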
Also rename MoinMoin to Moin:Moin
I do not like it because it feels kinda bloated. {{plugin:BR}} It rather looks like a parser simplification than a syntax simplification because the long prefixed macro/parser calls are not very intuitive.
The name PPM describes the techniques used to write the code, but when talking to someone who just wants to use the wiki and its PPMs I normally use the words extension, plugin or module. If the code is unified we should, in my eyes, use a common name as Firefox and others do. Just call it extension -- ReimarBauer 2005-11-19 07:25:54
- The problem with just calling it "extension" is that this is too generic. It describes the class of this thing, but not what exactly it does. And there are other types of extensions/plugins: actions, themes, maybe soon languages, server adaptors (Request*), authentication plugins, ...
Just another idea for the syntax for invoking a parser or macro call.
{[ImageLink(arguments)]}
or
{[python(arguments) text ]}
Then it is pretty clear that it is a new syntax, and for some time the old syntax could still be used for compatibility. In our case we may not be able to update all wiki servers at the same time. -- ReimarBauer 2006-09-05 10:16:33
How both types can handle arguments in practice
macro with arguments
Since 1.7, macros can be written using the argument parser, e.g.:
- Example
<<HelloWorld(color=red)>>
parser with arguments
Some parsers have parameters, and they can also benefit from the argument parser. The same example as a parser:
# HelloParser example code
from MoinMoin import wikiutil

parser_name = __name__.split('.')[-1]

def macro_settings(color=(u'red', u'blue')):
    return locals()

class Parser:
    """ parser """
    extensions = '*.xyz'
    def __init__(self, raw, request, **kw):
        self.pagename = request.page.page_name
        self.raw = raw
        self.request = request
        self.form = None
        self._ = request.getText

        args = kw.get('format_args', '')
        self.init_settings = False
        # we use a macro definition to initialize the default init parameters
        # if a user enters a wrong parameter the failure is shown by the exception
        try:
            settings = wikiutil.invoke_extension_function(request, macro_settings, args)
            for key, value in settings.items():
                setattr(self, key, value)
            # saves the state of valid input
            self.init_settings = True
        except ValueError, err:
            msg = u"Parser: %s" % err.args[0]
            request.write(wikiutil.escape(msg))

    def render(self, formatter):
        """ renders """
        return self.request.formatter.text("HelloWorld",
                                           style="color:%s" % self.color)

    def format(self, formatter):
        """ parser output """
        # checks if initializing of all attributes in __init__ was done
        if self.init_settings:
            self.request.write(self.render(formatter))
- Example
{{{#!HelloParser color=red }}}
The parser uses invoke_extension_function(request, macro_settings, args) to verify the arguments. With the current definition of a parser we cannot bail out directly during initialization if someone enters an unknown parameter or a wrong value. Therefore format checks the status variable self.init_settings.
How to call a parser as macro
Unlike a parser, a macro can be called within the text flow, e.g. in a table.
Parsers which are known to have a macro as wrapper too:
- latex
- text_x_arnica
This is a helper macro which makes it easy to have a single macro that wraps around parsers which sometimes need to be called within the text flow.
# -*- coding: iso-8859-1 -*-
"""
    MoinMoin - macro Parser

    This macro is used to call Parsers,
    it is just a thin wrapper around it.

    @copyright: 200X by MoinMoin:JohannesBerg
                2008 by MoinMoin:ReimarBauer
    @license: GNU GPL, see COPYING for details.
"""
from MoinMoin import wikiutil

def macro_Parser(macro, parser_name=u'', div_class=u'nonexistent', _kwargs=unicode):
    if not _kwargs:
        _kwargs = {}

    # rebuild the "key=value" argument list expected by the parser
    args = ["%s=%s" % (key, value) for key, value in _kwargs.items()]

    try:
        Parser = wikiutil.importPlugin(macro.request.cfg, 'parser', parser_name, 'Parser')
    except wikiutil.PluginMissingError:
        return macro.formatter.text('Please install "%(parser_name)s"!' % {"parser_name": parser_name})

    the_wanted_parser = Parser("", macro.request, format_args=','.join(args))
    if the_wanted_parser.init_settings:
        return '<div class="%(div_class)s"> %(result)s </div>' % {"div_class": div_class,
                                                                  "result": the_wanted_parser.render(macro.formatter)}
- Example
<<Parser(HelloParser, color=blue)>>
-- ReimarBauer 2008-06-28 20:42:12
Unifying macros, parsers, page format processing instructions
Just to document some more ideas of BastianBlank and me. -- ThomasWaldmann 2009-03-04 23:44:36
Macro:            {{{#!macroname params}}}

Parser (inline):  {{{#!parsername params|content}}}

Parser (block):   {{{#!parsername params
                  content
                  }}}

Full page:        ----------------- start of page (implicitly working like {{{ does)
                  #!parsername params
                  content
                  ----------------- end of page (implicitly working like }}} does)
As you see, it is all about the same thing!
Modifications:
- use something other than curly braces (must allow nesting somehow)?
as we use [[ and {{ already for links and transclusion, maybe use << (1.6+ macro style)
instead of parsername param param param use parsername(param, param, param)?
Even simpler:
for full page define <start-of-page>#! to be equivalent to {{{ and <end-of-page> to be equivalent to }}}
for everything other than a full page, drop #! requirement
- less to type
A parser/macro is often called several times on a page, and usually you don't want to set the same parameters every time, e.g. columns=0,show_text=False for arnica. I think we should have per-page parameters which can be overridden by parser arguments. Parsers/macros should check the page for pragma definitions of their parameters and use those values as defaults, e.g.:
#pragma columns 0
#pragma show_text False
This also means refactoring and migrating pragma definitions that may be duplicated or written differently.
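The precedence described above (built-in defaults, then page pragmas, then the arguments of the individual call) can be sketched as follows. This is a standalone illustration with hypothetical function names; values are kept as strings, and type conversion is left to the argument parser.

```python
def read_pragmas(page_text):
    """Collect '#pragma name value' lines from the head of a page."""
    pragmas = {}
    for line in page_text.splitlines():
        if not line.startswith("#"):
            break  # processing instructions end at the first content line
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0] == "#pragma":
            pragmas[parts[1]] = parts[2]
    return pragmas


def effective_settings(builtin_defaults, page_text, call_arguments):
    """Merge defaults, page pragmas and call arguments, in that order."""
    settings = dict(builtin_defaults)
    # only accept pragmas that name a parameter the extension knows about
    for name, value in read_pragmas(page_text).items():
        if name in settings:
            settings[name] = value
    # arguments given in the markup always win
    settings.update(call_arguments)
    return settings


page = "#pragma columns 0\n#pragma show_text False\n#pragma section-numbers off\nSome text"
result = effective_settings(
    {"columns": "4", "show_text": "True", "width": "100"},
    page,
    {"show_text": "True"},
)
```

Pragmas that the extension does not know (section-numbers here) are ignored, so several extensions can share one page header without interfering with each other.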