Attachment 'LookupPagesAndSort.py'
"""
    MoinMoin - LookupPagesAndSort Macro
    A line-oriented search macro over multiple pages, with sorting

    @copyright: Jonas Smedegaard <dr@jones.dk>
    @license: GPL

    Based heavily on SearchInPagesAndSort
    by Pascal Bauermeister <pascal DOT bauermeister AT hispeed DOT ch>

    Updates:
    * [v0.3.4.3] Jonas Wed Nov 23 15:22:27 CET 2005
      * Ignore empty LookupText values (continue loop rather than break).
      * Adjust a variable name for a smaller SearchInPagesAndSort diff.
      * Allow caching of the page (let's see if it causes any trouble).
      * Correct references to name of script itself.
      * Replace examples with ones that make better sense.
      * New feature: Pages can now be a group-page lookup: Prepend a "+".

    * [v0.3.4.2] Jonas Fri Nov 18 17:27:01 CET 2005
      * Replace SearchString (regexp) with LookupString (dict)
      * Simplify heading_text and keyval loops (always one per page now)
      * Decode UTF-8 input in regexp

    * [v0.3.4.1] Jonas Fri Nov 18 17:03:26 CET 2005
      * Add dict lookup. Syntax: @PN?Definition@
      * Drop NbSubs and MoreSubsText support

    * [v0.3.4] Pascal Sat Mar 5 17:53:08 CET 2005
      * MoinMoin 1.3.x _and_ 1.2.x compatible
      * Added arguments: Format, HeaderFormat and FormatSort

    * [v0.3.3] Pascal
      * Fixed a security hole (eval used for arguments parsing)
      * Added argument: ExcludePages=regex

    * [v0.3.2] Pascal
      * Use StringIO instead of cStringIO, for unicode compatibility

    * [v0.3.1] Pascal Sat Nov 6 16:03:01 CET 2004
      * Added NoText, RawText, NbSubs and MoreSubsText arguments

    * [v0.3.1] Pascal Mon Aug 30 21:27:36 CEST 2004
      * Corrected bug: did not work well with multiple pages hit.
        Bug reported by Craig Johnson.
        It worked in 0.2.x because one bug corrected another one...
      * If args are not a kw list (e.g. old macro form) inserts usage in html
        page (brutal, but we really don't want to support the old form any more)

    * [v0.3.0] Pascal Wed Aug 18 15:39:54 CEST 2004
      * macro arguments are now passed as a list of KEYWORD=VALUE
      * ACL is handled
      * new options: Reverse and NoHeader

    * [v0.2.4] Pascal Mon Jul 19 23:40:54 CEST 2004
      * Comparisons to None use the 'is' and 'is not' operator (nicer)
      * Use get() for dict lookup w/ default value
      * Do not quote args and retry to compile if they are not valid regexes
      * Corrected usage samples in the comment below

    * [v0.2.3] Pascal Sun Jul 18 13:45:46 CEST 2004
      Avoid endless recursion when matching page contains this macro

    * [v0.2.2] Fri Jul 16 14:43:23 CEST 2004
      * Use Request.redirect(). Thanks to
        Craig Johnson <cpjohnson AT edcon DOT co DOT za>
        and Thomas Waldmann <tw DASH public AT g m x DOT d e>.
      * No more unused imports.
      * Catch only expected exceptions.

    * [v0.2.1] Mon Jun 7 11:54:52 CEST 2004
      * options: links, heading
      * works now with MoinMoin Release 1.2 too

    * [v0.1.1] Wed Oct 29 14:48:02 CET 2003
      works with MoinMoin Release 1.1 [Revision 1.173] and Python 2.3.2

    * [v0.1.0] 2003/04/24 10:32:04
      Original version

----

Usage:
  [[ LookupPagesAndSort ]]
  [[ LookupPagesAndSort (KEYWORD=VALUE [, ...] ) ]]

  Look up 'lookuptext' dict definitions in pages matching the 'pages' regex,
  and sort the found lines (=hits) in this order:
    1) substring of the hit matching 'sortkey'; group same matches of
       'sortkey' by a header
    2) substring of the hit matching 'lookuptext'
    3) the hit itself

  If no arguments are given, the usage is inserted in the HTML result.
  Possible keywords:

  Help           = 0, 1, 2        Displays 1:short or 2:full help in the page.
                                  Default: 0 (i.e. no help).

  Pages          = 'PAGES REGEX'  Pages in which the text is sought. If empty
                   or             (default), search in the current page and
                   '+PageGroup'   default 'NoLinks' to 1. If starting with "+",
                                  a single PageGroup page is looked up.
                                  Default: empty (i.e. current page).

  ExcludePages   = 'PAGES REGEX'  Exclude these pages (i.e. remove these pages
                                  from the list collected by 'Pages').
                                  Default: empty (i.e. don't exclude any).

  LookupText     = 'TEXT DICT'    Dict key whose definition is looked up in
                                  the matching pages.
                                  Mandatory!

  DictPage       = 'PAGE NAME'    Dict page in which 'LookupText' is looked up.
                                  If empty, each matching page itself; if
                                  starting with "/", that subpage of each
                                  matching page; otherwise the named page.
                                  Default: empty.

  SortKey        = 'TEXT REGEX'   Criterion to sort matching lines (=hits).
                                  Default: empty (i.e. no sorting).

  Heading        = 'TEXT REGEX'   Follow each hit by the text matching Regex
                                  that precedes the hit in its source page.
                                  Default: empty (i.e. no headings).

  UnassignedText = 'WIKI TEXT'    Header for hits not matching the sort key.
                                  Default: '[unassigned]'.

  Reverse        = 0 or 1         Reverse-sort the hits.
                                  Default: 0 (i.e. forward sort).

  RawText        = 0 or 1         Do not format found text.
                                  Default: 0 (i.e. formatted).

  Format         = 'STRING'       Explicitly format the output using this
                                  string, which can contain wiki formatting
                                  as well as these tokens:
                                    @KT@ : text matching 'SortKey'
                                    @LT@ : text matching 'LookupText'
                                    @FT@ : line of text
                                    @PN@ : page name
                                    @HT@ : heading text
                                    @@   : the '@' character
                                    \\n   : newline (of wiki source text).

                                  Each token can contain a regex acting as
                                  a filter for displaying the value, e.g.:
                                    @FT:{[123]}@ displays the prio smiley

                                  Multiple groups can be defined, in which
                                  case the text matching them will be
                                  displayed, e.g.:
                                    @FT:{[123]}(.*)@ displays text after prio

                                  Default: '' (i.e. auto-formatting).

  HeaderFormat   = 'STRING'       If specified, use this instead of 'Format'
                                  for headers.
                                  Default: '' (i.e. do not display headers).

  FormatSort     = 0 or 1         If 1, sort the output generated by 'Format'
                                  (if 'Reverse' is 1, reverse-sort). If 0,
                                  leave the output sorted by the 'SortKey'
                                  criterion (if specified).
                                  Default: 0 (i.e. no sorting).

  NoHeader       = 0 or 1         Disable showing the headers as subtitles.
                                  Default: 0 (i.e. show headers).

  NoLinks        = 0 or 1         Disable following each hit by a link to its
                                  page.
                                  Default: 0 (i.e. show links) or 1 if
                                  'Pages' is omitted.

  NoPageText     = 'HTML TEXT'    Text displayed if no page matches 'Pages'.
                                  Default: an error message w/ Page regex

  NoText         = 0 or 1         Disables showing the found text.
                                  Default: 0 (i.e. show found text).

  Keywords can also be given in upper or lower case, or abbreviated.
  Example: LookupText, lookuptext, LOOKUPTEXT, lt, LT, Pages, p, etc.

----

Sample 1:

Given a page named 'AnInterestingBook':
    = A rather interesting Book =
    == Bibliographical facts ==
     Title:: A rather interesting Book
     Author:: A. Man
     Publisher:: Cool Publishing Corp.
    == Comments ==
    I really think that this book is worth a read.

    I'd even wanna lend out my copy if needed!

    == Status ==
     Owner:: Jonas Smedegaard
     Availability:: Lend out to Jack the Ripper

...and a page named 'AnotherInterestingBook':
    = Another interesting Book =
    == Bibliographical facts ==
     Title:: Another interesting Book
     Author:: A. Man
     Publisher:: Cool Publishing Corp.
    == Comments ==
    This is the sequel to AnInterestingBook - also worth a read.

    == Status ==
     Owner:: Jonas Smedegaard
     Availability:: Available - call me if interested in lending it

...and a page named 'AnotherBoringBook':
    = A boring Book =
    == Bibliographical facts ==
     Title:: A pretty boring Book
     Author:: Some Fool
     Publisher:: Lousy Publishing Corp.
    == Comments ==
    Don't waste time on this book.

    I was stupid enough to buy it once, but won't even lend it out!

    == Status ==
     Owner:: Jonas Smedegaard
     Availability::

...and the wiki set up to include books as dict pages:
    page_dict_regex = u'[a-z0-9](Book|Dict)$'

...using the macro in a page named 'BookOverview' like this:
    = Known books =
    [[LookupPagesAndSort(pages=".*Book$", lookuptext="Title")]]

    = Book availability =
    [[LookupPagesAndSort(pages=".*Book$", lookuptext="Availability")]]

...will give this output (note: _text_ are links):
    Known books
      * A. Man
        * A rather interesting Book _AnInterestingBook_
        * Another interesting Book _AnotherInterestingBook_
      * Some Fool
        * A pretty boring Book _AnotherBoringBook_

    Book availability
      * Lend out to Jack the Ripper _AnInterestingBook_
      * Available - call me if interested in lending it _AnotherInterestingBook_


Sample 2:

Given a page /MyDict containing:
    == Contact info ==
     FirstName:: Jonas
     FullName:: Jonas "dr. Jones" Smedegaard
     Phone:: +45 40843136
     Email:: dr@jones.dk
    == Photo gallery ==
     PhotoThumbnail:: http://dr.jones.dk/images/me/kp_bricks_thumb.jpg
     PhotoPortrait:: http://dr.jones.dk/images/me/kp_bricks.jpg

...the following macro call in another page:
    [[LookupPagesAndSort(pages="+WikiEditorsGroup", DictPage="/MyDict", LookupText="Email", Format=" * @PN?PhotoThumbnail@ [mailto:@PN?Email@ @PN?FirstName@]\\n")]]

...will produce a list of images and email references for me and all other editors.
"""

# Imports
import re, sys, StringIO, urllib
from string import ascii_lowercase, maketrans
from MoinMoin import config, wikiutil, version
from MoinMoin.Page import Page
from MoinMoin.parser import wiki

before_1_3 = version.release < '1.3'

#Dependencies = ["time"] # macro cannot be cached

_recursions = 0
FAKETRANS = maketrans ("","")


class _Error (Exception):
    pass


def execute (macro, text, args_re=None):

    global _recursions
    if _recursions: return ''

    _recursions += 1
    try:
        try: res = _execute (macro, text)
        except _Error, msg:
            return """
            <p><strong class="error">
            Error: macro LookupPagesAndSort: %s</strong> </p>
            """ % msg
    finally:
        # always release the recursion guard, even on unexpected exceptions
        _recursions -= 1
    return res


def _delparam (keyword, params):
    value = params.pop (keyword)
    # only byte strings need decoding; numeric values (e.g. Help=1) pass through
    if isinstance (value, str): value = value.decode ("UTF-8")
    return value


def _param_get (params, spec, default):

    """Returns the value for a parameter, if specified with one of
    several acceptable keyword names, or returns its default value if
    it is missing from the macro call. If the parameter is specified,
    it is removed from the list, so that remaining params can be
    signalled as unknown"""

    # param name is literal ?
    if params.has_key (spec): return _delparam (spec, params)

    # param name is all lower or all upper ?
    lspec = spec.lower ()
    if params.has_key (lspec): return _delparam (lspec, params)
    uspec = spec.upper ()
    if params.has_key (uspec): return _delparam (uspec, params)

    # param name is abbreviated ?
    cspec = spec [0].upper () + spec [1:] # capitalize 1st letter
    cspec = cspec.translate (FAKETRANS, ascii_lowercase) # drop lowercase letters
    if params.has_key (cspec): return _delparam (cspec, params)
    cspec = cspec.lower ()
    if params.has_key (cspec): return _delparam (cspec, params)

    # nope: return default value
    return default


def _usage (full = False):

    """Returns the interesting part of the module's doc"""

    if full: return __doc__

    lines = __doc__.replace ('\\n', '\\\\n'). splitlines ()
    start = 0
    end = len (lines)
    for i in range (end):
        if lines [i].strip ().lower () == "usage:":
            start = i
            break
    for i in range (start, end):
        if lines [i].startswith ('--'):
            end = i
            break
    return '\n'.join (lines [start:end])


def _re_compile (text, name):
    try:
        return re.compile (text, re.IGNORECASE)
    except Exception, msg:
        raise _Error ("%s for regex argument %s: '%s'" % (msg, name, text))


last_request_h = None
last_pages_list = []

def _get_all_pages (request):
    global last_request_h
    global last_pages_list
    request_h = hash (request)
    if request_h != last_request_h:
        if before_1_3: all_pages = wikiutil.getPageList (config.text_dir)
        else: all_pages = request.rootpage.getPageList()
        last_request_h = request_h
        last_pages_list = all_pages
    return last_pages_list


# The "raison d'etre" of this module
def _execute (macro, text):

    result = ""

    # new args syntax
    try:
        params = eval ("(lambda **opts: opts)(%s)" % text,
                       {'__builtins__': []}, {})
    except Exception, msg:
        raise _Error ("""<pre>malformed arguments list:
        %s<br>cause:
        %s
        </pre>
        <br> usage:
        <pre>%s</pre>
        """ % (text, msg, _usage () ) )

    arg_text = _param_get (params, 'LookupText', None)
    arg_pages = _param_get (params, 'Pages', '')
    arg_excl_pages = _param_get (params, 'ExcludePages', '')
    arg_dict = _param_get (params, 'DictPage', '')
    arg_key = _param_get (params, 'SortKey', None)

    opt_heading = _param_get (params, 'Heading', None)
    opt_unassigned_text = _param_get (params, 'UnassignedText',
                                      "[unassigned]")
    opt_reverse = _param_get (params, 'Reverse', False)
    opt_rawtext = _param_get (params, 'RawText', False)

    opt_format = _param_get (params, 'Format', '')
    opt_headerformat = _param_get (params, 'HeaderFormat', '')
    opt_formatsort = _param_get (params, 'FormatSort', 0)

    def_nolinks = (1,0) [len (arg_pages)>0]
    opt_nolinks = _param_get (params, 'NoLinks', def_nolinks)
    opt_noheader = _param_get (params, 'NoHeader', False)
    opt_notext = _param_get (params, 'NoText', False)
    opt_nopage = _param_get (params, 'NoPageText', None)
    opt_help = _param_get (params, 'Help', 0)

    # help ?
    if opt_help:
        return """
        <p>
        Macro LookupPagesAndSort usage:
        <pre>%s</pre></p>
        """ % _usage (opt_help==2)

    # check the args a little bit
    if len (params):
        raise _Error ("""unknown argument(s): %s
        <br> usage:
        <pre>%s</pre>
        """ % (repr (params.keys ()), _usage () ) )

    if arg_text is None:
        raise _Error ("missing 'lookuptext' argument")

    # empty page means this page; subpages are also handled
    if len (arg_pages) == 0 or arg_pages.startswith ('/'):
        arg_pages = macro.formatter.page.page_name + arg_pages

    # get a list of pages matching the PageRegex
    all_pages = _get_all_pages (macro.request)
    if arg_pages [0]=="+":
        hits = macro.request.dicts.members(arg_pages [1:])
    else:
        pages_re = _re_compile (arg_pages, 'Pages')
        hits = filter (pages_re.search, all_pages)
    if arg_excl_pages:
        excl_pages_re = _re_compile (arg_excl_pages, 'ExcludePages')
        hits = filter (lambda hit: not excl_pages_re.search (hit), hits)

    if before_1_3:
        # check ACL now (since we may end up with no pages)
        if config.acl_enabled:
            me = macro.request.user.name
            def _check_page (page_name):
                page = Page (page_name) # too bad we must instantiate...
                return page.getACL ().may (macro.request, me, "read")
            hits = filter (_check_page, hits)

    # sort pages, check if we have pages
    if len (hits) == 0:
        if opt_nopage: return "%s" % opt_nopage
        else:
            raise _Error ("no page matching '%s'!" % arg_pages)
    else: hits.sort ()

    if arg_key is not None:
        key_re = _re_compile (arg_key, 'SortKey')

    if opt_heading is not None:
        heading_re = _re_compile (opt_heading, 'Heading')

    # we will collect matching lines in each matching page
    all_matches = []

    # treat each found page
    for page_name in hits:
        heading_text = ""

        # Set dict page to use for lookups
        if len (arg_dict) == 0 or arg_dict.startswith ('/'):
            dict_name = page_name + arg_dict
        else:
            dict_name = arg_dict

        # lookup text
        lookuptext = macro.request.dicts.dict(dict_name).get(arg_text,'')
        if not lookuptext: continue

        # text is found; now search for heading
        if opt_heading is not None:
            heading_match = heading_re.search (lookuptext)
            if heading_match:
                heading_text = heading_match.group (0)

        # find the sort key
        keyval = ""
        if arg_key is not None:
            keymatch = key_re.search (lookuptext)
            if keymatch:
                keyval = keymatch.group (0)
            else:
                keyval = opt_unassigned_text

        # store info
        item = []
        item.append (keyval)       # key text
        item.append (lookuptext)   # lookup text
        item.append (page_name)    # page name
        item.append (dict_name)    # dict name
        item.append (heading_text) # heading
        all_matches.append (item)

    # all pages handled

    # prepare some formatting text
    bullet_list_open = macro.formatter.bullet_list (1)
    bullet_list_close = macro.formatter.bullet_list (0)
    listitem_open = macro.formatter.listitem (1)
    listitem_close = macro.formatter.listitem (0)

    # now sort and format records
    if not opt_notext: all_matches.sort ()
    if opt_reverse: all_matches.reverse ()

    # explicitly-formatted output
    if opt_format:
        block = ""
        last_keytext = None
        rx = re.compile (r'([^@]*?)(@[^@]*?@)')
        pairs = re.findall (rx, opt_format+"@-@")
        if opt_headerformat: hpairs = re.findall (rx, opt_headerformat+"@-@")
        else: hpairs = None
        rx2d = {}
        for item in all_matches:
            keytext, text, pagename, dict_name, heading_text = item
            if keytext == last_keytext: plist = (pairs,)
            elif hpairs: plist = (hpairs, pairs)
            else: plist = (pairs,)
            last_keytext = keytext
            for p in plist:
                for txt, token in p:
                    txt = txt.replace ("\\n", "\n")
                    if not token: continue
                    token = token.strip ("@")
                    block += txt
                    rx2 = None
                    if len (token)>2 and token [2]=="?":
                        #FIXME: only cut out dict name here, and lookup after (non-hardcoded!) pagename is expanded
                        token = macro.request.dicts.dict(dict_name).get(token [3:],'')
                    if len (token)>2 and token [2]==":":
                        token, rx2 = token [:2], token [3:]
                        if not rx2d.has_key (rx2): rx2d [rx2] = \
                            re.compile (rx2)
                        rx2 = rx2d [rx2]
                    token = token.replace ("\\n", "\n")
                    d = { "KT": keytext, "LT": text,
                          "PN": pagename, "HT": heading_text,
                          "": "@",
                          "-": "",
                          }
                    if rx2:
                        tx = d.get (token, None)
                        if tx:
                            tx = map ("".join, re.findall (rx2, tx)) [0]
                        else: tx = token
                        block += tx
                    else:
                        block += d.get (token, token)
        if opt_formatsort:
            lines = block.split ("\n")
            lines.sort ()
            if opt_reverse: lines.reverse ()
            block = "\n".join (lines)
        result += "\n%s\n" % _format (block, macro.request, macro.formatter)

    # auto-formatted output: treat records for output
    else:
        head_count = 0
        result = result+"\n" + bullet_list_open
        keyval = ""
        last_pagename = ""

        for item in all_matches:
            keytext, text, pagename, dict_name, heading_text = item

            if opt_notext:
                text_fmtted = ""
                if last_pagename == pagename: continue
                else: last_pagename = pagename
            elif opt_rawtext:
                text_fmtted = wikiutil.escape (text)
            else:
                # parse the text (in wiki source format) and make HTML,
                # after diverting sys.stdout to a string
                text_fmtted = _format (text, macro.request, macro.formatter)
                text_fmtted = text_fmtted.strip (' ') # preserve newlines

                # empty text => drop this item
                if len (text_fmtted)==0: continue

            # insert heading (only if not yet done)
            if not opt_noheader \
               and arg_key is not None \
               and keytext != keyval:
                # this is a new heading
                keyval = keytext
                if head_count:
                    result = result+"\n " + bullet_list_close
                    result = result+"\n " + listitem_close
                head_count = head_count +1
                result = result+"\n " + listitem_open
                result = result+ _format (keyval,
                                          macro.request, macro.formatter)
                result = result+"\n " + bullet_list_open

            # correct the text format (berk)
            if text_fmtted.startswith ("\n<p>"):
                text_fmtted = text_fmtted [4:]
            if text_fmtted.endswith ("</p>\n"):
                text_fmtted = text_fmtted [:-5]
                text_trailer = "\n</p>\n"
            else: text_trailer = ""

            # insert formatted text
            result = result+"\n " + listitem_open
            result = result + text_fmtted
            if not opt_nolinks:
                result = result + " <font size=-1>"
                if arg_text:
                    if before_1_3:
                        pageurl = '%s?action=highlight&value=%s' % (
                            pagename,
                            urllib.quote_plus (re.escape (text)))
                    else:
                        pageurl = '%s?highlight=%s' % (
                            pagename,
                            urllib.quote_plus (re.escape (text)))

                else: pageurl = wikiutil.quoteWikiname (pagename)
                link_text = wikiutil.link_tag (macro.request,
                                               pageurl, pagename)

                result = result + link_text
                result = result + "</font>"
            if opt_heading is not None:
                result = result + " <font size=-1>"
                result = result + heading_text
                result = result + "</font>"

            result = result + text_trailer + "\n " + listitem_close

    # all items done, close (hopefully) gracefully
    if not opt_format:
        if head_count:
            result = result+"\n " + listitem_close
            result = result+"\n " + bullet_list_close
        if not opt_noheader and arg_key is not None:
            result = result+"\n " + listitem_close
        result = result+"\n" + bullet_list_close

    # done
    return result

def _format (src_text, request, formatter):
    # parse the text (in wiki source format) and make HTML,
    # after diverting sys.stdout to a string
    str_out = StringIO.StringIO ()  # create str to collect output
    request.redirect (str_out)      # divert output to that string
    # parse this line
    wiki.Parser (src_text, request).format (formatter)
    request.redirect ()             # restore output
    return str_out.getvalue ()      # return what was generated
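
The keyword matching that `_param_get` performs in the listing above (literal, all-lowercase, all-uppercase, or abbreviated names) may be easier to follow in isolation. The following is a minimal standalone sketch, not part of the macro: `resolve_keyword` is a hypothetical name introduced here for illustration, the UTF-8 decoding step is left out, and the input is assumed to be a plain `dict` of already-parsed macro arguments.

```python
def resolve_keyword(params, spec, default=None):
    """Pop and return the value given under any accepted spelling of `spec`.

    Hypothetical helper for illustration only; it mirrors the lookup order
    of `_param_get` but is not the macro's own code.
    """
    # The abbreviation keeps every character that is not an ASCII lowercase
    # letter, after capitalizing the first one: 'LookupText' -> 'LT'.
    capitalized = spec[0].upper() + spec[1:]
    abbreviation = "".join([ch for ch in capitalized if not ("a" <= ch <= "z")])

    # Same order as the macro: literal, lower, upper, abbreviated, abbreviated lower.
    candidates = [spec, spec.lower(), spec.upper(),
                  abbreviation, abbreviation.lower()]
    for name in candidates:
        if name in params:
            # Remove the match so that leftovers can be reported as unknown.
            return params.pop(name)
    return default


# Example call: 'lt' and 'P' resolve, 'bogus' stays behind as an unknown argument.
params = {"lt": "Title", "P": ".*Book$", "bogus": 1}
title_key = resolve_keyword(params, "LookupText")
pages_regex = resolve_keyword(params, "Pages", "")
sort_key = resolve_keyword(params, "SortKey")
```

Because each match is removed from `params`, whatever remains after all known keywords have been resolved can be treated as unknown arguments, which is how the macro builds its "unknown argument(s)" error.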