The SearchInPagesAndSort macro

Download: SearchInPagesAndSort.py
Instructions: see MacroMarket/SearchInPagesAndSort
The macro:
"""
MoinMoin - SearchInPagesAndSort Macro
A line-oriented search macro over multiple pages, with sorting

Pascal Bauermeister <pascal DOT bauermeister AT hispeed DOT ch>

Original version:
* [v0.1.0] 2003/04/24 10:32:04

Updates:

* [v0.2.4] Pascal Mon Jul 19 23:40:54 CEST 2004
  - Comparisons to None use the 'is' and 'is not' operators (nicer)
  - Use get() for dict lookup w/ default value
  - Do not quote args and retry to compile if they are not valid regexes
  - Corrected usage samples in the comment below

* [v0.2.3] Pascal Sun Jul 18 13:45:46 CEST 2004
  Avoid endless recursion when matching page contains this macro

* [v0.2.2] Fri Jul 16 14:43:23 CEST 2004
  - Use Request.redirect(). Thanks to Craig Johnson <cpjohnson AT edcon DOT
    co DOT za>
    and Thomas Waldmann <tw DASH public AT g m x DOT d e>.
  - No more unused imports.
  - Catch only expected exceptions.

* [v0.2.1] Mon Jun  7 11:54:52 CEST 2004
  - options: links, heading
  - works now with MoinMoin Release 1.2 too

* [v0.1.1] Wed Oct 29 14:48:02 CET 2003
  works with MoinMoin Release 1.1 [Revision 1.173] and Python 2.3.2

-------------------------------------------------------------------------------

Usage:
  [[ SearchInPagesAndSort (PageRegex, TextRegex [, SortKey [, OPTIONS]] ) ]]

Each argument is a quoted string. Search for TextRegex in pages matching
PageRegex, and sort on
  1) SortKey
  2) TextRegex
  3) found line

Options: (they are all contained in a comma-separated list of name=value pairs)
  links=0 (or 1)   Turns the link to the source page after each hit off or
                   on; default is on.

  heading=Regex    After each hit, insert the string matching Regex that
                   precedes the hit in the source page.

  unassigned=text  Header for hits not matching the sort key. Default:
                   '[unassigned]'


-------------------------------------------------------------------------------

Sample 1:

  Given a page named 'ProjectA':
        1. Action Items
          1. [Alan] {2} to launch this task
          1. [Alan] {1} to do this urgent thing
          1. [Ben][Clara] {3} do this as background task

        1. Deadlines
          1. 2003-03-12 <!> [Alan][Clara]: deliver 1st version of the Release X

  ...and a page named 'ProjectB':
        * [Denise] {2} Development of task Xyz
        * [Eric] {1} Tests of feature F
        * [Eric] (./) Tests of feature E

  ...using the macro in a page named 'ActionItems' like this:
        = ActionItems =
        [[SearchInPagesAndSort("Project.*","{[123]}","\[[A-Za-z_]*\]")]]

        = Deadlines =
        [[SearchInPagesAndSort("Project.*","<!>")]]

        = Completed tasks =
        [[SearchInPagesAndSort("Project.*","(\./)","\[[A-Za-z_]*\]")]]

  ...will give this output (note: _text_ are links):
        ActionItems
          * [Alan]
            * [Alan] {1} to do this urgent thing _ProjectA_
            * [Alan] {2} to launch this task _ProjectA_
          * [Ben]
            * [Ben][Clara] {3} do this as background task _ProjectA_
          * [Clara]
            * [Ben][Clara] {3} do this as background task _ProjectA_
          * [Denise]
            * [Denise] {2} Development of task Xyz _ProjectB_
          * [Eric]
            * [Eric] {1} Tests of feature F _ProjectB_

        Deadlines
          * 2003-03-12 <!> [Alan][Clara]: deliver 1st version of the Release X
            _ProjectA_

        Completed tasks
          * [Eric]
            * [Eric] (./) Tests of feature E _ProjectB_


Sample 2:

  Given a page named 'MyProjectStatus' containing:
        == Tasks for (ABC) ==
         * {1} (due:2003-12-16) [Mike] Do this
        == Tasks for (XYZ) ==
         * {2} (due:2003-12-17) [John_Doe][Mike] Do that

  ...the following macro call:
        [[SearchInPagesAndSort("MyProjectStatus","{[123]}","\[[A-Za-z_ -]*\]","links=0,heading=\([A-Z]*\)")]]

  ...will produce:
        * [John_Doe]
          * {2} (due:2003-12-17) [John_Doe][Mike] Do that (XYZ)

        * [Mike]
          * {1} (due:2003-12-16) [Mike] Do this (ABC)
          * {2} (due:2003-12-17) [John_Doe][Mike] Do that (XYZ)
"""

# Imports
import re, cStringIO
from MoinMoin import config, wikiutil
from MoinMoin.Page import Page
from MoinMoin.parser.wiki import Parser


# Constants: each macro argument is a string in single or double quotes
_arg_page = r'(?P<hquote1>[\'"])(?P<hpage>.+?)(?P=hquote1)'
_arg_text = r'(?P<hquote2>[\'"])(?P<htext>.+?)(?P=hquote2)'
_arg_key  = r'(?P<hquote3>[\'"])(?P<hkey>.+?)(?P=hquote3)'
_arg_opts = r'(?P<hquote4>[\'"])(?P<hopts>.+?)(?P=hquote4)'
_args_re  = re.compile(r'^(%s( *, *%s( *, *%s( *, *%s)?)?)?)?$' %
                       (_arg_page, _arg_text, _arg_key, _arg_opts))

recursions = 0

def execute(macro, text, args_re=_args_re):

    # Formatting a hit may execute this macro again (when a matching page
    # contains it); return nothing in that case to avoid endless recursion.
    global recursions
    if recursions:
        return ''

    recursions += 1
    try:
        return _execute(macro, text, args_re)
    finally:
        recursions -= 1


# The "raison d'etre" of this module
def _execute(macro, text, args_re=_args_re):

    # parse and check arguments; PageRegex and TextRegex are mandatory
    args = None
    if text is not None:
        args = args_re.match(text)
    if not args or args.group('htext') is None:
        return ( '<p><strong class="error">'
                 'Invalid SearchInPagesAndSort arguments "%s"!'
                 '</strong></p>' ) % text

    text = args.group('htext')
    pages = args.group('hpage')
    key = args.group('hkey')
    opts = args.group('hopts')

    # get a list of pages matching the PageRegex
    pages_re = re.compile(pages, re.IGNORECASE)
    all_pages = wikiutil.getPageList(config.text_dir)
    hits = filter(pages_re.search, all_pages)
    hits.sort()

    if len(hits) == 0:
        return (
            '<p><strong class="error">'
            'No page matching "%s"!</strong></p>' % pages )

    # parse options: a comma-separated list of name=value pairs
    options = {}
    if opts is not None:
        for element in opts.split(','):
            # split on the first '=' only, so that a value may contain '='
            pair = element.split('=', 1)
            if len(pair) == 2:
                options[ pair[0].strip() ] = pair[1]

    # look up each option, falling back to its default; 'links' is read
    # with int() rather than eval() because the value comes from page text
    opt_links = int(options.get('links', '1'))
    opt_heading = options.get('heading', None)
    opt_unassigned_text = options.get('unassigned', "[unassigned]")

    # compile all regexes
    text_re = re.compile(text, re.IGNORECASE)

    if key is not None:
        key_re = re.compile(key, re.IGNORECASE)

    if opt_heading is not None:
        heading_re = re.compile(opt_heading, re.IGNORECASE)

    # we will collect matching lines in each matching page
    all_matches = []

    # treat each found page
    for page_name in hits:
        body = Page(page_name).get_raw_body()
        pos = 0
        last_start = -1
        last_end = -1
        heading_text = ""
        while 1:
            keep_line = 1

            # search text
            match = text_re.search(body, pos)
            if not match: break

            # text is found; now search for heading
            if opt_heading is not None:
                heading_pos = pos
                heading_match = True
                # keep the nearest heading preceding the found text
                while heading_match:
                    heading_match = heading_re.search(body, heading_pos)
                    if heading_match and heading_match.start() < match.start():
                        heading_text = heading_match.group(0)
                        heading_pos = heading_match.end()
                    else: heading_match = False

            # continue the next search after the found text
            pos = match.end()+1

            # cut at start of line (0 if the hit is on the first line)
            start_pos = body.rfind("\n", 0, match.start()) + 1

            # cut at end of line (end of body if there is no final newline)
            end_pos = body.find("\n", match.end())
            if end_pos < 0:
                end_pos = len(body)

            # extract line
            line = body[start_pos:end_pos].strip()

            # skip this record if it is on the same line as the previous one
            if start_pos == last_start or end_pos == last_end: keep_line = 0

            # skip this record if it is a comment
            elif line.startswith("##"): keep_line = 0

            # remove possible list item leaders
            if keep_line:
                for heading in ["*", "1.", "a.", "A.", "i.", "I."]:
                    if line.startswith(heading):
                        line = line.replace(heading, "", 1)
                line = line.strip()
                if len(line)==0: keep_line = 0

            # handle this record
            if keep_line:

                # find the sort key(s); store one record per key found,
                # or a single record if there is no key at all
                nbmatches = 0
                keypos = 0
                found = 0
                while 1:
                    if key is None:
                        keyval = ""
                    else:
                        keymatch = key_re.search(line, keypos)
                        if keymatch:
                            keyval = line[keymatch.start():keymatch.end()]
                            keypos = keymatch.end()
                            nbmatches = nbmatches + 1
                            found = 1
                        else:
                            if nbmatches>0: break
                            keyval = opt_unassigned_text

                    # store info
                    item = []
                    item.append(keyval)                          # key text
                    item.append(body[match.start():match.end()]) # search text
                    item.append(line)                            # line text
                    item.append(page_name)                       # page name
                    item.append(heading_text)                    # heading
                    all_matches.append(item)
                    if found == 0: break

                last_start = start_pos
                last_end = end_pos

    # sort and format records (once, after all pages have been searched)
    bullet_list_open = macro.formatter.bullet_list(1)
    bullet_list_close = macro.formatter.bullet_list(0)
    listitem_open = macro.formatter.listitem(1)
    listitem_close = macro.formatter.listitem(0)

    all_matches.sort()
    result = ""
    result = result+"\n" + bullet_list_open
    keyval = ""
    head_count = 0

    # treat records for output
    for item in all_matches:
        text = item[2]
        pagename = item[3]
        heading_text = item[4]

        # parse the text (in wiki source format) and make HTML,
        # after diverting the request output to a string
        str_out = cStringIO.StringIO()   # create str to collect output
        macro.request.redirect(str_out)  # divert output to that string
        # parse this line (this will also execute macros !) :
        Parser(text, macro.request).format(macro.formatter)
        macro.request.redirect()         # restore output
        text_fmtted = str_out.getvalue() # get what was generated
        text_fmtted = text_fmtted.strip(' ') # preserve newlines

        # empty text => drop this item
        if len(text_fmtted)==0: continue

        # insert heading (only if not yet done)
        if key is not None and item[0] != keyval:
            # this is a new heading
            keyval = item[0]
            if head_count:
                result = result+"\n    " + bullet_list_close
                result = result+"\n  " + listitem_close
            head_count = head_count +1
            result = result+"\n  " + listitem_open
            result = result+ keyval
            result = result+"\n    " + bullet_list_open

        # strip the paragraph markup added by the parser (berk)
        if text_fmtted.startswith("\n<p>"):
            text_fmtted = text_fmtted[4:]
        if text_fmtted.endswith("</p>\n"):
            text_fmtted = text_fmtted[:-5]
            text_trailer = "\n</p>\n"
        else: text_trailer = ""

        # insert text
        result = result+"\n      " + listitem_open
        result = result + text_fmtted
        if opt_links:
            result = result + "&nbsp;&nbsp;&nbsp;<font size=-1>"
            try: # try MoinMoin 1.1 API
                link_text = wikiutil.link_tag(pagename)
            except TypeError: # try MoinMoin 1.2 API
                link_text = wikiutil.link_tag(macro.request, pagename)
            result = result + link_text
            result = result + "</font>"
        if opt_heading is not None:
            result = result + "&nbsp;&nbsp;&nbsp;<font size=-1>"
            result = result + heading_text
            result = result + "</font>"
        result = result + text_trailer + "\n      " + listitem_close

    # all items done; close the last key's sub-list and item, then the list
    if head_count:
        result = result+"\n    " + bullet_list_close
        result = result+"\n  " + listitem_close
    result = result+"\n" + bullet_list_close

    # done
    return result
SearchInPagesAndSort.py
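
The core of the macro (cut the line around each hit, attach every sort key found on it, then sort) can also be sketched in isolation. This is a simplified Python 3 illustration under stated assumptions, not the macro itself: pages are passed in as a plain dict instead of being read through MoinMoin, list item leaders are not stripped, headings are ignored, and the names matching_lines, records and pages are hypothetical.

```python
import re
from itertools import groupby


def matching_lines(body, text_re):
    """Yield (matched_text, line) once for each line of body with a hit."""
    last_start = -1
    for match in text_re.finditer(body):
        # Start of line: just after the previous newline, or 0.
        start = body.rfind('\n', 0, match.start()) + 1
        # End of line: the next newline, or the end of the body.
        end = body.find('\n', match.end())
        if end < 0:
            end = len(body)
        if start == last_start:        # further hit on the same line
            continue
        last_start = start
        line = body[start:end].strip()
        if line and not line.startswith('##'):   # skip wiki comments
            yield match.group(0), line


def records(pages, text_re, key_re, unassigned='[unassigned]'):
    """Return sorted (key, matched_text, line, page_name) tuples.

    A line carrying several keys yields one record per key; a line with
    no key is filed under the 'unassigned' label.
    """
    found = []
    for page_name, body in pages.items():
        for hit, line in matching_lines(body, text_re):
            keys = [m.group(0) for m in key_re.finditer(line)] or [unassigned]
            found.extend((key, hit, line, page_name) for key in keys)
    return sorted(found)


pages = {
    'ProjectA': '1. [Alan] {2} to launch this task\n'
                '1. [Ben][Clara] {3} do this as background task\n',
    'ProjectB': '* [Eric] {1} Tests of feature F',
}
text_re = re.compile(r'\{[123]\}')
key_re = re.compile(r'\[[A-Za-z_]*\]')
recs = records(pages, text_re, key_re)

# Print one heading per key, with its lines and source pages below it.
for key, group in groupby(recs, key=lambda record: record[0]):
    print(key)
    for _, _, line, page_name in group:
        print('   ', line, '(%s)' % page_name)
```

Sorting whole tuples gives the same precedence the macro documents: sort key first, then the matched text, then the line.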
Comments

(write your feedback below, in the form:
@DATE@ / Name and e-mail address
- comments )
MoinMoin: macro/SearchInPagesAndSort.py (last edited 2007-10-29 19:21:27 by localhost)