The SearchInPagesAndSort macro
Download: SearchInPagesAndSort.py
Instructions: see MacroMarket/SearchInPagesAndSort
The macro:
- SearchInPagesAndSort.py
"""
MoinMoin - SearchInPagesAndSort Macro
A line-oriented search macro over multiple pages, with sorting

Pascal Bauermeister <pascal DOT bauermeister AT hispeed DOT ch>

Original version:
* [v0.1.0] 2003/04/24 10:32:04

Updates:

* [v0.2.4] Pascal Mon Jul 19 23:40:54 CEST 2004
  - Comparisons to None use the 'is' and 'is not' operator (nicer)
  - Use get() for dict lookup w/ default value
  - Do not quote args and retry to compile if they are not valid regexes
  - Corrected usage samples in the comment below

* [v0.2.3] Pascal Sun Jul 18 13:45:46 CEST 2004
  Avoid endless recursion when matching page contains this macro

* [v0.2.2] Fri Jul 16 14:43:23 CEST 2004
  - Use Request.redirect(). Thanks to Craig Johnson <cpjohnson AT edcon DOT
    co DOT za>
    and Thomas Waldmann <tw DASH public AT g m x DOT d e>.
  - No more unused imports.
  - Catch only expected exceptions.

* [v0.2.1] Mon Jun 7 11:54:52 CEST 2004
  - options: links, heading
  - works now with MoinMoin Release 1.2 too

* [v0.1.1] Wed Oct 29 14:48:02 CET 2003
  works with MoinMoin Release 1.1 [Revision 1.173] and Python 2.3.2

-------------------------------------------------------------------------------

Usage:
  [[ SearchInPagesAndSort (PageRegex, TextRegex, SortKey [,OPTIONS] ) ]]

  Search for TextRegex in pages matching PageRegex, and sort on
  1) SortKey
  2) TextRegex
  3) found line

Options: (they are all contained in a comma-separated list of name=value pairs)
  links=0 (or 1)   Turns links to page for each hit off or on; default is on.

  heading=Regex    After each hit, insert the string matching Regex that
                   precedes the hit in the source page.

  unassigned=text  Header for hits not matching the sort key. Default:
                   '[unassigned]'


-------------------------------------------------------------------------------

Sample 1:

Given a page named 'ProjectA':
  1. Action Items
    1. [Alan] {2} to launch this task
    1. [Alan] {1} to do this urgent thing
    1. [Ben][Clara] {3} do this as background task

  1. Deadlines
    1. 2003-03-12 <!> [Alan][Clara]: deliver 1st version of the Release X

...and a page named 'ProjectB':
  * [Denise] {2} Development of task Xyz
  * [Eric] {1} Tests of feature F
  * [Eric] (./) Tests of feature E

...using the macro in a page named 'ActionItems' like this:
  = ActionItems =
  [[SearchInPagesAndSort("Project.*","{[123]}","\[[A-Za-z_]*\]")]]

  = Deadlines =
  [[SearchInPagesAndSort("Project.*","<!>")]]

  = Completed tasks =
  [[SearchInPagesAndSort("Project.*","(\./)","\[[A-Za-z_]*\]")]]

...will give this output (note: _text_ are links):
  ActionItems
    * [Alan]
      * [Alan] {1} to do this urgent thing _ProjectA_
      * [Alan] {2} to launch this task _ProjectA_
    * [Denise]
      * [Denise] {2} Development of task Xyz _ProjectB_
    * [Ben]
      * [Ben][Clara] {3} do this as background task _ProjectA_
    * [Eric]
      * [Eric] {1} Tests of feature F _ProjectB_
    * [Clara]
      * [Ben][Clara] {3} do this as background task _ProjectA_

  Deadlines
    * 2003-03-12 <!> [Alan][Clara]: deliver 1st version of the Release X
      _ProjectA_

  Completed tasks
    * [Eric]
      * [Eric] (./) Tests of feature E _ProjectB_


Sample 2:

Given a page containing:
  == Tasks for (ABC) ==
  * {1} (due:2003-12-16) [Mike] Do this
  == Tasks for (XYZ) ==
  * {2} (due:2003-12-17) [John_Doe][Mike] Do that

...the following macro call:
  [[SearchInPagesAndSort("MyProjectStatus","{[123]}","\[[A-Za-z_ -]*\]", "links=0,heading=\([ab]*[0-9][0-9][0-9]\)")]]

...will produce:
  * [John_Doe]
    * {2} (due:2003-12-17) [John_Doe][Mike] Do that (XYZ)

  * [Mike]
    * {1} (due:2003-12-16) [Mike] Do this (ABC)
    * {2} (due:2003-12-17) [John_Doe][Mike] Do that (XYZ)
"""

# Imports
import re, sys, cStringIO
from MoinMoin import config, wikiutil
from MoinMoin.Page import Page
from MoinMoin.parser.wiki import Parser


# Constants
_arg_page = r'(?P<hquote1>[\'"])(?P<hpage>.+?)(?P=hquote1)'
_arg_text = r'(?P<hquote2>[\'"])(?P<htext>.+?)(?P=hquote2)'
_arg_key = r'(?P<hquote3>[\'"])(?P<hkey>.+?)(?P=hquote3)'
_arg_opts = r'(?P<hquote4>[\'"])(?P<hopts>.+?)(?P=hquote4)'
_args_re = re.compile(r'^(%s( *, *%s( *, *%s( *, *%s)?)?)?)?$' %
                      (_arg_page, _arg_text, _arg_key, _arg_opts))

recursions = 0

def execute(macro, text, args_re=_args_re):

    global recursions
    if recursions: return '' ## 'SearchInPagesAndSort(%s)' % text

    recursions += 1
    try: return _execute(macro, text, args_re)
    finally: recursions -=1


# The "raison d'etre" of this module
def _execute(macro, text, args_re=_args_re):

    # parse and check arguments
    # (test for None first: matching a regex against None raises TypeError)
    args = None
    if text is not None:
        args = args_re.match(text)
    if not args:
        return ( '<p><strong class="error">Invalid SearchInPages arguments' +
                 ' "%s"!</strong></p>' ) % text

    text = args.group('htext')
    pages = args.group('hpage')
    key = args.group('hkey')
    opts = args.group('hopts')

    # get a list of pages matching the PageRegex
    pages_re = re.compile(pages, re.IGNORECASE)
    all_pages = wikiutil.getPageList(config.text_dir)
    hits = filter(pages_re.search, all_pages)
    hits.sort()

    if len(hits) == 0:
        return (
            '<p><strong class="error">'
            'No page matching "%s"!</strong></p>' % pages )

    # parse options
    options = {}
    if opts is not None:
        for element in opts.split(','):
            pair = element.split('=')
            options[ pair[0] ] = pair[1]

    # All these try except could be reduced to a simple get:
    opt_links = eval(options.get('links', '1'))
    opt_heading = options.get('heading', None)
    opt_unassigned_text = options.get('unassigned', "[unassigned]")

    # compile all regex
    text_re = re.compile(text, re.IGNORECASE)

    if key is not None:
        key_re = re.compile(key, re.IGNORECASE)

    if opt_heading is not None:
        heading_re = re.compile(opt_heading, re.IGNORECASE)

    # we will collect matching lines in each matching page
    all_matches = []

    # treat each found page
    for page_name in hits:
        body = Page(page_name).get_raw_body()
        pos = 0
        last_start = -1
        last_end = -1
        heading_text = ""
        while 1:
            keep_line = 1

            # search text
            match = text_re.search(body, pos)
            if not match: break

            # text is found; now search for heading
            if opt_heading is not None:
                heading_pos = pos
                heading_match = True
                # keep the nearest heading to the found text
                while heading_match:
                    heading_match = heading_re.search(body, heading_pos)
                    if heading_match and heading_match.start() < match.start():
                        heading_text = heading_match.group(0)
                        heading_pos = heading_match.end()
                    else: heading_match = False

            # point to found text
            pos = match.end()+1

            # cut before start of line
            start_pos = match.start()
            rev = 0
            while body[start_pos] != '\n' and start_pos:
                start_pos = start_pos - 1
                rev = 1
            if rev:
                start_pos = start_pos + 1

            # cut at end of line
            end_pos = body.find("\n", match.end())

            # extract line
            line = body[start_pos:end_pos].strip()

            # skip this record if it lies on the same line as the previous one
            if start_pos == last_start or end_pos == last_end: keep_line = 0

            # skip this record if it is a comment
            elif line.startswith("##"): keep_line = 0

            # remove possible list item leaders
            if keep_line:
                for heading in ["*", "1.", "a.", "A.", "i.", "I."]:
                    if line.startswith(heading):
                        line = line.replace(heading, "", 1)
                line = line.strip()
                if len(line)==0: keep_line = 0

            # handle this record
            if keep_line:

                # find the sort key
                nbmatches = 0
                keypos = 0
                found = 0
                while 1:
                    if key is None:
                        keyval = ""
                    else:
                        keymatch = key_re.search(line, keypos)
                        if keymatch:
                            keyval = line[keymatch.start():keymatch.end()]
                            keypos = keymatch.end()
                            nbmatches = nbmatches + 1
                            found = 1
                        else:
                            if nbmatches>0: break
                            keyval = opt_unassigned_text

                    # store info
                    item = []
                    item.append(keyval)                           # key text
                    item.append(body[match.start():match.end()])  # search text
                    item.append(line)                             # line text
                    item.append(page_name)                        # page name
                    item.append(heading_text)                     # heading
                    all_matches.append(item)
                    if found == 0: break

            last_start = start_pos
            last_end = end_pos

    # sort and format records
    bullet_list_open = macro.formatter.bullet_list(1)
    bullet_list_close = macro.formatter.bullet_list(0)
    listitem_open = macro.formatter.listitem(1)
    listitem_close = macro.formatter.listitem(0)

    all_matches.sort()
    result = ""
    result = result+"\n" + bullet_list_open
    keyval = ""
    head_count = 0

    # treat records for output
    for item in all_matches:
        text = item[2]
        pagename = item[3]
        heading_text = item[4]

        # parse the text (in wiki source format) and make HTML,
        # after diverting sys.stdout to a string
        str_out = cStringIO.StringIO()        # create str to collect output
        macro.request.redirect(str_out)       # divert output to that string
        # parse this line (this will also execute macros !) :
        Parser(text, macro.request).format(macro.formatter)
        macro.request.redirect()              # restore output
        text_fmtted = str_out.getvalue()      # get what was generated
        text_fmtted = text_fmtted.strip(' ')  # preserve newlines

        # empty text => drop this item
        if len(text_fmtted)==0: continue

        # insert heading (only if not yet done)
        if key is not None and item[0] != keyval:
            # this is a new heading
            keyval = item[0]
            if head_count:
                result = result+"\n " + bullet_list_close
                result = result+"\n " + listitem_close
            head_count = head_count +1
            result = result+"\n " + listitem_open
            result = result+ keyval
            result = result+"\n " + bullet_list_open

        # correct the text format (berk)
        if text_fmtted.startswith("\n<p>"):
            text_fmtted = text_fmtted[4:]
        if text_fmtted.endswith("</p>\n"):
            text_fmtted = text_fmtted[:-5]
            text_trailer = "\n</p>\n"
        else: text_trailer = ""

        # insert text
        result = result+"\n " + listitem_open
        result = result + text_fmtted
        if opt_links:
            result = result + " <font size=-1>"
            try: # try MoinMoin 1.1 API
                link_text = wikiutil.link_tag(pagename)
            except TypeError: # try MoinMoin 1.2 API
                link_text = wikiutil.link_tag(macro.request, pagename)
            result = result + link_text
            result = result + "</font>"
        if opt_heading is not None:
            result = result + " <font size=-1>"
            result = result + heading_text
            result = result + "</font>"
        result = result + text_trailer + "\n " + listitem_close

    # all items done, close (hopefully) gracefully
    if head_count:
        result = result+"\n " + listitem_close
        result = result+"\n " + bullet_list_close
    if key is not None:
        result = result+"\n " + listitem_close
    result = result+"\n" + bullet_list_close

    # done
    return result

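
The core idea of the macro (find the lines matching TextRegex, emit one record per SortKey match on each line, then sort) can be tried outside MoinMoin with the simplified sketch below. It is an illustration only, not the macro's code. The function `search_and_sort` and its `pages` dictionary (page name mapped to raw page text) are hypothetical names. It scans each page line by line instead of searching the whole page body, and it leaves out the `links` and `heading` options and all HTML formatting.

```python
import re


def search_and_sort(pages, text_regex, key_regex=None,
                    unassigned="[unassigned]"):
    """Return sorted (key, line, page_name) records.

    pages is a hypothetical stand-in for the wiki: a dict mapping
    page names to raw page text.
    """
    text_re = re.compile(text_regex, re.IGNORECASE)
    key_re = re.compile(key_regex, re.IGNORECASE) if key_regex else None

    records = []
    for page_name in sorted(pages):
        for raw_line in pages[page_name].splitlines():
            line = raw_line.strip()

            # Skip wiki comments and lines without a hit.
            if line.startswith("##") or not text_re.search(line):
                continue

            # Strip list-item leaders such as "*" or "1.".
            for leader in ("*", "1.", "a.", "A.", "i.", "I."):
                if line.startswith(leader):
                    line = line[len(leader):].strip()
            if not line:
                continue

            # One record per sort-key match; lines without any key
            # are filed under the "unassigned" header.
            if key_re is None:
                keys = [""]
            else:
                keys = [m.group(0) for m in key_re.finditer(line)]
                if not keys:
                    keys = [unassigned]
            for key in keys:
                records.append((key, line, page_name))

    records.sort()
    return records


if __name__ == "__main__":
    sample_pages = {
        "ProjectA": " 1. [Alan] {2} to launch this task\n"
                    " 1. [Ben][Clara] {3} do this as background task\n",
        "ProjectB": " * [Eric] {1} Tests of feature F\n",
    }
    for record in search_and_sort(sample_pages, r"\{[123]\}",
                                  r"\[[A-Za-z_]*\]"):
        print(record)
```

A line that carries several keys, such as `[Ben][Clara]`, appears once under each key, which is why the same task shows up under both names in Sample 1 of the docstring.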
Comments
(write your feedback below, in the form:
@DATE@ / Name and e-mail address
- comments )