1 2011-07-24T00:18:35 *** sinha
2 2011-07-24T00:52:19 *** raignarok
3 2011-07-24T01:17:20 <dreimark> Marchael: you should have unittests for this behaviour
4 2011-07-24T01:17:44 <dreimark> so next time you can first run them on a temporary new version of whoosh
5 2011-07-24T01:18:02 <dreimark> and if they don't fail you know that everything you need works
6 2011-07-24T01:18:19 <dreimark> and if some fail you have to verify the reason
7 2011-07-24T01:47:05 <ronny> sup
8 2011-07-24T02:49:33 *** not-xjjk
9 2011-07-24T02:53:45 *** xjjk
10 2011-07-24T03:08:08 *** not-xjjk
11 2011-07-24T03:58:09 <Marchael> moin
12 2011-07-24T04:00:31 <Marchael> dreimark: I don't quite understand your point. Whoosh returns html correctly, but I don't know why flatland shows this markup as text
13 2011-07-24T04:00:55 <Marchael> btw, I have some test for analyzers
14 2011-07-24T04:01:02 <Marchael> *tests
15 2011-07-24T04:21:55 <Marchael> Oh finally, I go markuping
16 2011-07-24T04:22:10 <Marchael> *done highlighting
17 2011-07-24T04:27:40 <Marchael> oh, seems like another whoosh bug
18 2011-07-24T04:28:22 <Marchael> result.docnum gives wrong number
19 2011-07-24T04:53:24 <Marchael> ThomasWaldmann: there is http://i.imgur.com/z17ao.png
20 2011-07-24T04:57:37 <Marchael> ThomasWaldmann: http://codereview.appspot.com/4802048/ Patchset 7,
21 2011-07-24T04:57:45 <ThomasWaldmann> moin :)
22 2011-07-24T04:58:01 <Marchael> hi ThomasWaldmann
23 2011-07-24T04:58:29 <ThomasWaldmann> if you want to enumerate search results, please start counting from 1. this is for normal users, not programmers.
24 2011-07-24T04:58:43 <Marchael> ok
25 2011-07-24T04:59:06 <ThomasWaldmann> and if you find regressions in whoosh, please file bugs for them
26 2011-07-24T05:00:18 <ThomasWaldmann> and it maybe should be
27 2011-07-24T05:00:28 <ThomasWaldmann> Results for your search '...'
28 2011-07-24T05:01:18 <Marchael> ok
29 2011-07-24T05:04:13 * ThomasWaldmann looks at CR
30 2011-07-24T05:04:53 <Marchael> ThomasWaldmann: how can I change "handlers need to give it" to something meaningful?
31 2011-07-24T05:20:47 <ThomasWaldmann> usually there is the itemname
32 2011-07-24T05:21:47 <ThomasWaldmann> review done
33 2011-07-24T05:44:45 <Marchael> ThomasWaldmann: patchset 8 Fixes from previous review
34 2011-07-24T05:46:37 <Marchael> http://i.imgur.com/lOjTk.png
35 2011-07-24T05:48:50 <Marchael> ThomasWaldmann: I merged with your branch, https://bitbucket.org/marchael/moin-2.0/changesets
36 2011-07-24T06:17:01 <Marchael> ThomasWaldmann: ?
37 2011-07-24T08:13:25 *** pkumar
38 2011-07-24T09:10:11 *** raignarok
39 2011-07-24T09:35:12 *** pkumar
40 2011-07-24T10:04:32 *** raignarok
41 2011-07-24T10:21:20 <dreimark> Marchael: I assume none of your tests fails after the update of whoosh
42 2011-07-24T10:51:04 <Marchael> dreimark: yes
43 2011-07-24T10:51:12 <Marchael> tests really don't fail
44 2011-07-24T10:51:41 <Marchael> I just forgot that to find "Homeland" I need to use "Home*"
45 2011-07-24T10:52:01 <Marchael> but Home* fails :)
46 2011-07-24T10:52:27 <Marchael> I see traceback each time when I use it
47 2011-07-24T11:03:25 <Marchael> ThomasWaldmann: when you get some free time pleae, review my code, I want to commit this.
48 2011-07-24T11:03:38 <Marchael> *please
49 2011-07-24T11:08:11 *** raignarok
50 2011-07-24T11:43:18 <ThomasWaldmann> Marchael: review done
51 2011-07-24T11:47:12 * Marchael looks
52 2011-07-24T11:50:24 <Marchael> ThomasWaldmann: http://codereview.appspot.com/4802048/ patchset 9
53 2011-07-24T12:03:33 <ThomasWaldmann> reviewed
54 2011-07-24T12:04:09 <ThomasWaldmann> btw, i also added information that was missing to the bug report you submitted to whoosh
55 2011-07-24T12:04:16 <Marchael> may be use .5f?
56 2011-07-24T12:04:20 <Marchael> ok, thx
57 2011-07-24T12:04:59 <ThomasWaldmann> 5 is an unusual amount
58 2011-07-24T12:05:20 <Marchael> hm, ok
59 2011-07-24T12:05:54 <ThomasWaldmann> one usually has 1000x steps in units
60 2011-07-24T12:06:55 <ThomasWaldmann> (== steps by 3 in the exponent)
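A minimal sketch of the precision point above, assuming the ".5f" refers to how the search time is formatted (the log does not say so explicitly). The value of `elapsed` is a made-up measurement. Three decimals give millisecond resolution and six give microsecond resolution, which matches the usual 1000x unit steps; five decimals fall between the two.

```python
elapsed = 0.0021237  # seconds; hypothetical measurement, not from the log

# 3 decimals == millisecond resolution (one 1000x step below seconds)
print("%.3f secs" % elapsed)

# 6 decimals == microsecond resolution (two 1000x steps below seconds)
print("%.6f secs" % elapsed)
```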
61 2011-07-24T12:07:38 <Marchael> ok, done. may i commit?
62 2011-07-24T12:08:05 <ThomasWaldmann> if you did deal with all i noted in my reviews...
63 2011-07-24T12:09:52 <Marchael> you told me that I'm not merging, but I am
64 2011-07-24T12:10:15 <Marchael> you can see the link to my changesets above
65 2011-07-24T12:11:30 <ThomasWaldmann> well, i was mostly wondering why this stuff gets into the codereview diffs
66 2011-07-24T12:12:52 *** raignarok
67 2011-07-24T12:23:26 <CIA-114> Michael Mayorov <marchael@kb.csu.ru> * f5dd44e2f1fe r353 moin-2.0/ (6 files in 4 dirs):
68 2011-07-24T12:23:26 <CIA-114> - Update whoosh to 2.0
69 2011-07-24T12:23:26 <CIA-114> - Search result template
70 2011-07-24T12:23:26 <CIA-114> - Search in headings
71 2011-07-24T12:23:26 <CIA-114> - Show search statistics
72 2011-07-24T12:23:26 <CIA-114> - Highlighting headings of found documents
73 2011-07-24T12:24:38 <Marchael> ok, now i'll try to use the converter
74 2011-07-24T12:24:39 *** raignarok
75 2011-07-24T12:25:36 <ThomasWaldmann> summary: - Update whoosh to 2.0
76 2011-07-24T12:26:38 <ThomasWaldmann> you always should have the first line of your commit comment as some simple text (then follow it by an empty line, then add more text - if needed)
77 2011-07-24T12:27:15 <ThomasWaldmann> so, don't start a list of stuff in the first line.
78 2011-07-24T12:27:20 <ThomasWaldmann> Marchael:
79 2011-07-24T12:27:23 <Marchael> ok
80 2011-07-24T12:31:00 *** raignarok
81 2011-07-24T12:31:49 <ThomasWaldmann> Marchael: Results for search 'home'
82 2011-07-24T12:31:49 <ThomasWaldmann> 1. Home
83 2011-07-24T12:31:49 <ThomasWaldmann> Revision: 1 Last Change: 23.07.2011 18:16:05
84 2011-07-24T12:31:49 <ThomasWaldmann> 2. Home
85 2011-07-24T12:31:50 <ThomasWaldmann> Revision: 0 Last Change: 12.03.2011 23:19:50
86 2011-07-24T12:32:02 <Marchael> yes
87 2011-07-24T12:32:14 <ThomasWaldmann> you search in all revs?
88 2011-07-24T12:32:24 <Marchael> yes
89 2011-07-24T12:32:28 <ThomasWaldmann> why?
90 2011-07-24T12:32:48 <Marchael> oops, no. I search in the latest
91 2011-07-24T12:33:42 <Marchael> iirc that is happening due to the uuid problem
92 2011-07-24T12:34:41 <ThomasWaldmann> ah
93 2011-07-24T12:35:41 <ThomasWaldmann> yes, correct. it just doesn't kill rev 0 due to uuid.
94 2011-07-24T12:36:00 <ThomasWaldmann> after modifying Home, i get rev 0 and rev 2.
95 2011-07-24T12:37:37 <Marchael> you also could try Home* and see traceback
96 2011-07-24T12:37:45 <ThomasWaldmann> yes, i see it
97 2011-07-24T12:37:53 <Marchael> I'll fix that after finish with content indexing
98 2011-07-24T12:37:54 <ThomasWaldmann> i think it is due to py 2.6.0
99 2011-07-24T12:37:59 <Marchael> hm
100 2011-07-24T12:38:17 <ThomasWaldmann> we had that at another place too
101 2011-07-24T12:38:34 <ThomasWaldmann> unicode keywords require some later py version than 2.6.0
102 2011-07-24T12:39:43 * ThomasWaldmann files a whoosh bug about it
103 2011-07-24T12:46:14 <ThomasWaldmann> https://bitbucket.org/mchaput/whoosh/issue/157/unicode-keywords-are-not-supported-on
104 2011-07-24T12:46:56 <Marchael> ok, thx
105 2011-07-24T12:55:17 <Marchael> ThomasWaldmann: to access revision content in the backend, should I use the Item class?
106 2011-07-24T12:56:28 <Marchael> I get RuntimeError when trying to do that
107 2011-07-24T12:57:33 <ThomasWaldmann> see .../+indexable/Home
108 2011-07-24T12:57:57 <ThomasWaldmann> and its implementation in apps.frontend.views
109 2011-07-24T12:58:13 <Marchael> hm, then I should use test_request_context()?
110 2011-07-24T12:58:19 <ThomasWaldmann> btw, wildcard search works with my fix
111 2011-07-24T12:58:42 <ThomasWaldmann> but, highlighting doesn't. it shows nothing for the item name.
112 2011-07-24T13:00:33 <ThomasWaldmann> also it doesn't seem to highlight when not using wildcards
113 2011-07-24T13:00:51 <ThomasWaldmann> if i search for home and it finds home/foo, home is not highlighted
114 2011-07-24T13:03:59 <Marchael> hm
115 2011-07-24T13:04:30 <Marchael> you mean that highlighting doesn't work on fresh repo clone/
116 2011-07-24T13:04:31 <Marchael> ?
117 2011-07-24T13:05:16 <ThomasWaldmann> i killed the index yesterday and rebuilt it
118 2011-07-24T13:05:45 <ThomasWaldmann> and there are no changes in the code except that str keyword change to avoid the crash
119 2011-07-24T13:06:19 <ThomasWaldmann> you can add another patch for that btw
120 2011-07-24T13:08:39 <Marchael> http://i.imgur.com/XRA9t.png
121 2011-07-24T13:08:50 <Marchael> here result is highlighted
122 2011-07-24T13:09:36 <ThomasWaldmann> i don't see highlighting of Home
123 2011-07-24T13:10:09 <Marchael> em, whoosh is highlighting the whole sentence
124 2011-07-24T13:11:37 *** Marchael
125 2011-07-24T13:18:20 *** Marchael
126 2011-07-24T13:19:23 <Marchael> ThomasWaldmann: all found documents are highlighted, but whoosh highlights the whole sentence
127 2011-07-24T13:20:04 <Marchael> i tried using different fragmenters, but with no results
128 2011-07-24T13:21:46 <ThomasWaldmann> it's because it considers it as one word
129 2011-07-24T13:22:03 <ThomasWaldmann> if you have an item "Home Foo" and search for "foo", you see the difference
130 2011-07-24T13:25:03 <Marchael> hm, but whoosh doesn't have any other fragmenters, should I write a new one/
131 2011-07-24T13:25:07 <Marchael> ?
132 2011-07-24T13:27:24 <ThomasWaldmann> first you please clean up the output of search results page
133 2011-07-24T13:27:50 <ThomasWaldmann> remove that TODO there and replace it by a form that has the current query filled in
134 2011-07-24T13:28:52 <ThomasWaldmann> then, you switch off (comment) highlighting of item name (item link) until it is fixed for wildcard search
135 2011-07-24T13:30:14 <Marchael> ok
136 2011-07-24T13:36:03 <ThomasWaldmann> (and make that query input field rather wide, so that users can easily expand their query)
137 2011-07-24T13:57:31 <Marchael> ThomasWaldmann: how can I get access to search_form?
138 2011-07-24T13:58:22 <Marchael> I get UndefinedError: 'search_form' is undefined when trying to access to it
139 2011-07-24T13:58:48 <Marchael> *get access
140 2011-07-24T14:04:12 *** raignarok
141 2011-07-24T14:08:20 <ThomasWaldmann> see show_item
142 2011-07-24T14:09:00 <Marchael> I see only show.html
143 2011-07-24T14:09:57 <ThomasWaldmann> i mean the function in views.py
144 2011-07-24T14:12:16 *** greg_f
145 2011-07-24T14:15:34 <ThomasWaldmann> btw, as search results is not dealing with some specific item, you don't want to extend show.html, but rather layout.html template
146 2011-07-24T14:28:08 <Marchael> ThomasWaldmann: but changes to the layout.html searchform will affect all pages
147 2011-07-24T14:28:38 <Marchael> but i want only one
148 2011-07-24T14:28:59 <ThomasWaldmann> i didn't say that you shall edit layout.html
149 2011-07-24T14:29:31 <ThomasWaldmann> just not use "extends ... show.html", but "extends ... layout.html"
150 2011-07-24T14:29:57 <ThomasWaldmann> brb
151 2011-07-24T14:31:24 <Marchael> ah
152 2011-07-24T14:46:02 *** raignarok
153 2011-07-24T15:01:00 <ThomasWaldmann> Marchael: why don't I see name_exact when doing moin index show?
154 2011-07-24T15:01:37 <Marchael> because name_exact isn't stored
155 2011-07-24T15:02:28 <ThomasWaldmann> and why do you have 2 backend_to_index functions doing basically the same thing?
156 2011-07-24T15:03:47 <Marchael> in the first variant they did the same thing, but accepted different sets of parameters
157 2011-07-24T15:05:12 <Marchael> but the Item class is unavailable now, so I think it's better to put that function in some place and import it
158 2011-07-24T15:05:33 <Marchael> may be MoinMoin.util
159 2011-07-24T15:07:07 <ThomasWaldmann> as it is directly related to your schema, maybe have it near there
160 2011-07-24T15:08:20 <Marchael> so i should move backend_to_index() to util?
161 2011-07-24T15:11:26 <ThomasWaldmann> where's your schema?
162 2011-07-24T15:11:34 <Marchael> ok
163 2011-07-24T15:11:38 <ThomasWaldmann> Result: Page 1 of 1. Showing results 1 - 9 of 8 (0.002 secs)
164 2011-07-24T15:11:56 <ThomasWaldmann> this "of N" is obviously wrong
165 2011-07-24T15:12:23 <ThomasWaldmann> ehrm, no. i got 8. so the "9" is wrong.
166 2011-07-24T15:12:56 <Marchael> hm
167 2011-07-24T15:13:02 <Marchael> 9 - 1 = 8
168 2011-07-24T15:13:08 <Marchael> may be it's correct?
169 2011-07-24T15:15:02 <ThomasWaldmann> use your fingers to count, if it helps :P
170 2011-07-24T15:15:32 *** raignarok
171 2011-07-24T15:17:28 <Marchael> 9 - 1 = 8, my fingers approve that :D
172 2011-07-24T15:18:10 <ThomasWaldmann> if you have items that count from 1 to 9 (including both 1 and 9), i can assure you that these are 9 items, not 8.
173 2011-07-24T15:18:26 <Marchael> ok
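The off-by-one in "Showing results 1 - 9 of 8" can be checked with a short sketch. This is not the code from the patch; `page_bounds` and its parameters are hypothetical names used only for illustration. The point is that a 1-based, inclusive range first..last contains last - first + 1 items, so the last number shown must never exceed the total.

```python
def page_bounds(page, per_page, total):
    """Return the 1-based, inclusive (first, last) result numbers for a page.

    Hypothetical helper for illustration; not taken from the moin code.
    """
    if total == 0:
        return 0, 0
    first = (page - 1) * per_page + 1
    # Clamp to the total so the last page does not overshoot.
    last = min(page * per_page, total)
    return first, last


first, last = page_bounds(page=1, per_page=10, total=8)
shown = last - first + 1  # inclusive range: 1..8 is 8 items, 1..9 would be 9
print("Showing results %d - %d of %d" % (first, last, 8))
```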
174 2011-07-24T15:23:29 *** raignarok
175 2011-07-24T15:31:44 <ThomasWaldmann> brb
176 2011-07-24T16:40:13 <Marchael> bbl
177 2011-07-24T16:40:16 *** Marchael
178 2011-07-24T16:43:35 <xorAxAx> ThomasWaldmann, waldi: i wont be able to attend the next meeting
179 2011-07-24T16:43:40 <xorAxAx> waldi: can you cover for me?
180 2011-07-24T17:13:59 *** pkumar
181 2011-07-24T17:58:02 * ThomasWaldmann changed the (small) battery in a prius
182 2011-07-24T17:58:43 * ThomasWaldmann now hates some engineer who designed that corner of the car
183 2011-07-24T18:03:51 *** pkumar
184 2011-07-24T18:08:01 *** pkumar
185 2011-07-24T18:55:56 <ThomasWaldmann> Results for search 'mtime:[201107241200 to 201107241500]' < nice, such stuff works :)
186 2011-07-24T19:02:50 *** greg_f
187 2011-07-24T20:01:16 *** sinha
188 2011-07-24T20:05:30 *** pkumar
189 2011-07-24T20:24:34 <CIA-114> Akash Sinha <akash2607@gmail.com> * 17132086b9d6 r333 default/MoinMoin/ (8 files in 5 dirs): File upload functionality moved from index2 to index, index2 has been removed, as well as its link from itemviews bar also.
190 2011-07-24T20:24:34 <CIA-114> Thomas Waldmann <tw AT waldmann-edv DOT de> * 5158f027bb3f r334 default/MoinMoin/ (converter/__init__.py apps/frontend/views.py): (log message trimmed)
191 2011-07-24T20:24:34 <CIA-114> use converters to convert a rev to indexable content
192 2011-07-24T20:24:34 <CIA-114> revision data may contain all sorts of content-types, e.g.:
193 2011-07-24T20:24:34 <CIA-114> - markup text (moin, creole, docbook, rst, html, ...)
194 2011-07-24T20:24:34 <CIA-114> - plain text (code, docs, ...)
195 2011-07-24T20:24:35 <CIA-114> - images, audio
196 2011-07-24T20:24:35 <CIA-114> - archives and other binary stuff
197 2011-07-24T20:24:36 <CIA-114> Thomas Waldmann <tw AT waldmann-edv DOT de> * 72009982fc8d r335 default/MoinMoin/converter/ (__init__.py text_out.py): add a dom -> text/plain output converter, use it to create indexable content
198 2011-07-24T20:24:36 <CIA-114> Akash Sinha <akash2607@gmail.com> * 56e5fe3c1966 r336 default/MoinMoin/ (3 files in 2 dirs): branch merged with main repo
199 2011-07-24T21:23:49 <ThomasWaldmann> docs = all_revs_searcher.documents(wikiname=self.wikiname)
200 2011-07-24T21:23:49 <ThomasWaldmann> for doc in sorted(docs, reverse=reverse)[start:end]:
201 2011-07-24T21:24:14 <ThomasWaldmann> hmm, sorting dicts?
202 2011-07-24T21:38:03 * ThomasWaldmann filed a bug in marchael's tracker
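On the "sorting dicts?" remark: calling sorted() on a list of dicts without a key relies on dict comparison, which has no meaningful order in Python 2 and raises TypeError in Python 3. A minimal sketch of sorting by an explicit field instead, assuming each document dict carries an "mtime" entry (the field names and values here are made up for illustration, and this is not the fix that was actually applied):

```python
from operator import itemgetter

# Hypothetical search documents; real ones would come from the searcher.
docs = [
    {"name": "Home", "mtime": 1311531365},
    {"name": "Home/foo", "mtime": 1311540000},
    {"name": "Sandbox", "mtime": 1299971990},
]

# Sort on one well-defined field rather than on the dicts themselves.
newest_first = sorted(docs, key=itemgetter("mtime"), reverse=True)

for doc in newest_first[0:2]:  # slicing for paging works as before
    print(doc["name"])
```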
203 2011-07-24T22:45:14 <dreimark> re
204 2011-07-24T22:46:15 <sinha> dreimark: http://www.moinmo.in/AkashSinha/Gsoc2011Diary/2011-07-24
205 2011-07-24T22:46:40 <dreimark> sinha: just seen
206 2011-07-24T23:02:01 <dreimark> sinha: reviewed, util is a good place
207 2011-07-24T23:07:48 <dreimark> sinha: index did not show all items i have in History
208 2011-07-24T23:08:16 <dreimark> you should also show those whose mimetype is not in our list
209 2011-07-24T23:08:34 <sinha> dreimark: i was not getting any appropriate name, so just made it `result` for the mean time, i will rename it correctly
210 2011-07-24T23:09:53 <sinha> dreimark: in item index ? yes i have also seen something like that lately
211 2011-07-24T23:10:31 <sinha> i have observed that some item names with extensions (like xyz.gif) are not appearing there
212 2011-07-24T23:10:32 <dreimark> http://test.moinmo.in/issue1
213 2011-07-24T23:10:36 <dreimark> sinha: ^
214 2011-07-24T23:10:58 <dreimark> I don't see odp, com, txt files
215 2011-07-24T23:11:07 <dreimark> only the ogg is in the index
216 2011-07-24T23:13:04 <dreimark> if you can map it to a type moin has defined you also should show it
217 2011-07-24T23:13:14 <sinha> dreimark: there might be some issue in contenttype selection in flat_index function
218 2011-07-24T23:13:22 <dreimark> s/can/can not/
219 2011-07-24T23:14:13 <dreimark> there are more problems, "contenttype": "text/plain",
220 2011-07-24T23:14:20 <dreimark> is also not showing up
221 2011-07-24T23:14:46 <dreimark> we have plain text also defined
222 2011-07-24T23:15:16 <dreimark> at the upload time the new items are shown
223 2011-07-24T23:15:21 <dreimark> but not on reload
224 2011-07-24T23:17:25 <ThomasWaldmann> hmm, i guess i have a solution for flexible metadata indexing
225 2011-07-24T23:19:27 <ThomasWaldmann> http://packages.python.org/Whoosh/schema.html#dynamic-fields we just add one glob per datatype we want, done :)
226 2011-07-24T23:24:06 <ThomasWaldmann> e.g. *_usertext -> every metadata key that matches will be indexed as TEXT
227 2011-07-24T23:25:12 <sinha> dreimark: file with .txt formats get this text/plain mime type but at our contenttype_groups we have defined it as 'text/plain;charset=utf-8' so thats why it is not getting selected and also image/gif is missing from the contenttype_group.
228 2011-07-24T23:25:46 <sinha> dreimark: how do you wnat me to handle this ? do i removed this charset part and then apply this contnettype_check
229 2011-07-24T23:25:58 <sinha> s/wnat/want
230 2011-07-24T23:26:15 <sinha> s/removed/remove
231 2011-07-24T23:27:49 <dreimark> two problems, one: the meta data for contenttype is not completely given on upload
232 2011-07-24T23:29:10 <dreimark> for this functionality we don't have to look at the encoding
233 2011-07-24T23:29:34 <dreimark> so create the index without looking at the encoding
234 2011-07-24T23:30:32 <dreimark> there should be also a ToDo added for adding encoding on upload
235 2011-07-24T23:30:37 <dreimark> sinha: ^^
236 2011-07-24T23:33:23 <sinha> dreimark: so by ignoring encoding you mean, just see the major type of item ?
237 2011-07-24T23:33:46 <dreimark> encoding == ;charset=utf-8
238 2011-07-24T23:34:14 <dreimark> major/minor == text/plain
239 2011-07-24T23:34:24 <sinha> but for the files .odp extension i am getting the mime as "application/vnd.oasis.opendocument.presentation" so how will i handle this one
240 2011-07-24T23:34:31 <ThomasWaldmann> the Type class has an issupertype method
241 2011-07-24T23:34:50 <ThomasWaldmann> including wildcard support
242 2011-07-24T23:35:09 <dreimark> you need a mapping to unknown so nothing gets lost
243 2011-07-24T23:37:06 <sinha> okay, so i) do the check without the encoding ii) include those which are unknown ? OR i) skip the encoding thing, include which are unknown
244 2011-07-24T23:40:42 <dreimark> skip the encoding. later, if we have the encoding on each item, it may be interesting to filter for those too
245 2011-07-24T23:41:00 <dreimark> but we can't currently
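A rough sketch of the "mapping to unknown" idea discussed above, so that items such as the .odp upload still show up in the index. The group names and patterns below are assumptions for illustration and do not reproduce moin's contenttype groups; the wildcard matching uses the standard library's fnmatch rather than the Type.issupertype method mentioned in the log.

```python
from fnmatch import fnmatch

# Hypothetical groups; moin's real contenttype groups are defined elsewhere.
GROUPS = [
    ("text items", ["text/*"]),
    ("image items", ["image/*"]),
    ("audio items", ["audio/*"]),
]


def group_for(contenttype):
    """Return the first group whose pattern matches, else "unknown"."""
    # Drop parameters such as ";charset=utf-8" before matching.
    base = contenttype.split(";", 1)[0].strip().lower()
    for name, patterns in GROUPS:
        if any(fnmatch(base, pattern) for pattern in patterns):
            return name
    # Fallback so that no item gets lost from the index.
    return "unknown"


print(group_for("application/vnd.oasis.opendocument.presentation"))
```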
246 2011-07-24T23:46:46 <sinha> by skipping encoding i meant in flat_index when i filter by contenttype then i will consider text/plain as unknown type ?
247 2011-07-24T23:48:20 <dreimark> text/plain is not unknown, why do you think?
248 2011-07-24T23:48:49 <dreimark> http://test.moinmo.in/+modify/text
249 2011-07-24T23:49:07 <dreimark> see first of other "text items"
250 2011-07-24T23:49:10 <dreimark> sinha:
251 2011-07-24T23:49:43 <dreimark> unknown should be those not listed there
252 2011-07-24T23:49:55 <sinha> dreimark: yes i know it is there, but in that contenttype group it is saved as text/plain;charset=utf8
253 2011-07-24T23:50:16 <sinha> so during filter, text/plain is not same as text/plain;charset=utf8
254 2011-07-24T23:50:27 <sinha> and that's why it was skipped in the first place
255 2011-07-24T23:50:32 <dreimark> it is the same if you compare it without encoding
256 2011-07-24T23:51:37 <sinha> okay just compare it before ";charset=.."
257 2011-07-24T23:51:50 <dreimark> yes
258 2011-07-24T23:52:21 <dreimark> and add a todo to the uploader, that we find that missing encoding later
259 2011-07-24T23:53:24 <sinha> okay
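The conclusion above (compare only the part before ";charset=..") can be sketched as follows. The function names are hypothetical and this is not the flat_index code; it only shows that "text/plain" and "text/plain;charset=utf-8" compare equal once parameters are dropped.

```python
def base_type(contenttype):
    """Return the "major/minor" part of a content type, without parameters."""
    return contenttype.split(";", 1)[0].strip().lower()


def same_type(a, b):
    """Compare two content types while ignoring encoding and other parameters."""
    return base_type(a) == base_type(b)


print(same_type("text/plain", "text/plain;charset=utf-8"))
```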