Description
The code which builds the trail is not aware of race conditions. See the traceback below.
This is not just a problem in this module. MoinMoin does not check concurrency issues at other places either. We should refactor that.
Steps to reproduce
Just open different pages concurrently while you are logged in. You should use a suitable tool for it, otherwise you might be too slow to see the race condition.
Example
Traceback (most recent call last): File "/home/twaldmann/moincvs/moin--main--1.3/MoinMoin/request.py", line 756, in run handler(page.page_name, self) File "/home/twaldmann/moincvs/moin--main--1.3/MoinMoin/wikiaction.py", line 474, in do_refresh do_show(pagename, request) File "/home/twaldmann/moincvs/moin--main--1.3/MoinMoin/wikiaction.py", line 459, in do_show Page(request, pagename).send_page(request, count_hit=1) File "/home/twaldmann/moincvs/moin--main--1.3/MoinMoin/Page.py", line 873, in send_page request.user.addTrail(self.page_name) File "/home/twaldmann/moincvs/moin--main--1.3/MoinMoin/user.py", line 706, in addTrail self.getTrail() File "/home/twaldmann/moincvs/moin--main--1.3/MoinMoin/user.py", line 746, in getTrail self._trail = codecs.open(self.__filename() + ".trail", 'r', config.charset).readlines() File "/usr/local/lib/python2.4/codecs.py", line 411, in readlines return self.reader.readlines(sizehint) File "/usr/local/lib/python2.4/codecs.py", line 336, in readlines data = self.read() File "/usr/local/lib/python2.4/codecs.py", line 282, in read newdata = self.stream.read() IOError: [Errno 5] Input/output error
Details
This Wiki.
Workaround
Reload the failing page.
Discussion
Can we reproduce this with 2 browsers or in normal usage?
You could write 2 small python programs. One would read from a file, the other add write to it, concurrently. Or you could write a script for a suite which is written for web app stress testing. Sorry, but I just know such programs for Windows. But I am sure that it will not help you much because you can easily see in the code that this problem exists.
It was not my question - does this happen in typical use?
Then this is not a new problem, do we want to fix this in 1.3? or maybe improve (create?) locking on 1.4?
We should improve such file access patterns. I am not sure if we really need FS file locking or lock files. If there is a simple solution, we might want to include it into 1.3.
This needs global refactoring, it is not the only broken pattern.
User files
For user files, we can fix is easily by writing to a temp file, then renaming to the final file name (atomic on posix). I don't know if this will work on window though. Our Windows developers will have to solve it
Here is a first patch, that need reviewing and testing on different platforms. It works fine here on Mac OS X.
- This does not solve the initial bug, at least not on Windows. IMHO we need:
- Small refactoring to do just a single write().
- Write a neat file system locking class.
- Use file system locking.
Fixed partially in patch-698 by saving the file in one write call, instead of multiple writes. This should be safe enough for normal use.
Stuff that should be done:
- Same fix for reading, read all file into memory, not line by line
- Same fix for user data file
- Locking.
This patch fix the handling of the trail. The behavior is this:
- Trail is initialized to None
- When user data is read from the user file, trail is initialized again to None
- getTrail initialize the trail once. For valid user that uses one of show_trail or remember_last_visit, it initialize from the disk. Other initialized to empty list.
- Later calls to getTrail return the cached value, no disk access
- addTrail always get an updated trail from the disk, in case another process changed the trail e.g user with few browsers or browsers windows on same wiki.
- Trail methods refactored to make theme simple and easy to understand and remove duplicate code.
- Don't use string module
I'm not committing this because others happen to work on same code currently.
1 * looking for nirs@freeshell.org--2005/moin--fix--1.3--patch-23 to compare with
2 * comparing to nirs@freeshell.org--2005/moin--fix--1.3--patch-23
3 M MoinMoin/user.py
4
5 * modified files
6
7 --- orig/MoinMoin/user.py
8 +++ mod/MoinMoin/user.py
9 @@ -6,7 +6,7 @@
10 @license: GNU GPL, see COPYING for details.
11 """
12
13 -import os, string, time, Cookie, sha, codecs
14 +import os, time, Cookie, sha, codecs
15
16 try:
17 import cPickle as pickle
18 @@ -259,7 +259,8 @@
19
20 # attrs not saved to profile
21 self._request = request
22 - self._trail = []
23 + # This means trail is not known yet, must read from disk
24 + self._trail = None
25
26 # create checkbox fields (with default 0)
27 for key, label in self._checkbox_fields:
28 @@ -410,8 +411,8 @@
29 if -24 <= self.tz_offset and self.tz_offset <= 24:
30 self.tz_offset = self.tz_offset * 3600
31
32 - # clear trail
33 - self._trail = []
34 + # clear trail, must read from disk
35 + self._trail = None
36
37 if not self.disabled:
38 self.valid = 1
39 @@ -686,47 +687,63 @@
40 return 1
41 return 0
42
43 + def shouldAddTrail(self, pageName):
44 + """ Return true if name should be added to trail """
45 + if not self.valid:
46 + return False
47 +
48 + # Don't update if trail is not used currently.
49 + # TODO: This cause a little problem when activating 'show page
50 + # trail' since its starts with empty trail. If we remove this we
51 + # will have to pay one disk access per request for valid users.
52 + if not (self.show_page_trail or self.remember_last_visit):
53 + return False
54 +
55 + # Don't add missing pages or those the user may not read
56 + if self._request:
57 + from MoinMoin.Page import Page
58 + page = Page(self._request, pageName)
59 + if not (page.exists() and
60 + self._request.user.may.read(page.page_name)):
61 + return False
62 + return True
63
64 def addTrail(self, pagename):
65 - """
66 - Add page to trail.
67 + """ Add page to trail.
68
69 @param pagename: the page name to add to the trail
70 """
71 - # TODO: aquire lock here, so multiple processes don't clober
72 - # each one trail.
73 -
74 - if self.valid and (self.show_page_trail or self.remember_last_visit):
75 - # load trail if not known
76 - self.getTrail()
77 -
78 - # don't append tail to trail ;)
79 - if self._trail and self._trail[-1] == pagename: return
80 + if self.shouldAddTrail(pagename):
81 + # TODO: aquire lock so another process may not read or write
82 + # trail until we release the lock.
83 +
84 + # Becuase one user can use multiple browsers, and each
85 + # session might add pages to the trail, we must read trail
86 + # from disk before adding items to the trail.
87 + self._loadTrail()
88 +
89 + # Don't append tail to trail
90 + if not pagename in self._trail[-1:]:
91 + # Remove pagename duplicates (from multiple sessions?)
92 + self._trail = [name for name in self._trail
93 + if name != pagename]
94 + self._trail.append(pagename)
95 + self._trail = self._trail[-self._cfg.trail_size:]
96 + self._saveTrail()
97
98 - # Add only existing pages that the user may read
99 - if self._request:
100 - from MoinMoin.Page import Page
101 - page = Page(self._request, pagename)
102 - if not (page.exists() and
103 - self._request.user.may.read(page.page_name)):
104 - return
105 -
106 - # append new page, limiting the length
107 - self._trail = filter(lambda p, pn=pagename: p != pn, self._trail)
108 - self._trail = self._trail[-(self._cfg.trail_size-1):]
109 - self._trail.append(pagename)
110 - self.saveTrail()
111 + # TODO: release lock
112
113 - ## TODO: release lock here
114 -
115 - def saveTrail(self):
116 + def _trailPath(self):
117 + return self.__filename() + ".trail"
118 +
119 + def _saveTrail(self):
120 """ Save trail file
121
122 Save using one write call, which should be fine in most cases,
123 but will fail in rare cases without real file locking.
124 """
125 data = '\n'.join(self._trail) + '\n'
126 - path = self.__filename() + ".trail"
127 + path = self._trailPath()
128 try:
129 file = codecs.open(path, "w", config.charset)
130 try:
131 @@ -745,22 +762,45 @@
132 self._request.log("Can't save trail file: %s" % str(err))
133
134 def getTrail(self):
135 - """
136 - Return list of recently visited pages.
137 + """ Return list of recently visited pages.
138 +
139 + Get the trail from disk once per user life span, which is one
140 + request. Later return cached trail.
141
142 @rtype: list
143 @return: pages in trail
144 """
145 - if self.valid and (self.show_page_trail or self.remember_last_visit) \
146 - and not self._trail \
147 - and os.path.exists(self.__filename() + ".trail"):
148 - try:
149 - self._trail = codecs.open(self.__filename() + ".trail", 'r', config.charset).readlines()
150 - except (OSError, ValueError):
151 - self._trail = []
152 + if self._trail is None:
153 + # Initialize trail from disk if used, or to empty list
154 + if (self.valid and
155 + (self.show_page_trail or self.remember_last_visit)):
156 + self._loadTrail()
157 else:
158 - self._trail = filter(None, map(string.strip, self._trail))
159 - self._trail = self._trail[-self._cfg.trail_size:]
160 -
161 + self._trail = []
162 return self._trail
163
164 + def _loadTrail(self):
165 + """ Load trail from disk or set to empty list
166 +
167 + Should be safe for normal use, even safer if we add file
168 + locking.
169 + """
170 + path = self._trailPath()
171 + # TODO: aquire lock
172 + try:
173 + file = codecs.open(path, 'r', config.charset)
174 + try:
175 + trail = file.readlines()
176 + finally:
177 + file.close()
178 + trail = [name.strip() for name in trail
179 + if name != '' and not name.isspace()]
180 + self._trail = trail[-self._cfg.trail_size:]
181 + except (IOError, OSError, ValueError), err:
182 + # Don't log this because its ok if the trail was deleted,
183 + # its a problem only if we can't create it later.
184 + self._trail = []
185 + # TODO: release lock
186 +
187 +
188 +
189
Maybe its practically fixed by the partial fix in 698?
Plan
- Priority:
- Assigned to:
- Status:
- fixed partially in patch-698 (should be safe enough for regular use)
- trail handling changed completely in 1.6 (trail storage was moved to session data), so this bug relating to old trail code does not apply any more. Closing.