Moin should return Last-Modified headers so that HTTP proxies and caches can cache content, and so that advanced setups can improve performance using, for example, mod_cache or squid. From #moin:
The discussion below is wrong! In the presence of Last-Modified, most browsers use a heuristic described at the end of RFC 2616 section 13.2.4: http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.2.4. Mozilla uses this heuristic according to http://www.mozilla.org/projects/netlib/http/http-caching-faq.html
- What exactly in that discussion is wrong? Since when is a caching proxy called IE or Mozilla? Reading something completely is helpful sometimes.
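For reference, the heuristic in RFC 2616 section 13.2.4 applies when a response carries Last-Modified but no explicit expiry: the cache may treat it as fresh for some fraction of the time since it was last modified, and the RFC names 10% as a typical setting. A rough sketch of that calculation follows. The function name and the default fraction are illustrative assumptions, and real browsers may use other fractions or upper bounds.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def heuristic_freshness(date_header, last_modified_header, fraction=0.1):
    """Return a heuristic freshness lifetime as a timedelta.

    Both arguments are HTTP date strings (Date and Last-Modified).
    The 10% default is the "typical setting" mentioned in RFC 2616 13.2.4,
    not a value any particular browser is known to use.
    """
    date = parsedate_to_datetime(date_header)
    last_modified = parsedate_to_datetime(last_modified_header)
    unchanged_for = date - last_modified
    if unchanged_for < timedelta(0):
        # Last-Modified lies in the future: do not treat the response as fresh.
        return timedelta(0)
    return unchanged_for * fraction
```

With this fraction, a page that has not changed for ten days would be reused for one day without revalidation, which is why a browser can benefit from Last-Modified even without a proxy in between.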
[11:44] LeoSimons: Hi gang! over at wiki.apache.org, we've been having some performance issues ever since upgrading to moin 1.3 and at the same time migrating from a FreeBSD machine to RHEL (so we're not sure which of the two made the difference). Regardless, we've been looking into using mod_cache to reduce the load. Some details are at http://issues.apache.org/jira/browse/INFRA-360, including a patch for more conservative Cache-Control headers (that's probably a little redundant) that also sets Last-Modified headers.
[11:44] LeoSimons: I would suggest someone takes a look at our patch, fixes it (its no doubt not the best way to do things) and integrates this into moin proper. It should help a lot with load reduction, esp. for people using proxies.
[11:45] LeoSimons: (setting Last-Modified allows content negotiation via If-Modified-Since)
[11:46] xorAxAx: LeoSimons: on which moinmoin version are you currently?
[11:46] xorAxAx: LeoSimons: am i right at this point - the patch will just help if you use a reverse caching proxy
[11:46] LeoSimons: 1.3.4 (or 1.3.3, eh...)
[11:48] LeoSimons: xorAxAx: setting a Last-Modified allows all kinds of caching strategies to happen between your server and user X. In our case the caching is always going to happen within apache itself (using mod_cache), which is a kind-of reverse caching proxy, but other setups are not unthinkable.
[11:49] xorAxAx: LeoSimons: but the straight use of a browser wont be sped up by this patch
[11:49] xorAxAx: LeoSimons: would it?
[11:49] LeoSimons: xorAxAx: correct.
[11:49] xorAxAx: we would need to send not-modified
[11:49] xorAxAx: but its worth to consider merging, thanks
[11:50] xorAxAx: i dont like the opt-out thing, though
[11:50] LeoSimons: xorAxAx: actually, what would need to be done would be a shortcut through the moin code that supports HEAD requests
[11:50] xorAxAx: yeah
[11:50] LeoSimons: content-negotiation sometimes fetches headers to determine whether it should fetch a body. Rendering the body is expensive for wikis.
[11:51] xorAxAx: maybe make an official feature request on the wiki
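The exchange above comes down to two steps: send Last-Modified with every page, and answer a matching If-Modified-Since with 304 Not Modified before any rendering happens. A minimal sketch of that check, independent of moin's real request classes, might look like this. The function name, the dict-based headers and the return shape are assumptions made for illustration.

```python
from email.utils import formatdate, parsedate_to_datetime


def conditional_response(request_headers, page_mtime):
    """Decide between 200 and 304 for a page last changed at page_mtime.

    page_mtime is a Unix timestamp in seconds. Returns a tuple
    (status, headers, needs_body); needs_body is False when the expensive
    rendering step can be skipped.
    """
    headers = {"Last-Modified": formatdate(page_mtime, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since:
        try:
            since_ts = parsedate_to_datetime(since).timestamp()
        except (TypeError, ValueError):
            since_ts = None  # unparsable date: behave as if the header was absent
        # HTTP dates have one-second resolution, so compare whole seconds.
        if since_ts is not None and int(page_mtime) <= since_ts:
            return 304, headers, False
    return 200, headers, True
```

This sketch ignores the per-user content issue discussed further down, so on its own it would only be safe for responses that do not vary by user.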
The user problem
Each moin response contains user-dependent items:
- All users/visitors:
  - Browser language - the same page will be rendered differently for two anonymous users if they have different preferred languages in their browsers.
- Registered users get:
  - link to home page
  - link to user preferences
  - list of page links (tabs in modern)
  - list of page trail
  - custom theme - each user may have a different theme
  - custom rendering of translated user interface
  - custom rendering of page content
When different users request the same unchanged page, moin still has to handle every request. Moin itself caches page content; rendering the page content takes about 1/10 of the time of a typical request.
- Right, that is his point - there are many things that are not cached yet.
For improving things at the HTTP level, a Last-Modified header would be far too simple (see above). Maybe we could use an ETag header made out of the key values influencing the content. -- ThomasWaldmann 2005-06-05 15:28:02
Some of the interesting stuff shown there can't be used for moin as it is now, because moin emits page content in a streaming way (many small request.write() calls) and does not collect the whole HTTP response before sending it out. I guess we could do that with no problem for small/simple pages, but I have doubts whether it would be OK for stuff like TitleIndex or WordIndex or long/complex pages. If someone has time to do an experimental implementation of this with benchmarks, I would be very curious about the results. -- ThomasWaldmann 2006-10-25 10:28:31
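An experiment along these lines could start from something like the sketch below: the many small write() calls go into a buffer, and a validator is computed once the page is complete. The class is hypothetical and not part of moin. Using a SHA-1 of the body as the ETag is an assumption made only to have something to compare.

```python
import hashlib


class BufferedResponse:
    """Hypothetical collector for streamed write() calls (not a moin class)."""

    def __init__(self):
        self._chunks = []

    def write(self, data):
        # Same call shape as a streaming write, but nothing is sent yet.
        if isinstance(data, str):
            data = data.encode("utf-8")
        self._chunks.append(data)

    def finish(self, if_none_match=None):
        """Return (status, headers, body) once the whole page is rendered."""
        body = b"".join(self._chunks)
        etag = '"%s"' % hashlib.sha1(body).hexdigest()
        if if_none_match == etag:
            # The client already has this exact body: send headers only.
            return 304, {"ETag": etag}, b""
        headers = {"ETag": etag, "Content-Length": str(len(body))}
        return 200, headers, body
```

Note the trade-off: the first byte is delayed until rendering finishes, memory use grows with page size, and a 304 produced this way saves bandwidth but not rendering time. That is why benchmarks on pages like TitleIndex would matter.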
Well, for me, an ETag would be a very elegant solution to all of this. There could be some kind of hook in the macros that would allow them to add stuff to the tag. The basic core tag would be something like W/"user/1116859300", where user is the username and the number is the last modification date. It is a weak ETag because the content of the page itself *might* have changed and we don't guarantee that the macros embedded in the page have not changed content... -- TheAnarcat 2006-12-11 23:36:27
- How does such an ETag help, if it doesn't guarantee that the macros' output did not change?
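The core tag plus macro hook proposed above could look roughly like the following sketch. Both function names are hypothetical. The extra_parts argument stands in for whatever a macro hook would contribute, and percent-encoding the parts is an added assumption so that usernames cannot break the tag syntax.

```python
from urllib.parse import quote


def make_weak_etag(username, mtime, extra_parts=()):
    """Build a weak ETag such as W/"user/1116859300" from the key values.

    extra_parts is where language, theme or macro-supplied values could be
    appended (an assumption; no such hook exists in moin as discussed here).
    """
    parts = [quote(username or "anonymous", safe=""), str(int(mtime))]
    parts.extend(quote(str(part), safe="") for part in extra_parts)
    return 'W/"%s"' % "/".join(parts)


def etag_matches(if_none_match, etag):
    """Weak comparison of an If-None-Match header value against etag."""
    if not if_none_match:
        return False

    def opaque(tag):
        # Weak comparison ignores the W/ prefix on either side.
        tag = tag.strip()
        return tag[2:] if tag.startswith("W/") else tag

    # Splitting on commas is safe here because the parts are percent-encoded.
    return any(
        candidate.strip() == "*" or opaque(candidate) == opaque(etag)
        for candidate in if_none_match.split(",")
    )
```

A request whose If-None-Match matches could then be answered with 304 before rendering. Whether that is acceptable depends on how much macro output is allowed to change without the tag changing, which is the open question in this thread.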
ETag is the best long-term solution, but Last-Modified would already be a very good help. Another nice trick would be to have a hook to trigger a custom action or even an HTTP PURGE request to registered reverse-proxies upon edits. -- FrancescoChemolli 2009-09-30 15:24:19
- Maybe you could help us do that within moin/2.0? It'll work very differently, so stuff like that should be easier than in 1.x. Currently 2.0 uses the revision's content hash for the ETag (if the request is for the raw content, like for images or files).
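The purge hook suggested above could be sketched as follows. PURGE is a non-standard HTTP method that some reverse proxies accept when configured to do so. The function name, the list of (host, port) proxies and the injectable connect parameter are assumptions for illustration, not an existing moin API.

```python
import http.client


def purge_page(proxies, path, connect=http.client.HTTPConnection, timeout=5):
    """Send a PURGE request for path to each (host, port) in proxies.

    Returns a dict mapping (host, port) to the HTTP status, or to None when
    the proxy could not be reached. connect is injectable so the hook can
    be exercised without a network.
    """
    results = {}
    for host, port in proxies:
        try:
            conn = connect(host, port, timeout=timeout)
            try:
                conn.request("PURGE", path)
                results[(host, port)] = conn.getresponse().status
            finally:
                conn.close()
        except (OSError, http.client.HTTPException):
            # A dead proxy should never make saving a page fail.
            results[(host, port)] = None
    return results
```

Such a function would be called from whatever code path saves a page revision, with the path of the edited page.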
Compare patch to support mod_proxy.
- How does this patch work if people use different languages?
- It will not work for people with different browser language.
The patch looks correct, but creating the Last-Modified header should not be done in Page.py. It should probably be in request. Also, maybe all the logic can be in one function:
def canUseHTTPCaching(self, action):
    return (not self.user.valid and
            action in <list of cacheable actions> and
            <other logic>)
- Yeah, much better.
Then another function to set the last modified header.
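Put together, the two functions could look roughly like this on a request object. The Request class below is a stand-in, not moin's real class. CACHEABLE_ACTIONS, the user_valid attribute (standing in for self.user.valid) and setLastModified are names assumed for the sketch, and which actions are actually safe to cache is left open here.

```python
from email.utils import formatdate

# Assumption: "show" is used only as a placeholder for the real list of
# cacheable actions, which this page does not specify.
CACHEABLE_ACTIONS = frozenset(["show"])


class Request:
    """Stand-in for moin's request object; only what the sketch needs."""

    def __init__(self, user_valid):
        self.user_valid = user_valid  # stands in for self.user.valid
        self.headers = {}

    def canUseHTTPCaching(self, action):
        # Only anonymous requests for cacheable actions qualify; any
        # further conditions ("other logic") would be added here.
        return not self.user_valid and action in CACHEABLE_ACTIONS

    def setLastModified(self, mtime):
        # mtime is the page's modification time as a Unix timestamp.
        self.headers["Last-Modified"] = formatdate(mtime, usegmt=True)
```

A caller would check canUseHTTPCaching(action) first and call setLastModified(mtime) only when it returns True. The browser-language problem raised above is not addressed by this sketch.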