Accessing pages with lots of attachments is too expensive

Our wiki (wiki.ubuntuusers.de) has a page with lots of file attachments (5000+ files).

Apparently a crawler (or something similar) found that page, and the resulting requests brought the wiki down.

Steps to reproduce

  1. Create a wiki with a page that has a very large number of attachments (e.g. 5000+).
  2. Access /Wiki/Cache?action=AttachFile&do=view&target=_some_file_in_the_cache.png
  3. Watch the wiki twiddle its thumbs for a *long* time.

Details

Each access to a page with attachments emits a <link rel=...> tag for every attachment on that page. With 5000+ attachments, this makes every single request prohibitively expensive.
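A rough sketch of why this scales badly, assuming one tag is written per attachment on every request. The helper name `build_link_rel_tags`, the `rel` value "example" and the generated file names are hypothetical placeholders for illustration; this is not MoinMoin's actual code.

```python
from html import escape
from urllib.parse import quote_plus


def build_link_rel_tags(script_name, pagename, filenames, rel):
    """Return one <link> tag per attachment (hypothetical helper)."""
    tags = []
    for name in filenames:
        # Same URL shape as in the steps to reproduce.
        url = "%s/%s?action=AttachFile&do=view&target=%s" % (
            script_name, quote_plus(pagename, safe="/"), quote_plus(name))
        tags.append('<link rel="%s" title="%s" href="%s">' % (
            escape(rel), escape(name), escape(url)))
    return tags


# Made-up attachment names, roughly matching the size of the affected page.
filenames = ["file%04d.png" % i for i in range(5000)]
tags = build_link_rel_tags("", "Wiki/Cache", filenames, rel="example")

# Every request for the page pays for all of these tags again.
extra_bytes = sum(len(tag) + 1 for tag in tags)
print(len(tags), "tags,", extra_bytes, "bytes of extra header per request")
```

The work and the output size grow linearly with the number of attachments, and the cost is paid on each request, including requests for a single attachment.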

(Version: 1.5.1.)

The memory problem still exists in 1.5.5a: it occurs when files are uploaded, and it prevents uploading large files on servers with little memory. (!) This should be moved to a separate bug report, with more details.

(!) The parts of this bug report about "slow download speeds" and "huge memory consumption on download" issues, including steps to reproduce it and all platform details, were moved to ../AttachmentDownloadsSlowAndConsumeTooMuchMemory.

Workaround

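For now, I have patched moin as follows, so that pages with more than 100 attachments get neither the attachment list nor the <link> tags: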

diff -rub MoinMoin/action/AttachFile.py /tmp/MoinMoin/action/AttachFile.py
--- MoinMoin/action/AttachFile.py       2006-01-21 17:48:04.000000000 +0100
+++ /tmp/MoinMoin/action/AttachFile.py  2006-02-09 13:31:23.000000000 +0100
@@ -211,6 +211,10 @@
     str = ""
     if files:
+        if len(files) > 100:
+            if showheader:
+                return '%s<p>%s</p>' % (str, _("Too many attachments stored for %(pagename)s") % {'pagename': pagename})
+
         if showheader:
             str = str + _(
                 "To refer to attachments on a page, use '''{{{attachment:filename}}}''', \n"
@@ -322,6 +326,7 @@
 #############################################################################
 def send_link_rel(request, pagename):
     files = _get_files(request, pagename)
+    if len(files) > 100: return
     if len(files) > 0 and not htdocs_access(request):
         scriptName = request.getScriptname()
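
The guard added by the patch above can be read as a simple threshold check. The following is a minimal standalone sketch of that idea only; the names `MAX_LINKED_ATTACHMENTS` and `should_emit_links` are hypothetical and do not exist in MoinMoin, and the `htdocs_access` condition from the original function is left out.

```python
# Threshold taken from the patch above; the constant name is an assumption.
MAX_LINKED_ATTACHMENTS = 100


def should_emit_links(filenames, limit=MAX_LINKED_ATTACHMENTS):
    """Return True only if there are attachments and not more than `limit`."""
    return 0 < len(filenames) <= limit


print(should_emit_links(["a.png"] * 100))  # at the limit: links are emitted
print(should_emit_links(["a.png"] * 101))  # over the limit: skipped
```

Making the limit a parameter rather than a hard-coded literal would let a wiki admin tune it, but whether that is worth a config option is open for discussion.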

Discussion

This is mainly a DoS problem. I admit that the <link> tags are not really sensible. Does anybody object to removing them altogether?

Plan


CategoryMoinMoinBug

MoinMoin: MoinMoinBugs/AttachmentsScalabilityProblem (last edited 2008-05-04 23:05:28 by ThomasWaldmann)