Description
By default, MoinMoin blocks wget from downloading attachments (and doing anything else other than viewing pages), because it considers wget to be a web spider. Although wget can act as a spider, it's not generally used as one. I think wget should be allowed to at least download attachments. (And perhaps other spiders as well?)
Also, the error message "You are not allowed to access this!" could use some improvement.
Steps to reproduce
wget "http://moinmoin.wikiwikiweb.de/OliverGraf/VimColor?action=AttachFile&do=get&target=VimColor.py"
Details
$ telnet moinmoin.wikiwikiweb.de 80
Trying 83.137.100.43...
Connected to host43.thinkmo.de.
Escape character is '^]'.
GET /OliverGraf/VimColor?action=AttachFile&do=get&target=VimColor.py HTTP/1.0
User-Agent: Wget/1.9
Host: moinmoin.wikiwikiweb.de
Accept: */*
Connection: Keep-Alive

HTTP/1.0 403 Forbidden
Date: Tue, 28 Dec 2004 15:48:41 GMT
Status: 403 FORBIDDEN
Content-type: text/plain
Server: TwistedWeb/1.3.0rc1

You are not allowed to access this!
Connection closed by foreign host.
$ telnet moinmoin.wikiwikiweb.de 80
Trying 83.137.100.43...
Connected to host43.thinkmo.de.
Escape character is '^]'.
GET /OliverGraf/VimColor?action=AttachFile&do=get&target=VimColor.py HTTP/1.0
User-Agent: SomethingElse
Host: moinmoin.wikiwikiweb.de
Accept: */*
Connection: Keep-Alive

HTTP/1.0 200 OK
Date: Tue, 28 Dec 2004 15:55:29 GMT
Content-length: 9111
Content-type: text/plain
Content-disposition: inline; filename="VimColor.py"
Server: TwistedWeb/1.3.0rc1

...
Workaround
You can use the --user-agent="..." option to override the User-Agent header for wget.
Alternatively, you can configure your wiki not to treat wget as a spider by overriding the ua_spiders configuration variable (see below).
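A minimal sketch of such an override, assuming the usual wikiconfig.py layout in which a Config class subclasses DefaultConfig. The DefaultConfig class below is only a stand-in so the snippet runs on its own (in a real wiki it would presumably be imported from MoinMoin.multiconfig), and without_agents is a hypothetical helper, not part of MoinMoin:

```python
# Stand-in for MoinMoin's DefaultConfig so this sketch is self-contained.
# In a real wikiconfig.py this class would be imported, not redefined.
class DefaultConfig(object):
    ua_spiders = ('archiver|crawler|curl|google|htdig|httrack|jeeves|larbin|leech|'
                  'linkbot|linkmap|linkwalk|mercator|mirror|nutbot|robot|scooter|'
                  'search|sitecheck|spider|wget')


def without_agents(pattern, *names):
    """Hypothetical helper: drop the given alternatives from an 'a|b|c' pattern."""
    return '|'.join(part for part in pattern.split('|') if part not in names)


class Config(DefaultConfig):
    # Same defaults as above, minus wget, so wget is no longer treated as a spider.
    ua_spiders = without_agents(DefaultConfig.ua_spiders, 'wget')
```

Deriving the value from the default, rather than retyping the whole pattern, keeps the override limited to the one entry being removed.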
Discussion
Is this bug caused by Twisted or MoinMoin?
MoinMoin. It has some routines which filter certain UAs.
Is including wget among these defaults justified? I can understand blocking spiders from editing pages, but shouldn't spiders be allowed to download/view attachments?
/usr/lib/python2.3/site-packages/MoinMoin/multiconfig.py line 212:
# a regex of HTTP_USER_AGENTS that should be excluded from logging
# and receive a FORBIDDEN for anything except viewing a page
ua_spiders = ('archiver|crawler|curl|google|htdig|httrack|jeeves|larbin|leech|'
              'linkbot|linkmap|linkwalk|mercator|mirror|nutbot|robot|scooter|'
              'search|sitecheck|spider|wget')
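The effect of this pattern on the two requests shown under Details can be sketched as follows. This is an illustration, not MoinMoin's actual code: is_spider is a hypothetical name, and the case-insensitive substring search is an assumption, chosen because the lowercase "wget" in the pattern would otherwise not match the "Wget/1.9" header that received the 403.

```python
import re

ua_spiders = ('archiver|crawler|curl|google|htdig|httrack|jeeves|larbin|leech|'
              'linkbot|linkmap|linkwalk|mercator|mirror|nutbot|robot|scooter|'
              'search|sitecheck|spider|wget')

# Assumption: the pattern is searched case-insensitively anywhere in the header.
spider_re = re.compile(ua_spiders, re.I)


def is_spider(user_agent):
    """Return True if the User-Agent string matches any spider keyword."""
    return spider_re.search(user_agent) is not None


for ua in ('Wget/1.9', 'SomethingElse'):
    print(ua, '->', 'spider' if is_spider(ua) else 'not a spider')
```

Under these assumptions, any User-Agent that merely contains one of the keywords (for example "search" or "mirror") would be classified as a spider, which is why changing the header with --user-agent is enough to get through.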
This is not a bug but a feature, and the wiki admin is free to configure this.
We can consider also allowing spiders to view and download attachments.
-- NirSoffer 2004-12-29 04:48:28
Plan
- Priority:
- Assigned to:
Status: bug (1) confirmed and fixed. Additionally, 1.3.2 will allow spiders to download attachments.
standalone always answers with status code 200 (1)