Description
When accessing any page with a spider user agent, you get an HTTP/1.1 403 FORBIDDEN response AND the page content, instead of <h1>Forbidden</h1> or a similar error page.
Example
curl --head --user-agent google http://nirs.dyndns.org/main/FrontPage
HTTP/1.1 403 FORBIDDEN
Date: Thu, 09 Jun 2005 19:05:56 GMT
Server: Apache/2.0.52 (Unix) DAV/2 mod_fastcgi/2.4.2 mod_python/3.1.3 Python/2.4
Content-Type: text/plain; charset=ISO-8859-1

curl --user-agent google http://nirs.dyndns.org/main/FrontPage
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<meta name="robots" content="index,follow">
Why is this forbidden?
curl --head --user-agent google http://nirs.dyndns.org/main/HelpContents
HTTP/1.1 403 FORBIDDEN
Date: Thu, 09 Jun 2005 19:10:00 GMT
Server: Apache/2.0.52 (Unix) DAV/2 mod_fastcgi/2.4.2 mod_python/3.1.3 Python/2.4
Content-Type: text/plain; charset=ISO-8859-1

curl --user-agent google http://nirs.dyndns.org/main/HelpContents
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<meta name="robots" content="index,nofollow">
Why is this forbidden?
curl --head --user-agent google http://nirs.dyndns.org/main/NoSuchPageHere!
HTTP/1.1 403 FORBIDDEN
Date: Thu, 09 Jun 2005 19:19:06 GMT
Server: Apache/2.0.52 (Unix) DAV/2 mod_fastcgi/2.4.2 mod_python/3.1.3 Python/2.4
Content-Type: text/plain; charset=ISO-8859-1

curl --user-agent google http://nirs.dyndns.org/main/NoSuchPageHere!
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<meta name="robots" content="index,nofollow">
It makes sense that this page is forbidden, but then why do we return content?
Details
Release 1.3.4
Workaround
Discussion
This is not a bug in moin; it is caused by curl --head sending a HEAD request, which we do not support, or only partially support. When trying with wget -s, we get correct results:
wget --user-agent=google -s http://nirs.dyndns.org/main/FrontPage
HTTP/1.1 200 OK
Date: Thu, 09 Jun 2005 19:51:51 GMT
Server: Apache/2.0.52 (Unix) DAV/2 mod_fastcgi/2.4.2 mod_python/3.1.3 Python/2.4
Connection: close
Content-Type: text/html;charset=utf-8

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<meta name="robots" content="index,follow">

wget --user-agent=google -s http://nirs.dyndns.org/main/HelpContents
HTTP/1.1 200 OK
Date: Thu, 09 Jun 2005 19:54:40 GMT
Server: Apache/2.0.52 (Unix) DAV/2 mod_fastcgi/2.4.2 mod_python/3.1.3 Python/2.4
Connection: close
Content-Type: text/html;charset=utf-8

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<meta name="robots" content="index,nofollow">

wget --user-agent=google -s http://nirs.dyndns.org/main/NoSuchPageHere
--22:55:52-- http://nirs.dyndns.org/main/NoSuchPageHere
=> `NoSuchPageHere'
Resolving nirs.dyndns.org... 192.115.134.51
Connecting to nirs.dyndns.org[192.115.134.51]:80... connected.
HTTP request sent, awaiting response... 404 NOTFOUND
22:55:53 ERROR 404: NOTFOUND.
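To check whether a 403 comes from the request method rather than from the user agent, it can help to send HEAD and GET to the same URL with the same User-Agent and compare the status lines. The following is only a sketch using Python's standard http.client; the helper names status_for and compare_head_and_get are made up for this example and are not part of moin.

```python
import http.client


def status_for(method, host, path, user_agent="google", port=80, timeout=10):
    """Send one request and return (status, reason); the body is not read."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request(method, path, headers={"User-Agent": user_agent})
        response = conn.getresponse()
        return response.status, response.reason
    finally:
        conn.close()


def compare_head_and_get(host, path, user_agent="google", port=80):
    """Return the status of a HEAD and a GET request for the same URL."""
    return {
        method: status_for(method, host, path, user_agent, port)
        for method in ("HEAD", "GET")
    }
```

If compare_head_and_get("nirs.dyndns.org", "/main/FrontPage") reports different statuses for HEAD and GET, the request method is the likely cause, not the spider user agent.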
Pain
Urf. After a couple of hours of pointless debugging of why one monitoring framework gets 403 from moin, I stumbled upon this one. How about returning 501 Not Implemented for HEAD requests instead of 403 Forbidden? Thanks!
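One way the 501 suggestion could look is sketched below as a small WSGI wrapper. This is an illustration only: moin 1.3.4 does not necessarily run as a WSGI application, and method_guard and SUPPORTED_METHODS are hypothetical names, not moin code.

```python
# Assumed set of methods the wrapped application handles (illustration only).
SUPPORTED_METHODS = ("GET", "POST")


def method_guard(app, supported=SUPPORTED_METHODS):
    """Wrap a WSGI app so unsupported methods get 501 instead of a misleading 403."""
    def guarded(environ, start_response):
        method = environ.get("REQUEST_METHOD", "GET")
        if method not in supported:
            body = b"501 Not Implemented\n"
            start_response("501 Not Implemented", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            # A response to HEAD must not carry a body.
            return [] if method == "HEAD" else [body]
        return app(environ, start_response)
    return guarded
```

With such a wrapper, a monitoring tool that sends HEAD would see 501 and could fall back to GET, instead of concluding that access is forbidden.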
Plan
- Priority:
- Assigned to:
- Status: