The XhtmlDegraderMiddleware class is designed to make it easier to deploy XHTML content onto the World Wide Web. The correct MIME media type for sending XHTML is application/xhtml+xml; the text/html media type is also acceptable provided certain guidelines are followed.
The vast majority of web browsers released from 2002 onwards have good XHTML support; this includes Mozilla Firefox, Opera, and Apple Safari. Two notable exceptions are Internet Explorer 7.0 and Netscape Navigator 4.8; instead of displaying XHTML, they present the user with a download dialog.
The role of the XHTML Degrader, then, is to automatically detect when browsers do not support XHTML, and to degrade the contents into something they're capable of rendering.
How it works
XhtmlDegraderMiddleware checks the content type of all HTTP responses. If the XHTML media type is set, the Accept request header is examined to determine whether the user agent actually supports XHTML. If so, the content is sent unaltered. If not, a number of silent changes are made to make the response more friendly to XHTML-challenged browsers and web crawlers.
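As a minimal sketch of the Accept check described above, the test amounts to a substring match on the lowercased header value. The function name accepts_xhtml and the sample header strings are illustrative assumptions, not part of the middleware:

```python
def accepts_xhtml(accept_header):
    """Return True if the Accept header value mentions an XHTML media type.

    Hypothetical helper for illustration only. It mirrors the simple
    substring test described above and ignores q-values entirely.
    """
    return '/xhtml+xml' in accept_header.lower()


# Sample header values are illustrative, not captured from real browsers.
xhtml_aware = 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
wildcard_only = 'image/gif, image/jpeg, */*'

print(accepts_xhtml(xhtml_aware))    # True
print(accepts_xhtml(wildcard_only))  # False
print(accepts_xhtml(''))             # False
```

Under this kind of test a bare */* wildcard does not count as XHTML support, whereas a header such as application/xhtml+xml;q=0 would still count, because q-values are not parsed.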
Firstly, the Content-Type header is set to the HTML media type (text/html). Any XML processing instructions are removed, and a DOCTYPE is added if none is present. In the case of Internet Explorer, this should ensure the content is rendered in "standards compliance" mode. Empty elements are given a space before their trailing slash.
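The content changes listed above can be tried outside Django on a plain string. This is a sketch under stated assumptions: the degrade function and the sample markup are made up for illustration, and the patterns are written to be equivalent to the ones in the listing, with the redundant backslashes dropped.

```python
import re

# Matches an XML processing instruction such as <?xml version="1.0"?>.
PROCESSING_INSTRUCTION_RE = re.compile(r'<\?.*\?>')

# (?<=\S) is a lookbehind: '/>' matches only when the character before it
# is not whitespace, so tags that already have the space are left alone.
EMPTY_TAG_END_RE = re.compile(r'(?<=\S)/>')


def degrade(content):
    """Apply the described content changes to an XHTML string (illustrative)."""
    content = PROCESSING_INSTRUCTION_RE.sub('', content)
    content = EMPTY_TAG_END_RE.sub(' />', content)
    content = content.strip()
    if not content.startswith('<!DOCTYPE'):
        # Add a DOCTYPE so the user agent does not fall back to "quirks" mode.
        content = '<!DOCTYPE html>\n' + content
    return content


sample = ('<?xml version="1.0" encoding="utf-8"?>\n'
          '<html><body><br/><hr /></body></html>')
print(degrade(sample))
# <!DOCTYPE html>
# <html><body><br /><hr /></body></html>
```

Here `<br/>` gains a space while `<hr />` is untouched, which is the effect of the lookbehind.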
N.B. If an HTTP response is already set to text/html, or set to any media type other than application/xhtml+xml, the middleware will have no effect. Note also that if you use GZipMiddleware, you should ensure that it appears in your MIDDLEWARE_CLASSES setting before XhtmlDegraderMiddleware, to allow the XHTML Degrader to act first.
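The ordering matters because Django runs the response phase of middleware in reverse order of the setting, so the class listed last sees the response first. A sketch of a settings.py fragment, assuming the old-style MIDDLEWARE_CLASSES tuple used on this page and with YOURPATH left as a placeholder:

```python
# settings.py fragment (illustrative).
MIDDLEWARE_CLASSES = (
    # Listed first, so its response handling runs last: it compresses the
    # response only after the XHTML Degrader has had a chance to modify it.
    'django.middleware.gzip.GZipMiddleware',
    # 'YOURPATH' is a placeholder for wherever you keep the snippet.
    'YOURPATH.xhtmldegrader.middleware.XhtmlDegraderMiddleware',
)
```

If the order were reversed, the degrader would receive an already compressed response and, as the listing shows, return it untouched.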
"""
XHTML Degrader Middleware.

When sending contents with the XHTML media type, application/xhtml+xml, this
module checks to ensure that the user agent (web browser) is capable of
rendering it. If not, it changes the media type to text/html and makes the
contents more "HTML-friendly" (as per the XHTML 1.0 HTML Compatibility
Guidelines).

To use this middleware you need to add a reference to it in your settings.py
file, e.g.:

    MIDDLEWARE_CLASSES = (
        ...
        'YOURPATH.xhtmldegrader.middleware.XhtmlDegraderMiddleware',
    )

(If you use GZipMiddleware, you should ensure that it appears in the list
before XhtmlDegraderMiddleware, to allow the XHTML Degrader to act first.)
"""

import re

_MEDIA_TYPE_RE = re.compile(r'application\/xhtml\+xml')
_EMPTY_TAG_END_RE = re.compile(r'(?<=\S)\/\>')
_PROCESSING_INSTRUCTION_RE = re.compile(r'\<\?.*\?\>')


def _supports_xhtml(request):
    """Examines an HTTP request header to determine whether the user agent
    supports the XHTML media type (application/xhtml+xml). Returns True or
    False."""
    if '/xhtml+xml' in request.META.get('HTTP_ACCEPT', '').lower():
        # User agent claims to support the XHTML media type.
        return True
    else:
        # No reference to XHTML support.
        return False


class XhtmlDegraderMiddleware(object):
    """Django middleware that "degrades" any contents sent as XHTML if the
    requesting browser doesn't support the XHTML media type. Degrading involves
    switching the content type to text/html, removing XML processing
    instructions, etc.

    If the HTTP response is already set to text/html, or set to any media type
    other than application/xhtml+xml, this middleware will have no effect.
    """

    def process_response(self, request, response):
        # Check if this is XHTML, and check if the user agent supports XHTML.
        if response['Content-Type'].split(';')[0] != 'application/xhtml+xml' \
                or _supports_xhtml(request):
            # The content is fine, simply return it.
            return response

        # If the response has already been compressed we can't modify it
        # further, so just return it. (N.B. if you use GZipMiddleware, you
        # should ensure that it's listed in MIDDLEWARE_CLASSES before
        # XhtmlDegraderMiddleware, to allow the XHTML Degrader to act first.)
        if response.has_header('Content-Encoding'):
            # Already compressed, so we can't do anything useful with it.
            return response

        # The content is XHTML, and the user agent doesn't support it.
        # Fix the media type:
        response['Content-Type'] = _MEDIA_TYPE_RE.sub('text/html',
                                                      response['Content-Type'], 1)
        if 'charset' not in response['Content-Type']:
            response['Content-Type'] += '; charset=utf-8'

        # Modify the response contents as required:
        # Remove any XML processing instructions:
        response.content = _PROCESSING_INSTRUCTION_RE.sub('',
                                                          response.content)
        # Ensure there's a space before the trailing '/>' of empty elements:
        response.content = _EMPTY_TAG_END_RE.sub(' />',
                                                 response.content)
        # Lose any excess whitespace:
        response.content = response.content.strip()
        if not response.content.startswith('<!DOCTYPE'):
            # Add a DOCTYPE, so that the user agent isn't in "quirks" mode.
            response.content = '<!DOCTYPE html>\n' + response.content

        return response
Comments
instead of
response.content[0:9] != '<!DOCTYPE':
I would suggest
not response.content.startswith('<!DOCTYPE')
This is more speed & memory efficient.
#
Great snippet, thanks.
One small suggestion, maybe you could also replace:
meta http-equiv="Content-type" content="application/xhtml+xml
... etc., in the head section. To do so, I added at the beginning:
_MEDIA_TYPE_HEAD_RE = re.compile(r'="application\/xhtml\+xml')
and around line 70 add:
response.content = _MEDIA_TYPE_HEAD_RE.sub('="text/html', response.content, 1)
#
I've made another little adjustment. I noticed that django-admin has several problems when served as application/xhtml+xml, so I adjusted this middleware so that anything served from the /admin URL is automatically degraded too:
def process_response(self, request, response):
    if not request.META['PATH_INFO'].startswith('/admin/'):
#
Looks like there is a bit too much escaping going on in the regular expressions:
_MEDIA_TYPE_RE = re.compile(r'application\/xhtml\+xml')
Can simply be:
_MEDIA_TYPE_RE = re.compile(r'application/xhtml\+xml')
and
_PROCESSING_INSTRUCTION_RE = re.compile(r'\<\?.*\?\>')
Can simply be:
_PROCESSING_INSTRUCTION_RE = re.compile(r'<\?.+\?>')
The following regex didn't make much sense to me at all; I don't know how the = fits in there, or some of the other parts. Here is the original:
_EMPTY_TAG_END_RE = re.compile(r'(?<=\S)\/\>')
Anyway, I rewrote it and this one works:
_EMPTY_TAG_END_RE = re.compile(r'(<.+)/>')
Also replace line 75 with this, or it won't work:
response.content = _EMPTY_TAG_END_RE.sub(r'\1>', response.content)
What this does is replace:
<input type="button" /> with <input type="button" > and <input type="button"/> with <input type="button">
I cannot get rid of the space at the end in the first example, not without adding extra Python code anyway, and since the extra space does nothing bad, I just left it at that.
#
Sorry, I meant a * not a +:
_PROCESSING_INSTRUCTION_RE = re.compile(r'<\?.*\?>')
#