HTML5 filter for XSS

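import html5lib
from html5lib import sanitizer

from django import template
from django.template.defaultfilters import stringfilter

register = template.Library()


@register.filter
@stringfilter
def sanitize(value):
    """Parse value as an HTML fragment and return the sanitized markup."""
    # HTMLSanitizer replaces the default tokenizer, so the input is
    # filtered against html5lib's whitelist while it is being parsed.
    p = html5lib.HTMLParser(tokenizer=sanitizer.HTMLSanitizer)
    return p.parseFragment(value).toxml()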


Comments

st0w (on May 30, 2011):

One change I had to make - despite trying to add

sanitize.is_safe = True

after the last line you had, Django still treated the filter's output as unsafe and escaped all the special characters, which somewhat defeats the purpose of having a smart library like html5lib handle it. I had to add one import

from django.utils.safestring import mark_safe

and change your return statement to this:

return mark_safe(p.parseFragment(value).toxml())

Thanks for the snippet!
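
A likely explanation for the behavior st0w describes: in Django, is_safe = True only keeps a filter's output marked safe when the input was already marked safe, whereas mark_safe flags the result unconditionally. The sketch below is a toy, stdlib-only model of that rule and not Django's own code. SafeText, toy_mark_safe, apply_filter, render and both filter functions are hypothetical stand-ins, and the regex is only a placeholder for html5lib's sanitizer.

```python
import re
from html import escape


class SafeText(str):
    """Toy stand-in for a string the template engine trusts as-is."""


def toy_mark_safe(text):
    """Toy stand-in for mark_safe: flag text as safe unconditionally."""
    return SafeText(text)


def render(value):
    """Toy autoescaping: escape anything not flagged as SafeText."""
    return str(value) if isinstance(value, SafeText) else escape(value)


def apply_filter(func, value):
    """Toy filter call: is_safe only preserves safety the input already had."""
    result = func(value)
    if getattr(func, "is_safe", False) and isinstance(value, SafeText):
        result = SafeText(result)
    return result


def _placeholder_clean(value):
    # NOT a real sanitizer: a regex placeholder standing in for html5lib.
    return re.sub(r"<script.*?</script>", "", value, flags=re.S | re.I)


def sanitize_with_flag(value):
    """Filter that relies on the is_safe flag alone."""
    return _placeholder_clean(value)


sanitize_with_flag.is_safe = True


def sanitize_marked(value):
    """Filter that marks its own result as safe, as in st0w's change."""
    return toy_mark_safe(_placeholder_clean(value))


# User-submitted text arrives as a plain (untrusted) string.
user_input = "<b>hi</b><script>alert(1)</script>"
flagged = render(apply_filter(sanitize_with_flag, user_input))
marked = render(apply_filter(sanitize_marked, user_input))
print(flagged)  # &lt;b&gt;hi&lt;/b&gt;
print(marked)   # <b>hi</b>
```

Under this model the flag alone never helps here, because user-submitted text reaches the filter as an untrusted string. Marking the result safe is only reasonable when the sanitizer itself can be trusted to neutralize everything dangerous.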
