Login

keeptags: strip all HTML tags from output except a specified list of elements

Author:
chrominance
Posted:
June 24, 2007
Language:
Python
Version:
.96
Score:
5 (after 5 ratings)

Django has several filters designed to sanitize HTML output, but they're either too broad (striptags, escape) or too narrow (removetags) to use when you want to allow a specified set of HTML tags in your output. Thus keeptags was born. Some of the code is essentially ripped from the Django removetags function. It's not perfect--for example, it doesn't touch attributes inside elements at all--but otherwise it works well.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
def keeptags(value, tags):
    """
    Strips all [X]HTML tags except the space seperated list of tags 
    from the output.
    
    Usage: keeptags:"strong em ul li"
    """
    import re
    from django.utils.html import strip_tags, escape
    tags = [re.escape(tag) for tag in tags.split()]
    tags_re = '(%s)' % '|'.join(tags)
    singletag_re = re.compile(r'<(%s\s*/?)>' % tags_re)
    starttag_re = re.compile(r'<(%s)(\s+[^>]+)>' % tags_re)
    endtag_re = re.compile(r'<(/%s)>' % tags_re)
    value = singletag_re.sub('##~~~\g<1>~~~##', value)
    value = starttag_re.sub('##~~~\g<1>\g<3>~~~##', value)
    value = endtag_re.sub('##~~~\g<1>~~~##', value)
    value = strip_tags(value)
    value = escape(value)
    recreate_re = re.compile('##~~~([^~]+)~~~##')
    value = recreate_re.sub('<\g<1>>', value)
    return value

More like this

  1. Template tag - list punctuation for a list of items by shapiromatron 2 months, 2 weeks ago
  2. JSONRequestMiddleware adds a .json() method to your HttpRequests by cdcarter 2 months, 3 weeks ago
  3. Serializer factory with Django Rest Framework by julio 9 months, 2 weeks ago
  4. Image compression before saving the new model / work with JPG, PNG by Schleidens 10 months, 1 week ago
  5. Help text hyperlinks by sa2812 11 months ago

Comments

Please login first before commenting.