Inspired by this terse blog post.
This filter strips all (X)HTML from a given template variable while preserving some meta information from anchor and image tags.
Why is this even useful? If you have pre-assembled portions of templates, or model fields containing HTML, that you want to use to populate a search index such as django-haystack, you can safely discard all the markup while keeping the text that should still be searchable. Alt text and title attributes are worth keeping!
from django import template
from django.template.defaultfilters import stringfilter
from django.utils.safestring import mark_safe
from BeautifulSoup import BeautifulSoup, Tag, NavigableString

register = template.Library()


@register.filter(name='plaintext')
@stringfilter
def plaintext(value):
    """Strip all markup, keeping link and image metadata as plain text."""
    soup = BeautifulSoup(value)

    # Replace each anchor with its text followed by " (title, href)".
    for a in soup.findAll('a'):
        substitute = Tag(soup, 'span')
        # a.string is None when the anchor contains nested tags,
        # so join every text node inside the anchor instead.
        substitute.insert(0, ''.join(a.findAll(text=True)))
        meta = []
        attrs = [k for k, v in a.attrs]
        if 'title' in attrs:
            meta.append(a['title'])
        if 'href' in attrs:
            meta.append(a['href'])
        if meta:
            substitute.insert(1, NavigableString(' (%s)' % ', '.join(meta)))
        a.replaceWith(substitute)

    # Replace each image with " (src, title, alt)".
    for img in soup.findAll('img'):
        substitute = Tag(soup, 'span')
        meta = []
        attrs = [k for k, v in img.attrs]
        if 'src' in attrs:
            meta.append(img['src'])
        if 'title' in attrs:
            meta.append(img['title'])
        if 'alt' in attrs:
            meta.append(img['alt'])
        if meta:
            substitute.insert(0, NavigableString(' (%s)' % ', '.join(meta)))
        img.replaceWith(substitute)

    # Only text nodes remain relevant; concatenate them.
    return mark_safe(''.join(soup.findAll(text=True)))

# Django reads the `is_safe` attribute on a filter; `mark_safe` has no effect here.
plaintext.is_safe = True
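The filter above depends on the BeautifulSoup 3 API. For readers who want to try the idea without it, here is a minimal sketch of the same technique using only the standard library's `html.parser`. The names `PlainTextExtractor` and `plaintext_sketch` are hypothetical and not part of the original snippet, and the Django filter registration is left out so the block runs on its own.

```python
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Collect text nodes, appending selected attributes of <a> and <img>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        # Stack of metadata lists for anchors that are still open.
        self._anchor_meta = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            # Same order as the filter above: title first, then href.
            meta = [attrs[k] for k in ("title", "href") if attrs.get(k)]
            self._anchor_meta.append(meta)
        elif tag == "img":
            # Same order as the filter above: src, title, alt.
            meta = [attrs[k] for k in ("src", "title", "alt") if attrs.get(k)]
            if meta:
                self.parts.append(" (%s)" % ", ".join(meta))

    def handle_endtag(self, tag):
        # Emit the anchor's metadata after its text, when the anchor closes.
        if tag == "a" and self._anchor_meta:
            meta = self._anchor_meta.pop()
            if meta:
                self.parts.append(" (%s)" % ", ".join(meta))

    def handle_data(self, data):
        self.parts.append(data)


def plaintext_sketch(html):
    """Hypothetical helper: return the text of `html` with link/image metadata."""
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts)


print(plaintext_sketch('<p>See <a href="/docs/" title="Docs">the <b>manual</b></a>.</p>'))
```

Unlike the filter, this sketch decodes character references such as `&amp;`, so its output should be treated as plain text and not marked safe for HTML output.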