This filter naively parses HTML content and inserts <wbr/> tags into unbroken strings of max_line_length characters or more. It leaves content inside tags alone, so that things like URLs are unaltered. XHTML entities are treated as atomic, and whitespace is detected with a regex.
It assumes well-formed HTML.
import re

whitespace_re = re.compile(r'\s')

def softwraphtml(value, max_line_length=20):
    new_value = []
    unbroken_chars = 0       # characters seen since the last break opportunity
    in_tag = False           # True between '<' and '>'
    in_xhtml_entity = False  # True between '&' and ';'
    for char in value:
        if char == '<':
            in_tag = True
        elif char == '>':
            in_tag = False
            unbroken_chars = 0
        elif char == '&' and not in_tag:
            in_xhtml_entity = True
        elif char == ';' and in_xhtml_entity:
            in_xhtml_entity = False
        elif whitespace_re.match(char):
            unbroken_chars = 0
        new_value.append(char)
        # Neither count nor break inside an entity, so it stays atomic.
        if not in_xhtml_entity:
            if unbroken_chars >= max_line_length - 1 and not in_tag:
                new_value.append("<wbr/>")
                unbroken_chars = 0
            else:
                unbroken_chars += 1
    return ''.join(new_value)
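For comparison, here is a rough sketch of the same idea written around a tokenizing regex instead of a character-by-character state machine. It is not the snippet's own code: the names TOKEN_RE and soft_wrap_tokens are made up for illustration, and it deliberately differs in two small ways (an entity counts as one visible character, and whitespace is not counted at all). Like the original, it assumes well-formed HTML.

```python
import re

# Hypothetical tokenizer: a whole tag or a whole entity is one token.
TOKEN_RE = re.compile(r'(<[^>]*>|&#?\w+;)')


def soft_wrap_tokens(value, max_line_length=20):
    """Illustrative variant: add <wbr/> after every max_line_length
    visible characters that are not interrupted by whitespace or a tag."""
    out = []
    run = 0  # visible characters since the last break opportunity
    # With one capture group, re.split alternates text and matched tokens,
    # so odd indices are always tags or entities.
    for i, token in enumerate(TOKEN_RE.split(value)):
        if i % 2 == 1:
            out.append(token)  # tags and entities are copied verbatim
            if token[0] == '<':
                run = 0        # a tag resets the run
                continue
            run += 1           # an entity counts as one visible character
            if run >= max_line_length:
                out.append('<wbr/>')
                run = 0
            continue
        for char in token:
            out.append(char)
            if char.isspace():
                run = 0
                continue
            run += 1
            if run >= max_line_length:
                out.append('<wbr/>')
                run = 0
    return ''.join(out)


print(soft_wrap_tokens('ab&amp;cd', 3))
```

With a limit of 3, the call above breaks after the entity rather than inside it. Tags, including long attribute values, pass through unchanged.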
Comments
What about <wbr/> being inserted inside XHTML entities? That could be a big problem when trying to output valid XHTML.
For example, any user can place a long URL containing & in the comments.
#
That's a very good point. As I said, it's very basic still, but you're right -- it should treat xhtml entities atomically. Thanks for pointing that out.
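One way to guard against the problem raised here is a small check that can be run over a filter's output in a test. This is only a sketch: SPLIT_ENTITY_RE and has_split_entity are hypothetical names, and the pattern only recognises simple named or numeric entities.

```python
import re

# Matches a <wbr/> sitting between the '&' and the ';' of an entity.
SPLIT_ENTITY_RE = re.compile(r'&#?\w*<wbr/>\w*;')


def has_split_entity(html):
    """Return True if a <wbr/> appears to have been inserted inside an entity."""
    return SPLIT_ENTITY_RE.search(html) is not None


print(has_split_entity('&am<wbr/>p;'))
print(has_split_entity('&amp;<wbr/>'))
```

The first call reports a broken entity and the second does not, since there the break comes after the closing semicolon.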