Soft-wrap long lines

Author: Ubercore
Posted: April 3, 2008
Language: Python
Version: .96
Score: 0 (after 0 ratings)

This filter naively parses HTML content and inserts <wbr/> tags into unbroken strings longer than max_line_length characters. It leaves content inside tags alone, so that things like URLs are unaltered. XHTML entities are treated as atomic, and whitespace is detected with a regex.

It assumes well-formed HTML.

import re

# Matches a single whitespace character.
whitespace_re = re.compile(r'\s')


def softwraphtml(value, max_line_length=20):
    """Insert <wbr/> into long unbroken runs, leaving tags and entities intact."""
    new_value = []
    unbroken_chars = 0
    in_tag = False
    in_xhtml_entity = False
    for char in value:
        if char == '<':
            in_tag = True
        elif char == '>':
            # Leaving a tag starts a fresh run of text.
            in_tag = False
            unbroken_chars = 0
        elif char == '&' and not in_tag:
            in_xhtml_entity = True
        elif char == ';' and in_xhtml_entity:
            in_xhtml_entity = False
        elif whitespace_re.match(char):
            unbroken_chars = 0

        new_value.append(char)
        # Never break (or count characters) inside an entity such as &amp;.
        if not in_xhtml_entity:
            if unbroken_chars >= max_line_length - 1 and not in_tag:
                new_value.append("<wbr/>")
                unbroken_chars = 0
            else:
                unbroken_chars += 1
    return ''.join(new_value)
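
For comparison, here is a minimal sketch of the same idea written as a regex tokenizer instead of a character-by-character state machine. It is not the snippet author's method: the names TOKEN_RE and softwrap_tokens are hypothetical, the entity pattern is a rough approximation, and, unlike the filter above, it places the <wbr/> before the character that would exceed the limit, so nothing is appended at the very end of the string. Like the original, it assumes well-formed HTML.

```python
import re

# One token per match: a whole tag, a whole entity, or any single character.
# The entity pattern is an approximation (an assumption for this sketch).
TOKEN_RE = re.compile(r"<[^>]*>|&#?\w+;|.", re.DOTALL)


def softwrap_tokens(value, max_line_length=20):
    """Insert <wbr/> so no run of text exceeds max_line_length units."""
    out = []
    run = 0  # text units seen since the last tag, whitespace, or <wbr/>
    for token in TOKEN_RE.findall(value):
        is_tag = len(token) > 1 and token.startswith("<") and token.endswith(">")
        if is_tag or token.isspace():
            run = 0  # tags and whitespace both end the current run
        else:
            if run >= max_line_length:
                out.append("<wbr/>")
                run = 0
            run += 1  # an entity such as &amp; counts as a single unit
        out.append(token)
    return "".join(out)


print(softwrap_tokens("aaaaaaaaaa", 5))        # aaaaa<wbr/>aaaaa
print(softwrap_tokens("ab&amp;cd&amp;ef", 3))  # ab&amp;<wbr/>cd&amp;<wbr/>ef
```

Because tags and entities arrive as single tokens, there is no in_tag or in_xhtml_entity flag to keep in sync, at the cost of relying on the regex to recognise them correctly.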

Comments

svetlyak (on April 4, 2008):

What about <wbr/> being inserted inside XHTML entities? This could be a big issue when trying to output valid XHTML.

For example, any user can place a long URL containing & in the comments.

Ubercore (on April 6, 2008):

That's a very good point. As I said, it's still very basic, but you're right -- it should treat XHTML entities atomically. Thanks for pointing that out.
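
To make the concern in this exchange concrete, the following small sketch shows what goes wrong when a wrapper ignores entities. The helper naive_wrap and the example URL are hypothetical and exist only for illustration; they are not part of the snippet.

```python
def naive_wrap(text, width):
    """Insert <wbr/> after every `width` characters, ignoring entities."""
    chunks = [text[i:i + width] for i in range(0, len(text), width)]
    return "<wbr/>".join(chunks)


# An escaped URL as it would appear in XHTML output (hypothetical example).
url = "http://example.com/?a=1&amp;b=2"
broken = naive_wrap(url, 25)
print(broken)  # http://example.com/?a=1&a<wbr/>mp;b=2
```

The break lands in the middle of &amp;, so the output no longer contains a valid entity. Treating each entity as a single unbreakable unit avoids this.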
