djangosnippets: Soft-wrap long lines

Author:: Ubercore
Posted:: April 3, 2008
Language:: Python
Version:: .96
Score:: 0 (after 0 ratings)

This filter naively parses HTML content and inserts <wbr/> tags into any run of unbroken characters that reaches max_line_length. It leaves content inside tags alone, so attribute values such as URLs are unaltered. XHTML entities are treated as atomic: each one counts as a single character and is never split. Whitespace is detected with a regex.

It assumes well-formed HTML: every < is closed by a >, and every & outside a tag starts an entity that ends with ;.

import re

# Compiled once at import time; the raw string avoids an invalid-escape warning.
whitespace_re = re.compile(r'\s')

def softwraphtml(value, max_line_length=20):
    """Insert <wbr/> tags into long unbroken runs of text outside HTML tags."""
    new_value = []
    unbroken_chars = 0
    in_tag = False
    in_xhtml_entity = False
    for char in value:
        if char == '<':
            in_tag = True
        elif char == '>':
            # The end of a tag starts a fresh run of text.
            in_tag = False
            unbroken_chars = 0
        elif char == '&' and not in_tag:
            in_xhtml_entity = True
        elif char == ';' and in_xhtml_entity:
            in_xhtml_entity = False
        elif whitespace_re.match(char):
            unbroken_chars = 0

        new_value.append(char)
        # Characters inside an entity are not counted; the closing ';'
        # is counted, so the whole entity weighs one character.
        if not in_xhtml_entity:
            if unbroken_chars >= max_line_length - 1 and not in_tag:
                new_value.append("<wbr/>")
                unbroken_chars = 0
            else:
                unbroken_chars += 1
    return ''.join(new_value)
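
For comparison, the same idea can be sketched with a regex tokenizer instead of per-character state flags. This is not the snippet's own method, only an illustration: the names TOKEN_RE and softwrap_tokens are hypothetical, and the entity pattern is an assumption about what counts as an entity. Each token is a whole tag, a whole entity, or a single character, so tags and entities cannot be split by construction.

```python
import re

# Assumed token grammar: a whole tag, a whole entity, or any single character.
TOKEN_RE = re.compile(r'<[^>]*>|&#?\w+;|.', re.DOTALL)

def softwrap_tokens(value, max_line_length=20):
    """Hypothetical token-based variant of the soft-wrap filter."""
    out = []
    unbroken = 0
    for token in TOKEN_RE.findall(value):
        out.append(token)
        if len(token) > 1 and token.startswith('<'):
            # A complete tag: copied verbatim, and it ends the current run.
            unbroken = 0
        elif token.isspace():
            unbroken = 0
        else:
            # A plain character or an entity; either counts as one character.
            unbroken += 1
            if unbroken >= max_line_length:
                out.append('<wbr/>')
                unbroken = 0
    return ''.join(out)

print(softwrap_tokens('aaaaa', 3))
# aaa<wbr/>aa
print(softwrap_tokens('<a href="http://example.com/long/path">ab&amp;cd</a>', 3))
# <a href="http://example.com/long/path">ab&amp;<wbr/>cd</a>
```

Like the original, this sketch assumes well-formed input; a stray < with no closing > is simply counted as an ordinary character.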

Comments

svetlyak (on April 4, 2008):

What about <wbr/> being inserted inside XHTML entities? This may be a big issue when trying to output valid XHTML.

For example, any user can place a long URL containing & in the comments.

Ubercore (on April 6, 2008):

That's a very good point. As I said, it's very basic still, but you're right -- it should treat xhtml entities atomically. Thanks for pointing that out.
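
One way to guard against the problem raised above is a small check that compares a filter's output with its input. The sketch below is an illustration only: ENTITY_RE and wrap_preserves_entities are hypothetical names, the entity pattern is an assumption, and the check assumes the input itself contains no <wbr/> tags.

```python
import re

# Assumed entity shape: named (&amp;) or numeric (&#38;, &#x26;).
ENTITY_RE = re.compile(r'&#?\w+;')

def wrap_preserves_entities(original, wrapped, marker='<wbr/>'):
    """Return True if wrapped only adds markers and leaves every entity whole."""
    # Removing the markers must give back the input exactly.
    if wrapped.replace(marker, '') != original:
        return False
    # An entity split by a marker no longer matches the pattern.
    return ENTITY_RE.findall(wrapped) == ENTITY_RE.findall(original)

print(wrap_preserves_entities('a&amp;b', 'a&amp;<wbr/>b'))
# True
print(wrap_preserves_entities('a&amp;b', 'a&am<wbr/>p;b'))
# False
```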
