Soft-wrap long lines

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
def softwraphtml(value, max_line_length=20):
    import re
    whitespace_re = re.compile('\s')
    new_value = []
    unbroken_chars = 0
    in_tag = False
    in_xhtml_entity = False
    for idx, char in enumerate(value):
        if char == '<':
            in_tag = True
        elif char == '>':
            in_tag = False
            unbroken_chars = 0
        elif char == '&' and not in_tag:
            in_xhtml_entity = True
        elif char == ';' and in_xhtml_entity:
            in_xhtml_entity = False            
        elif whitespace_re.match(char):
            unbroken_chars = 0
        
        new_value.append(char)
        if not in_xhtml_entity:
            if unbroken_chars >= max_line_length-1 and not in_tag:
                new_value.append("<wbr/>")
                unbroken_chars = 0
            else:
                unbroken_chars += 1
    return ''.join(new_value)

More like this

  1. Auto HTML Linebreak filter by punteney 5 years, 1 month ago
  2. Updated version of StripWhitespaceMiddleware (v1.1) by sleepycal 1 year, 11 months ago
  3. Django filter stack to cleanup WYSIWYG output by jbergantine 1 year, 8 months ago
  4. make an unordered html list by techiegurl 4 years, 12 months ago
  5. HTML to text filter by MasonM 4 years, 9 months ago

Comments

svetlyak (on April 4, 2008):

How about inserting of the [HTML_REMOVED] in the xhtml entities? This may be a big issue if trying to output a correct xhtml code.

For example, any user can place a long URL containing [HTML_REMOVED] in the comments.

#

Ubercore (on April 6, 2008):

That's a very good point. As I said, it's very basic still, but you're right -- it should treat xhtml entities atomically. Thanks for pointing that out.

#

(Forgotten your password?)