convert plain text to html

Author:
limodou
Posted:
February 25, 2007
Language:
Python
Version:
Pre .96
Tags:
text
Score:
2 (after 2 ratings)

Convert plain text to HTML. For example:

text = """aabhttp://www.example.com http://www.example.com

http://www.example.com <<< [<aaaa></aaaa>] """
print(plaintext2html(text))

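It converts http and ftp URLs in the text into HTML links (a URL is recognized when it starts a line or follows whitespace), and it converts the leading spaces of each line to &nbsp;. So if you paste Python code, the indentation is not lost. By default, each leading tab (\t) is converted to 4 spaces; pass tabstop to change this.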
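The indentation handling can be illustrated on its own. The following is a minimal sketch, not the snippet's code: the helper name `preserve_indent` is hypothetical, and it covers only the leading-whitespace step (no escaping, links, or line breaks).

```python
import re

def preserve_indent(text, tabstop=4):
    """Replace leading spaces and tabs on each line with &nbsp; entities.

    Hypothetical helper for illustration; it is not part of the snippet.
    """
    def repl(m):
        # Each tab becomes `tabstop` entities, each space becomes one entity.
        return m.group().replace('\t', '&nbsp;' * tabstop).replace(' ', '&nbsp;')
    # re.M makes ^ match at the start of every line, not only the first.
    return re.sub(r'^[ \t]+', repl, text, flags=re.M)

print(preserve_indent("def f():\n\treturn 1"))
```

Spaces inside a line are left alone, so only the indentation is affected.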

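import re

try:
    from html import escape  # Python 3
except ImportError:
    from cgi import escape  # Python 2

# Alternatives are tried in order at each position: HTML special characters,
# leading whitespace, line endings, then http/ftp URLs that follow whitespace
# or start a line. The trailing group lists \r\n first so that a Windows line
# ending after a URL yields one <br>, not two.
re_string = re.compile(r'(?P<htmlchars>[<&>])|(?P<space>^[ \t]+)|(?P<lineend>\r\n|\r|\n)|(?P<protocal>(^|\s)((http|ftp)://.*?))(\r\n|\s|$)', re.S|re.M|re.I)

def plaintext2html(text, tabstop=4):
    def do_sub(m):
        c = m.groupdict()
        if c['htmlchars']:
            return escape(c['htmlchars'])
        if c['lineend']:
            return '<br>'
        elif c['space']:
            # Leading whitespace: each tab becomes `tabstop` non-breaking spaces.
            t = m.group().replace('\t', '&nbsp;'*tabstop)
            t = t.replace(' ', '&nbsp;')
            return t
        else:
            # URL match: split off the whitespace (if any) captured before it.
            url = m.group('protocal')
            prefix = url[:len(url) - len(url.lstrip())]
            url = escape(url.lstrip(), True)  # also escape quotes for the href
            last = m.groups()[-1]
            if last in ['\n', '\r', '\r\n']:
                last = '<br>'
            return '%s<a href="%s">%s</a>%s' % (prefix, url, url, last)
    return re.sub(re_string, do_sub, text)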
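
For comparison, the URL-linking step can be sketched separately. This is an alternative illustration under stated assumptions, not the snippet's method: the names `linkify` and `URL_RE` are hypothetical, it targets Python 3 (`html.escape`), and it handles only escaping and links, not indentation or line breaks.

```python
import re
from html import escape

# An http/ftp URL that starts the text or follows whitespace, running to the
# next whitespace character. URL_RE is a hypothetical name for this sketch.
URL_RE = re.compile(r'(?<!\S)(?:http|ftp)://\S+', re.I)

def linkify(text):
    """Escape HTML special characters and wrap URLs in <a> tags."""
    parts = []
    pos = 0
    for m in URL_RE.finditer(text):
        # Escape the plain text between the previous URL and this one.
        parts.append(escape(text[pos:m.start()], quote=False))
        # Escape the URL too, including quotes, since it goes into an attribute.
        url = escape(m.group(), quote=True)
        parts.append('<a href="%s">%s</a>' % (url, url))
        pos = m.end()
    parts.append(escape(text[pos:], quote=False))
    return ''.join(parts)

print(linkify("see http://www.example.com now"))
```

As in the snippet, a URL glued to preceding text (such as aabhttp://www.example.com) is not turned into a link.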

Comments

Hugotox (on July 20, 2012):

you have a semicolon at the end of line 17. Seems like a syntax error
