nofollow filter for external links

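import re
from django import template
from django.utils.safestring import mark_safe


# Match an opening <a tag unless it already carries rel="nofollow" or its
# href is a local path (/x, ./x, ../x). Scheme-relative URLs (//host/...)
# still match, so they are treated as external.
NOFOLLOW_RE = re.compile(r'<a (?![^>]*rel=["\']nofollow[\'"])'
                         r'(?![^>]*href=["\']\.{0,2}/[^/])',
                         re.UNICODE | re.IGNORECASE)


register = template.Library()


@register.filter
def nofollow(content):
    # Insert rel="nofollow" into every matching link and mark the result safe.
    return mark_safe(NOFOLLOW_RE.sub('<a rel="nofollow" ', content))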
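
To see which links the pattern actually touches, here is a minimal standalone sketch. It copies the regex out of the filter so it runs without Django; the helper name `add_nofollow` and the sample HTML strings are made up for illustration and are not part of the snippet.

```python
import re

# Same pattern as the filter above, written with raw strings.
NOFOLLOW_RE = re.compile(r'<a (?![^>]*rel=["\']nofollow[\'"])'
                         r'(?![^>]*href=["\']\.{0,2}/[^/])',
                         re.UNICODE | re.IGNORECASE)


def add_nofollow(html):
    """Hypothetical helper: the filter body without Django's mark_safe."""
    return NOFOLLOW_RE.sub('<a rel="nofollow" ', html)


# Illustrative inputs only.
samples = [
    '<a href="http://example.org/">absolute</a>',         # gets nofollow
    '<a href="//example.org/">scheme-relative</a>',       # gets nofollow
    '<a href="/about/">root-relative</a>',                # left alone
    '<a href="../up/">parent-relative</a>',               # left alone
    '<a rel="nofollow" href="http://example.org/">x</a>', # left alone
]

for html in samples:
    print(add_nofollow(html))
```

In a template, the filter itself would be applied along the lines of `{{ entry.body|nofollow }}` after loading the tag library it lives in; `entry.body` is a placeholder variable name.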


Comments

masklinn (on April 14, 2009):

"This one processes links with only URL's that start with http"

FWIW, that's completely broken: the scheme name (http:) is optional. It's perfectly legal for a URL to start with //, which means "go to this (potentially external) URL using the current scheme".

Example: this goes to Wikipedia, but there's no http in the URL (you might have to check the source to see it, though).
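
The point about scheme-relative URLs can be checked with the standard library. This is a rough sketch, assuming a made-up page address (`base`) and a made-up helper name (`is_external`); it resolves each href against the page and compares hosts rather than looking for "http".

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical address of the page that contains the link.
base = "https://example.com/blog/post/"

# A scheme-relative href: no "http:", yet it points at another host.
href = "//en.wikipedia.org/wiki/URL"

# The browser reuses the current page's scheme when resolving it.
resolved = urljoin(base, href)
print(resolved)  # https://en.wikipedia.org/wiki/URL


def is_external(href, base):
    """Hypothetical check: True if href resolves to a different host than base."""
    return urlsplit(urljoin(base, href)).netloc != urlsplit(base).netloc


print(is_external(href, base))        # True
print(is_external("/about/", base))   # False
```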


muhuk (on May 11, 2009):

@masklinn: Thanks for the input. I have updated the regex, check the description. I'd like to know what you think about this variation.

