nofollow filter for external links

Author: muhuk
Posted: April 14, 2009
Language: Python
Version: 1.0
Tags: filter nofollow
Score: 1 (after 1 rating)

Based on svetlyak's nofollow filter. This one processes only external URLs. Links with internal URLs are returned unmodified.

There's one gotcha: I preferred false positives to false negatives, so the filter will also nofollow a bare relative path such as href="some/relative/path/". If you must use relative paths, start them with ./ like this:

<a href="./some/relative/path">link text</a>
import re
from django import template
from django.utils.safestring import mark_safe


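# Match an opening <a tag unless it already carries rel="nofollow" or its
# href starts with /, ./ or ../ (but not //, which is scheme-relative).
NOFOLLOW_RE = re.compile(r'<a (?![^>]*rel=["\']nofollow[\'"])'
                         r'(?![^>]*href=["\']\.{0,2}/[^/])',
                         re.UNICODE | re.IGNORECASE)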


register = template.Library()


@register.filter
def nofollow(content):
    # Add rel="nofollow" to every matching <a tag and mark the result safe.
    return mark_safe(NOFOLLOW_RE.sub(u'<a rel="nofollow" ', content))
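
To see which links the pattern above rewrites, here is a minimal sketch that uses only the re module, so it runs without Django. The helper name add_nofollow and the sample URLs are made up for illustration; the pattern is the one from the snippet, written with raw strings.

```python
import re

# Same pattern as in the snippet, written with raw strings.
NOFOLLOW_RE = re.compile(
    r'<a (?![^>]*rel=["\']nofollow[\'"])'
    r'(?![^>]*href=["\']\.{0,2}/[^/])',
    re.UNICODE | re.IGNORECASE,
)


def add_nofollow(html):
    """Plain-string version of the filter (hypothetical helper, no mark_safe)."""
    return NOFOLLOW_RE.sub('<a rel="nofollow" ', html)


# Illustrative inputs only; none of these URLs come from the snippet.
samples = [
    '<a href="http://example.com/">absolute</a>',       # rewritten
    '<a href="//example.com/">scheme-relative</a>',     # rewritten
    '<a href="some/relative/path/">bare relative</a>',  # rewritten (the gotcha)
    '<a href="/about/">root-relative</a>',              # left alone
    '<a href="./some/relative/path">dot-relative</a>',  # left alone
    '<a href="../up/">parent-relative</a>',             # left alone
    '<a rel="nofollow" href="http://example.com/">already tagged</a>',  # left alone
]

for sample in samples:
    print(add_nofollow(sample))
```

Only the first three samples gain rel="nofollow"; the bare relative path is the false positive described in the gotcha.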

Comments

masklinn (on April 14, 2009):

"This one processes links with only URL's that start with http"

FWIW, that's completely broken: the scheme name (http:) is optional. It's perfectly legal for a URL to start with //, which means "go to this (potentially external) URL using the current scheme".

Example: this goes to Wikipedia, but there's no http in the URL (you might have to check the source to see it, though).
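
The point about scheme-relative URLs can be checked with the standard library. This is a rough sketch for Python 3 (urllib.parse); PAGE_URL and is_external are hypothetical names used only for illustration and are not part of the snippet.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical page that contains the link; the link inherits its scheme.
PAGE_URL = "https://example.com/blog/post/"


def is_external(href, page_url=PAGE_URL):
    """Return True if href resolves to a different host than page_url."""
    target = urlsplit(urljoin(page_url, href))
    return target.netloc != urlsplit(page_url).netloc


link = "//en.wikipedia.org/wiki/URL"
parts = urlsplit(link)
print(repr(parts.scheme))        # no scheme in the link itself
print(parts.netloc)              # but it does name a host
print(urljoin(PAGE_URL, link))   # scheme is taken from the current page
print(is_external(link))         # different host, so external
print(is_external("/about/"))    # same host, so internal
```

A check that only looks for a leading http would treat the first link as internal even though it points at another host.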

muhuk (on May 11, 2009):

@masklinn: Thanks for the input. I have updated the regex; check the description. I'd like to know what you think about this variation.
