nofollow filter for external links

Author: muhuk
Posted: April 14, 2009
Language: Python
Version: 1.0
Score: 1 (after 1 ratings)

Based on svetlyak's nofollow filter. This one processes only external URLs; links with internal URLs are returned unmodified.

There's one gotcha: I preferred false positives to false negatives, so it will nofollow href="some/relative/path/", for example. If you must use relative paths, write them like this:

<a href="./some/relative/path">link text</a>
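
A quick way to see the gotcha in practice is to run the pattern over a few href values. The sketch below copies the two lookaheads from the snippet's NOFOLLOW_RE into raw strings and leaves out the Django parts; the `add_nofollow` helper and the sample URLs are made up for illustration and are not part of the original snippet.

```python
import re

# Same two negative lookaheads as the snippet's NOFOLLOW_RE, as raw strings.
NOFOLLOW_RE = re.compile(
    r'<a (?![^>]*rel=["\']nofollow[\'"])'
    r'(?![^>]*href=["\']\.{0,2}/[^/])',
    re.IGNORECASE,
)


def add_nofollow(html):
    # Hypothetical helper: the filter's substitution without mark_safe.
    return NOFOLLOW_RE.sub('<a rel="nofollow" ', html)


samples = [
    '<a href="http://example.com/">absolute</a>',
    '<a href="//example.com/">scheme-relative</a>',
    '<a href="/about/">root-relative</a>',
    '<a href="./some/relative/path">dot-relative</a>',
    '<a href="../up/">parent-relative</a>',
    '<a href="some/relative/path/">bare relative</a>',
]

for html in samples:
    print(add_nofollow(html))
```

With these samples, the absolute, scheme-relative, and bare relative links gain rel="nofollow", while the three links whose href starts with "/", "./" or "../" come back unchanged.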
import re
from django import template
from django.utils.safestring import mark_safe


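# Match "<a " unless the tag already has rel="nofollow" or its href starts
# with "/", "./" or "../" followed by a non-slash character (an internal path).
# Raw strings keep "\." from being read as an invalid string escape.
NOFOLLOW_RE = re.compile(r'<a (?![^>]*rel=["\']nofollow[\'"])'
                         r'(?![^>]*href=["\']\.{0,2}/[^/])',
                         re.UNICODE|re.IGNORECASE)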


register = template.Library()


def nofollow(content):
    """Add rel="nofollow" to every <a> tag that NOFOLLOW_RE treats as external."""
    return mark_safe(NOFOLLOW_RE.sub(u'<a rel="nofollow" ', content))
register.filter(nofollow)
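
Because the first lookahead skips tags that already carry rel="nofollow", running the substitution twice should give the same result as running it once. The following is a minimal sketch of that property using only the `re` module; the sample HTML is invented, and the Django `mark_safe` wrapping is deliberately left out.

```python
import re

# Copy of the snippet's pattern, written with raw strings.
NOFOLLOW_RE = re.compile(
    r'<a (?![^>]*rel=["\']nofollow[\'"])'
    r'(?![^>]*href=["\']\.{0,2}/[^/])',
    re.IGNORECASE,
)

# Invented input: one external link and one internal link.
html = '<p><a href="http://example.com/">out</a> and <a href="/home/">in</a></p>'

once = NOFOLLOW_RE.sub('<a rel="nofollow" ', html)
twice = NOFOLLOW_RE.sub('<a rel="nofollow" ', once)

print(once)
print(once == twice)
```

Only the external link is rewritten on the first pass, and the second pass leaves the string as it is.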

Comments

masklinn (on April 14, 2009):

This one processes links with only URL's that start with http

fwiw that's completely broken: the scheme name (http:) is optional. It's perfectly legal for a URL to start with //, which means "go to this (potentially external) URL using the current scheme".

Example: this goes to wikipedia, but there's no http in the url (you might have to check the source to see it though)
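
The point about scheme-less URLs can be checked with Python's standard `urllib.parse` module. This is only an illustrative sketch: the `base` page URL and the Wikipedia path are assumptions chosen for the example, not the link from the comment.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical page the link appears on; any http(s) URL would do.
base = "https://example.com/blog/post/"
href = "//en.wikipedia.org/wiki/Django"

parts = urlsplit(href)
print(repr(parts.scheme))  # the reference itself carries no scheme
print(parts.netloc)        # but it does name a host

# Resolving against the page URL borrows the page's scheme.
resolved = urljoin(base, href)
print(resolved)
```

The reference has an empty scheme and the host en.wikipedia.org, and it resolves to https://en.wikipedia.org/wiki/Django, so a check that only looks for a leading "http" would miss it.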

muhuk (on May 11, 2009):

@masklinn: Thanks for the input. I have updated the regex, check the description. I'd like to know what you think about this variation.
