Sometimes it's nice to know if your visitors came from a search engine such as Google or Yahoo. I use this as a component in determining the popularity of blog articles for search keywords, so I know which articles to automatically feature or suggest. Maybe you'd like to use it to highlight the visitor's search terms on the page.
This isn't actually middleware in the sense that it defines any of the middleware interfaces. It's intended to be the base for a middleware method that you write, depending on what you want to happen when you detect that the visitor came from a search engine. In the example process_view
method, I detect when the request is going to use the object_detail
view and log the search query for that object in the database.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 | import urlparse
import cgi
import re
class SearchReferrerMiddleware(object):
SEARCH_PARAMS = {
'AltaVista': 'q',
'Ask': 'q',
'Google': 'q',
'Live': 'q',
'Lycos': 'query',
'MSN': 'q',
'Yahoo': 'p',
}
NETWORK_RE = r"""^
(?P<subdomain>[-.a-z\d]+\.)?
(?P<engine>%s)
(?P<top_level>(?:\.[a-z]{2,3}){1,2})
(?P<port>:\d+)?
$(?ix)"""
@classmethod
def parse_search(cls, url):
"""
Extract the search engine, domain, and search term from `url`
and return them as (engine, domain, term). For example,
('Google', 'www.google.co.uk', 'django framework'). Note that
the search term will be converted to lowercase and have normalized
spaces.
The first tuple item will be None if the referrer is not a
search engine.
"""
try:
parsed = urlparse.urlsplit(url)
network = parsed[1]
query = parsed[3]
except (AttributeError, IndexError):
return (None, None, None)
for engine, param in cls.SEARCH_PARAMS.iteritems():
match = re.match(cls.NETWORK_RE % engine, network)
if match and match.group(2):
term = cgi.parse_qs(query).get(param)
if term and term[0]:
term = ' '.join(term[0].split()).lower()
return (engine, network, term)
return (None, network, None)
# Here's where your code goes!
# It can be any middleware method that needs search engine detection
# functionality... this is just my example.
def process_view(self, request, view_func, view_args, view_kwargs):
from django.views.generic.date_based import object_detail
referrer = request.META.get('HTTP_REFERER')
engine, domain, term = self.parse_search(referrer)
if engine and view_func is object_detail:
# The client got to this object's page from a search engine.
# This might be useful for determining the object's popularity.
# Get the object using object_detail's queryset.
# Log this search using a custom Visit model or something.
|
More like this
- Template tag - list punctuation for a list of items by shapiromatron 10 months, 2 weeks ago
- JSONRequestMiddleware adds a .json() method to your HttpRequests by cdcarter 10 months, 2 weeks ago
- Serializer factory with Django Rest Framework by julio 1 year, 5 months ago
- Image compression before saving the new model / work with JPG, PNG by Schleidens 1 year, 6 months ago
- Help text hyperlinks by sa2812 1 year, 6 months ago
Comments
On line 42 replace: match = re.match(NETWORK_RE % engine, network)
with: match = re.match(cls.NETWORK_RE % engine, network)
#
@zenx: Fixed, thanks!
#
Thanks, this worked nicely.
#
We've been using this and after a very long and painful bug hunt have discovered there are couple of tweaks you might want to make.
You'll probably want to edit line 44 to make it "parse_qs(unicode(query))" and line 46 to make it "u' '.join" otherwise you may spend a while trying to work out why your template is throwing a DjangoUnicodeError!
#
How to use this code ? And I need to google referrer code.
#
Please login first before commenting.