Automatically slugify slug fields in your models

import re
from django.db import models

class SlugNotCorrectlyPrePopulated(Exception): 
    pass 

def _string_to_slug(s):
    raw_data = s
    # normalize string as proposed on http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/251871
    # by Aaron Bentley, 2006/01/02
    try:
        import unicodedata
        raw_data = unicodedata.normalize('NFKD', raw_data.decode('utf-8', 'replace')).encode('ascii', 'ignore')
    except Exception:
        # Fall back to the raw string if normalization is unavailable or fails
        pass
    return re.sub(r'[^a-z0-9-]+', '_', raw_data.lower()).strip('_')
    
# as proposed by Archatas (http://www.djangosnippets.org/users/Archatas/)
def _get_unique_value(model, proposal, field_name="slug", instance_pk=None, separator="-"):
    """ Returns unique string by the proposed one.
    Optionally takes:
    * field name which can  be 'slug', 'username', 'invoice_number', etc.
    * the primary key of the instance to which the string will be assigned.
    * separator which can be '-', '_', ' ', '', etc.
    By default, for proposal 'example' returns strings from the sequence:
        'example', 'example-2', 'example-3', 'example-4', ...
    """
    if instance_pk:
        similar_ones = model.objects.filter(**{field_name + "__startswith": proposal}).exclude(pk=instance_pk).values(field_name)
    else:
        similar_ones = model.objects.filter(**{field_name + "__startswith": proposal}).values(field_name)
    similar_ones = [elem[field_name] for elem in similar_ones]
    if proposal not in similar_ones:
        return proposal
    else:
        numbers = []
        for value in similar_ones:
            # Escape both parts so regex metacharacters in them are matched literally
            match = re.match(r'^%s%s(\d+)$' % (re.escape(proposal), re.escape(separator)), value)
            if match:
                numbers.append(int(match.group(1)))
        if len(numbers)==0:
            return "%s%s2" % (proposal, separator)
        else:
            largest = sorted(numbers)[-1]
            return "%s%s%d" % (proposal, separator, largest + 1)

def _get_fields_and_data(model):
    opts = model._meta
    slug_fields = []
    for f in opts.fields:
        if isinstance(f, models.SlugField):
            if not f.prepopulate_from:
                raise SlugNotCorrectlyPrePopulated("Slug for %s is not prepopulated" % f.name)
            prepop = []
            for n in f.prepopulate_from:
                if not hasattr(model, n):
                    raise SlugNotCorrectlyPrePopulated("Slug for %s is to be prepopulated from %s, yet %s.%s does not exist" % (f.name, n, type(model), n))
                else:
                    prepop.append(getattr(model, n))
            slug_fields.append([f , "_".join(prepop)])
    return slug_fields
    
def slugify(sender, instance, signal, *args, **kwargs):
    for field, source_text in _get_fields_and_data(instance):
        slug = _string_to_slug(source_text)
        try:
            # See if object is new
            # To prevent altering urls, don't update slug on existing objects
            sender.objects.get(pk=instance._get_pk_val())
        except Exception:
            # Lookup failed, so treat the object as new and assign a unique slug
            slug = _get_unique_value(instance.__class__, slug, field.name, separator="_")
            setattr(instance, field.name, slug)


# ===========================
# To attach it to your model:
# ===========================
#
# dispatcher.connect(_package_.slugify, signal=signals.pre_save, sender=_your_model_)

Comments

Archatas (on March 11, 2007):

The problem with your code is that it queries the database once for every existing slug with the same beginning. For example, if there are 100 objects created by different users and called "test", "test_1", ... "test_99", then 101 DB queries will be performed to find the next free slug. Another undesirable feature is that numbering starts from 0 (it is not humanized).

I suggest you integrate a call to the following more generic function for getting a unique value for the slug:

import re

def get_unique_value(model, proposal, field_name="slug", instance_pk=None, separator="-"):
    """ Returns unique string by the proposed one.
    Optionally takes:
    * field name which can  be 'slug', 'username', 'invoice_number', etc.
    * the primary key of the instance to which the string will be assigned.
    * separator which can be '-', '_', ' ', '', etc.
    By default, for proposal 'example' returns strings from the sequence:
        'example', 'example-2', 'example-3', 'example-4', ...
    """
    if instance_pk:
        similar_ones = model.objects.filter(**{field_name + "__startswith": proposal}).exclude(pk=instance_pk).values(field_name)
    else:
        similar_ones = model.objects.filter(**{field_name + "__startswith": proposal}).values(field_name)
    similar_ones = [elem[field_name] for elem in similar_ones]
    if proposal not in similar_ones:
        return proposal
    else:
        numbers = []
        for value in similar_ones:
            match = re.match(r'^%s%s(\d+)$' % (re.escape(proposal), re.escape(separator)), value)
            if match:
                numbers.append(int(match.group(1)))
        if len(numbers)==0:
            return "%s%s2" % (proposal, separator)
        else:
            largest = sorted(numbers)[-1]
            return "%s%s%d" % (proposal, separator, largest + 1)

I could create a new snippet for that, but I think it's more useful to have it here in one place.

Example usage:

from django.contrib.auth.models import User
from myapp.models import Page
unique_username = get_unique_value(User, "john_smith", field_name="username", separator="_")
unique_slug = get_unique_value(Page, "about-me")
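The numbering rule can also be tried without a database. The following is only a sketch, written for Python 3: `next_unique_value` is a hypothetical helper, not part of the snippet, and it takes a plain list of existing values where the real function runs a single `__startswith` query.

```python
import re

def next_unique_value(existing, proposal, separator="-"):
    """Hypothetical stand-in for the numbering logic; `existing` replaces the DB query."""
    if proposal not in existing:
        return proposal
    # re.escape keeps regex metacharacters in the proposal or separator literal
    pattern = re.compile(r"^%s%s(\d+)$" % (re.escape(proposal), re.escape(separator)))
    numbers = [int(m.group(1)) for m in map(pattern.match, existing) if m]
    # The unnumbered value counts as number 1, so the first duplicate gets 2
    next_number = max(numbers) + 1 if numbers else 2
    return "%s%s%d" % (proposal, separator, next_number)

print(next_unique_value(["example", "example-2", "example-7"], "example"))
```

Values that merely share the prefix, such as "example-page", are ignored because the pattern only accepts a trailing number.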

Aliquip (on March 11, 2007):

Thanks, you're absolutely right. I was never overly concerned with performance as I required my titles to be unique anyway, but of course, the less DB access the better ;)

As for starting the numbering at two, thus counting the first unnumbered slug as number one: that's something I never even considered; I just avoided starting the numbering at zero. Since there is already an unnumbered number 1, this might be the preferable way to deal with it ;) A shame I didn't think of that earlier; I already have about 1200 potential articles inhumanly slugified (though I doubt more than 5 slugs actually overlap, so no real problem) :)

#

ubernostrum (on March 13, 2007):

Also, on the subject of underscores versus hyphens: Google will treat a hyphen in a URL as a word separator, but not an underscore, which means that hyphenated slugs tend to yield better search placement.
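For anyone who wants hyphens instead, here is a minimal sketch of a hyphen-separated variant of the slug function. It assumes Python 3 (where text is already `str`), and `string_to_hyphen_slug` is a hypothetical name, not something defined in the snippet.

```python
import re
import unicodedata

def string_to_hyphen_slug(s):
    """Hypothetical hyphenated variant of _string_to_slug, for Python 3 text."""
    # Decompose accented characters, then drop whatever is not plain ASCII
    ascii_text = unicodedata.normalize("NFKD", s).encode("ascii", "ignore").decode("ascii")
    # Collapse every run of other characters (underscores included) into one hyphen
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")

print(string_to_hyphen_slug("Hello, World!"))
```

To keep the numbered duplicates consistent, the unique-value function would then be called with separator="-" as well.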

Ciantic (on April 30, 2007):

I can't get this to work with newforms; with oldforms it works just fine. Apparently something in newforms (dispatching, perhaps) makes all my slugs contain unicode data, and the slugifier above then fails on duplicates and just throws an IntegrityError...

E.g. if I print (for debugging) the slugs[1] variable in the slugify function defined in the snippet above, I get u'Jyv\xe4skyl\xe4', but if I edit the same object from the admin (which still uses oldforms) I get 'Jyv\xc3\xa4skyl\xc3\xa4' and it works.

To reproduce the error, try to save something with form.save() using a duplicate key.

Ciantic (on April 30, 2007):

Here is the fix:

    if type(raw_data) == type(u''):
        raw_data = unicodedata.normalize('NFKD', raw_data).encode('ascii', 'ignore')
    else:
        # Just for oldforms:
        raw_data = unicodedata.normalize('NFKD', raw_data.decode('utf-8', 'replace')).encode('ascii', 'ignore')

This fix is necessary to make it work with oldforms and newforms at the same time (as with the admin); it was not working with newforms.
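The same two-branch idea can be sketched for Python 3, where the unicode/byte-string split becomes `str` versus `bytes`. This is an assumption-laden illustration rather than part of the snippet: `to_ascii` is a hypothetical helper, and the two inputs are the values printed in the comment above.

```python
import unicodedata

def to_ascii(raw_data):
    """Hypothetical helper: accept text or UTF-8 bytes, return ASCII-only text."""
    if isinstance(raw_data, bytes):
        # Byte strings (the oldforms case) are decoded first
        raw_data = raw_data.decode("utf-8", "replace")
    # Text (the newforms case) is normalized directly
    return unicodedata.normalize("NFKD", raw_data).encode("ascii", "ignore").decode("ascii")

print(to_ascii("Jyv\xe4skyl\xe4"))
print(to_ascii(b"Jyv\xc3\xa4skyl\xc3\xa4"))
```

Both inputs end up as the same ASCII string, which is what lets the duplicate check compare like with like.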
