I suppose I'm kind of stubborn, but I prefer to use underscores to replace spaces and other characters. Of course, that shouldn't hold you back from using the built-in slugify filter :)
Forcing the slug to use ASCII equivalents:
Transforming titles like "Äës" to slugs like "aes" was kind of a trial and error job. It now works for me, and I hope _string_to_slug(s) proves a rather stable solution. The worst-case scenario is that such characters are lost, which I guess is acceptable.
Other ways of dealing with this problem can be found at Latin1 to ASCII at Activestate or in the comments below.
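To make the ASCII folding concrete, here is a minimal sketch of the same idea in Python 3 terms. It is not the snippet's own code: the name string_to_slug and the separator parameter are assumptions for illustration, and it expects text (str) rather than a UTF-8 bytestring.

```python
import re
import unicodedata


def string_to_slug(text, separator="_"):
    """Fold text to ASCII, lowercase it, and join word runs with `separator`."""
    # NFKD splits "Ä" into "A" plus a combining diaeresis; encoding to ASCII
    # with errors="ignore" then drops the combining mark, along with any
    # character that has no ASCII decomposition at all.
    folded = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Collapse every run of characters outside a-z, 0-9 and "-" into one separator.
    slug = re.sub(r"[^a-z0-9-]+", separator, folded.lower())
    return slug.strip(separator)
```

With this sketch, string_to_slug("Äës") gives "aes", while a title made up only of characters without an ASCII decomposition ends up as an empty string, which is the "characters are lost" worst case mentioned above.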
How to use:
The slug fields in your model must have prepopulate_from set; the fields specified in it are used to build the slug.
To prevent duplicates, a number is added to the slug if the slug already exists for the current field in another, previous, object. I guess there should be a cleaner way to distinguish between creating a new db entry and updating an existing one; sadly, the db back-end is kind of a black box to me. At least this works ;)
I chose not to alter the slug on an update to keep urls more bookmarkable. You could even extend this further by only updating the slug field if it hasn't been assigned a value.
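The numbering rule can be looked at without a database. The sketch below is a simplified stand-in, not the snippet's function: next_free_slug is a hypothetical name, and existing is assumed to be the list of slugs that a single "starts with" query would return.

```python
import re


def next_free_slug(proposal, existing, separator="-"):
    """Return `proposal` itself, or `proposal` plus the next unused number."""
    if proposal not in existing:
        return proposal
    # Escape both parts so characters like "." or "+" are matched literally.
    pattern = re.compile(r"^%s%s(\d+)$" % (re.escape(proposal), re.escape(separator)))
    numbers = [int(m.group(1)) for m in map(pattern.match, existing) if m]
    # The unnumbered slug counts as number 1, so numbering starts at 2.
    return "%s%s%d" % (proposal, separator, max(numbers, default=1) + 1)
```

Because the candidates are collected once and the highest suffix is taken from that list, the cost is one lookup no matter how many numbered slugs already exist.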
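The "only if it hasn't been assigned a value" variant could look roughly like this. It is a sketch under assumptions: assign_slug_once and the Article class are hypothetical stand-ins, and a plain object is used instead of a real model instance.

```python
def assign_slug_once(instance, field_name, new_slug):
    """Set the slug only when the field is still empty; return the slug in effect."""
    current = getattr(instance, field_name, None)
    if current:
        # A slug was already assigned: keep it so existing urls stay valid.
        return current
    setattr(instance, field_name, new_slug)
    return new_slug


class Article(object):  # hypothetical stand-in for a model instance
    def __init__(self, slug=""):
        self.slug = slug
```

Calling assign_slug_once(article, "slug", ...) a second time with a different value leaves the first slug in place.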
import re
from django.db import models


class SlugNotCorrectlyPrePopulated(Exception):
    pass


def _string_to_slug(s):
    raw_data = s
    # normalize string as proposed on http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/251871
    # by Aaron Bentley, 2006/01/02
    try:
        import unicodedata
        raw_data = unicodedata.normalize('NFKD', raw_data.decode('utf-8', 'replace')).encode('ascii', 'ignore')
    except Exception:
        # worst case: keep the raw string; characters outside a-z, 0-9 and '-'
        # are replaced by underscores below
        pass
    return re.sub(r'[^a-z0-9-]+', '_', raw_data.lower()).strip('_')


# as proposed by Archatas (http://www.djangosnippets.org/users/Archatas/)
def _get_unique_value(model, proposal, field_name="slug", instance_pk=None, separator="-"):
    """ Returns unique string by the proposed one.
    Optionally takes:
    * field name which can be 'slug', 'username', 'invoice_number', etc.
    * the primary key of the instance to which the string will be assigned.
    * separator which can be '-', '_', ' ', '', etc.
    By default, for proposal 'example' returns strings from the sequence:
    'example', 'example-2', 'example-3', 'example-4', ...
    """
    if instance_pk:
        similar_ones = model.objects.filter(**{field_name + "__startswith": proposal}).exclude(pk=instance_pk).values(field_name)
    else:
        similar_ones = model.objects.filter(**{field_name + "__startswith": proposal}).values(field_name)
    similar_ones = [elem[field_name] for elem in similar_ones]
    if proposal not in similar_ones:
        return proposal
    else:
        numbers = []
        for value in similar_ones:
            # escape both parts so regex metacharacters are matched literally
            match = re.match(r'^%s%s(\d+)$' % (re.escape(proposal), re.escape(separator)), value)
            if match:
                numbers.append(int(match.group(1)))
        if not numbers:
            return "%s%s2" % (proposal, separator)
        else:
            largest = max(numbers)
            return "%s%s%d" % (proposal, separator, largest + 1)


def _get_fields_and_data(model):
    # note: despite the name, `model` is the instance that is about to be saved
    opts = model._meta
    slug_fields = []
    for f in opts.fields:
        if isinstance(f, models.SlugField):
            if not f.prepopulate_from:
                raise SlugNotCorrectlyPrePopulated("Slug for %s is not prepopulated" % f.name)
            prepop = []
            for n in f.prepopulate_from:
                if not hasattr(model, n):
                    raise SlugNotCorrectlyPrePopulated("Slug for %s is to be prepopulated from %s, yet %s.%s does not exist" % (f.name, n, type(model), n))
                else:
                    prepop.append(getattr(model, n))
            slug_fields.append([f, "_".join(prepop)])
    return slug_fields


def slugify(sender, instance, signal, *args, **kwargs):
    for slugs in _get_fields_and_data(instance):
        slug = _string_to_slug(slugs[1])
        try:
            # See if object is new
            # To prevent altering urls, don't update slug on existing objects
            sender.objects.get(pk=instance._get_pk_val())
        except Exception:
            # no existing row was found, so treat the object as new
            slug = _get_unique_value(instance.__class__, slug, slugs[0].name, separator="_")
            setattr(instance, slugs[0].name, slug)


# ===========================
# To attach it to your model:
# ===========================
#
# dispatcher.connect(_package_.slugify, signal=signals.pre_save, sender=_your_model_)
Comments
The problem with your code is that it accesses the database as many times as there are existing slugs with the same beginning. For example, if there are 100 objects created by different users and called "test", "test_1", ... "test_99", then 101 DB queries will be performed to get the next free slug. Another not-nice-to-have feature is that numbering starts from 0 (it is not humanized).
I suggest you integrate a call to the following more generic function for getting the unique value for the slug:
I could create a new snippet for that, but I think it's more useful to have it here in one place.
Example usage:
#
Thanks, you're absolutely right. I never was overly concerned with performance as I required my titles to be unique anyway, but of course, the less db access the better ;)
As for starting the numbering at two, thus counting the first unnumbered slug: that's something I never even considered, I just avoided starting the numbering at zero. Again, as there is already an unnumbered no. 1, this might be the preferable way to deal with it ;) A shame I didn't think of that earlier, now I already have about 1200 potential articles inhumanly slugified (though I doubt more than 5 slugs actually overlap, thus no real problem) :)
#
Also, on the subject of underscores versus hyphens: Google will treat a hyphen in a URL as a word separator, but not an underscore, which means that hyphenated slugs tend to yield better search placement.
#
I can't get this to work with newforms; with oldforms it works just fine. Apparently something in the newforms dispatching makes all my slugs contain unicode, and the slugifier above does not work in duplicate cases and just throws IntegrityError...
e.g. if I print (for debugging demonstration) the slugs[1] variable in the slugify function defined in the snippet above, I get u'Jyv\xe4skyl\xe4'
but if I try to edit this thing from the admin (that is still oldforms) I get 'Jyv\xc3\xa4skyl\xc3\xa4'
and it works. To reproduce this error, try to save something with form.save() and a duplicate key.
#
Here is the fix:
This fix is necessary to make it work with old- and newforms at the same time (like admin)... (it was not working with newforms.)
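The fix itself is not reproduced on this page, so the following is not the commenter's code. It is only a minimal sketch, in Python 3 terms, of one way to accept both kinds of input described above (a UTF-8 bytestring or already-decoded text); to_text and fold_to_ascii are hypothetical names.

```python
import unicodedata


def to_text(value, encoding="utf-8"):
    """Return `value` as text, decoding it first if it arrived as bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding, "replace")
    return value


def fold_to_ascii(value):
    """Decompose accented characters and drop whatever is not ASCII."""
    text = unicodedata.normalize("NFKD", to_text(value))
    return text.encode("ascii", "ignore").decode("ascii")
```

Both b'Jyv\xc3\xa4skyl\xc3\xa4' and 'Jyv\xe4skyl\xe4' fold to the same "Jyvaskyla" here, so the duplicate check sees one value regardless of which form delivered it.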
#