CategoriesField

Author: fongandrew
Posted: July 30, 2009
Language: Python
Version: 1.1
Score: -1 (after 1 ratings)

If you have a relatively small, fixed set of categories (fewer than 64), don't want to add a separate column for each one, and don't want to add a separate table and deal with joins, this field offers a simple solution.

You initialize the field with a list of categories. The categories can be any hashable Python objects. When saving a set of categories, the field converts the set to an integer bitmask: bit i is set when the category at index i of the list you passed in is selected. So if you pass in [A,B,C,D] as your categories list and set model.categories = [C,D], the integer stored in the database is 0b1100 (note that although the categories list is indexed left to right, the binary number grows right to left).
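The conversion described above can be sketched in plain Python, outside Django. This is only an illustration: the helper names encode and decode are hypothetical and do not appear in the snippet, which does the same work inside the field's methods.

```python
# Hypothetical stand-ins for the field's conversion logic (not part of the snippet).
CATEGORIES = ['A', 'B', 'C', 'D']

def encode(selected, categories=CATEGORIES):
    """Return an integer with bit i set when categories[i] is selected."""
    mask = 0
    for index, category in enumerate(categories):
        if category in selected:
            mask |= 1 << index
    return mask

def decode(mask, categories=CATEGORIES):
    """Return the set of categories whose bits are set in mask."""
    return set(c for i, c in enumerate(categories) if mask & (1 << i))

stored = encode(['C', 'D'])
print(bin(stored))             # 0b1100
print(sorted(decode(stored)))  # ['C', 'D']
```

C and D sit at indices 2 and 3, so bits 2 and 3 are set, giving 4 + 8 = 12.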

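This means that if you change the order of your category list (or remove an item from it) once you have data in the DB, the stored bits will point at different categories and you'll need to migrate the data over to the new format. Appending items to the end of the list is fine, however, because the existing indices don't change.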
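To see why ordering matters, here is a small sketch in plain Python. The decode and remap helpers are hypothetical names introduced for illustration; remap shows one way a stored value might be translated to a new ordering, not a migration procedure the snippet provides.

```python
def decode(mask, categories):
    """Return the set of categories whose bits are set in mask."""
    return set(c for i, c in enumerate(categories) if mask & (1 << i))

def remap(mask, old, new):
    """Re-encode a mask saved against `old` so that it is valid against `new`."""
    selected = decode(mask, old)
    return sum(1 << i for i, c in enumerate(new) if c in selected)

original = ['A', 'B', 'C', 'D']
stored = 0b1100                     # saved against `original`: C and D

appended = original + ['E']         # new item at the end
reordered = ['D', 'C', 'B', 'A']    # same items, different order

after_append = decode(stored, appended)    # still C and D
after_reorder = decode(stored, reordered)  # now wrongly A and B
migrated = remap(stored, original, reordered)

print(sorted(after_append))   # ['C', 'D']
print(sorted(after_reorder))  # ['A', 'B']
print(bin(migrated))          # 0b11
```

Appending leaves bits 2 and 3 pointing at C and D, while reordering silently changes what those bits mean until the stored values are re-encoded.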

If you need to filter on this field, you can use bitwise operators. Django doesn't currently (to my knowledge) support this natively, but most DBs have some bitwise operator support, and you can use the extra() method for now. The parameter has to be the category's bit mask (1 << index), not the bare list index. For example, this query selects CheeseShop models that have 'Gouda' in their cheeses field.

CheeseShop.objects.all().extra(where=['(cheeses | %s) = cheeses'], params=[1 << CHEESES.index('Gouda')])
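A bitwise filter of this kind compares the stored integer against a bit mask. The sketch below mirrors that test in plain Python so the arithmetic can be checked without a database. The helpers mask_for, has_all and has_any are hypothetical names, not part of the snippet or of Django.

```python
CHEESES = ['Red Leicester', 'Tilsit', 'Caerphilly', 'Bel Paese',
           'Illchester', 'Gouda', 'Venezuelan Beaver']

def mask_for(names, categories=CHEESES):
    """Build the bit mask for one or more category names."""
    mask = 0
    for name in names:
        mask |= 1 << categories.index(name)
    return mask

def has_all(stored, mask):
    """Python equivalent of the SQL test "(stored | mask) = stored"."""
    return (stored | mask) == stored

def has_any(stored, mask):
    """True when at least one bit of mask is set in stored."""
    return (stored & mask) != 0

gouda = mask_for(['Gouda'])               # 1 << 5 == 32
shop = mask_for(['Tilsit', 'Gouda'])      # 2 + 32 == 34

print(gouda)                              # 32
print(has_all(shop, gouda))               # True
print(has_all(mask_for(['Tilsit']), gouda))  # False
```

Passing the bare index 5 (binary 101) instead of the mask 32 would match shops stocking both 'Red Leicester' and 'Caerphilly', which is why the shift is needed.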

from django.db import models

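# One bit per category. The assert in __init__ allows at most
# MAX_CATEGORIES - 1 = 63 categories, which fits in a signed 64-bit integer.
# Caveat: "IntegerField" is typically a 32-bit column (e.g. on MySQL and
# PostgreSQL), so more than 31 categories need a wider column there.
MAX_CATEGORIES = 64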

class CategoriesField(models.Field):
    __metaclass__ = models.SubfieldBase
    
    def __init__(self, categories=None, *args, **kwds):
        self.categories = categories or []
        assert len(self.categories) < MAX_CATEGORIES, "Too many categories!"
        super(CategoriesField, self).__init__(*args, **kwds)
    
    def get_internal_type(self):
        return "IntegerField"
    
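    def to_python(self, value):
        # None, 0 and empty collections all mean "no categories selected"
        if not value:
            return set()
        if isinstance(value, (int, long)):
            # Decode the bitmask: bit i set means categories[i] is selected
            cats = set()
            index = 0
            while value:
                if value & 1:
                    cats.add(self.categories[index])
                index += 1
                value >>= 1
            return cats
        # Already a collection of categories (e.g. assigned in Python code)
        return value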
    
    def get_db_prep_value(self, value):
        if not value:
            return 0
        # An integer is assumed to be an already-encoded bitmask
        if isinstance(value, (int, long)):
            return value
        value = set(value)
        db_value = 0
        for index, category in enumerate(self.categories):
            if category in value:
                # Set the bit matching this category's position in the list
                db_value |= 1 << index
        return db_value


############# Tests ############
from django.test import TestCase

CHEESES = ['Red Leicester', 'Tilsit', 'Caerphilly', 'Bel Paese',
           'Illchester', 'Gouda', 'Venezuelan Beaver']

class CheeseShop(models.Model):
    cheeses = CategoriesField(CHEESES, blank=True)

class CategoriesFieldTests(TestCase):
    def setUp(self):
        self.cheeses = ['Venezuelan Beaver', 'Tilsit', 'Illchester']
        self.cheese_shop = CheeseShop(cheeses=self.cheeses)
        self.cheese_shop.save()
      
    def testDataIntegrity(self):
        """
        Tests that data remains the same when saved to database
        """
        self.assertEqual(set(self.cheese_shop.cheeses), set(self.cheeses))
    
    def testDataIntegrityFetch(self):
        """
        Tests that data remains the same when fetched from database
        """
        cheese_shop = CheeseShop.objects.get(pk=self.cheese_shop.pk)
        self.assertEqual(set(cheese_shop.cheeses), set(self.cheeses))

Comments

gregb (on July 30, 2009):

I'm a bit baffled as to why you'd do this rather than just a simple Category model with ManyToMany relationship. Seems to be overly complicated and not really adding anything...

Have I missed something?

m0nonoke (on July 31, 2009):

What's wrong with using the standard field option choices?
