CategoriesField

Author:
fongandrew
Posted:
July 30, 2009
Language:
Python
Version:
1.1
Tags:
field categories binary
Score:
-1 (after 1 ratings)

If you have a relatively small finite number of categories (e.g. < 64), don't want to add a separate column for each one, and don't want to add an entire separate table and deal with joins, this field offers a simple solution.

You initialize the field with a list of categories. The categories can be any Python object. When saving a set of categories, the field converts the set to a binary number based on the indices of the categories list you passed in. So if you pass in as your categories list [A,B,C,D] and set model.categories = [C,D], the integer stored in the database is 0b1100 (note that although the categories list has a left-to-right index, the binary number grows right-to-left).
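The conversion described above can be tried out without Django. This is only a minimal sketch of the idea; the helper names `encode` and `decode` are made up for illustration and are not part of the field itself.

```python
def encode(categories, selected):
    """Pack the selected categories into an integer bitmask."""
    selected = set(selected)
    mask = 0
    for index, category in enumerate(categories):
        if category in selected:
            mask |= 1 << index  # bit `index` stands for categories[index]
    return mask


def decode(categories, mask):
    """Unpack an integer bitmask back into a set of categories."""
    return set(c for i, c in enumerate(categories) if mask & (1 << i))


CATEGORIES = ['A', 'B', 'C', 'D']
mask = encode(CATEGORIES, ['C', 'D'])
print(bin(mask))                         # 0b1100
print(sorted(decode(CATEGORIES, mask)))  # ['C', 'D']
```

C and D sit at indices 2 and 3, so bits 2 and 3 are set, which gives 4 + 8 = 12.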

This means that if you reorder or remove items from your category list once you have data in the DB, you'll need to migrate the data over to the new format. Appending items to the end of the list should be fine, however, because the existing indices stay the same.
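To see why ordering matters, here is a small sketch in plain Python. The `encode`, `decode`, and `remap` helpers are hypothetical names used only for this illustration; `remap` shows one way a stored value could be migrated to a new list order, assuming every old category still exists in the new list.

```python
def encode(categories, selected):
    selected = set(selected)
    mask = 0
    for index, category in enumerate(categories):
        if category in selected:
            mask |= 1 << index
    return mask


def decode(categories, mask):
    return set(c for i, c in enumerate(categories) if mask & (1 << i))


def remap(mask, old_categories, new_categories):
    """Re-encode a stored bitmask for a reordered category list."""
    return encode(new_categories, decode(old_categories, mask))


ORIGINAL = ['A', 'B', 'C', 'D']
stored = encode(ORIGINAL, ['C', 'D'])       # value already in the DB

APPENDED = ORIGINAL + ['E']                 # old indices unchanged
REORDERED = ['D', 'C', 'B', 'A']            # old indices now point elsewhere

print(sorted(decode(APPENDED, stored)))     # ['C', 'D']
print(sorted(decode(REORDERED, stored)))    # ['A', 'B']

migrated = remap(stored, ORIGINAL, REORDERED)
print(sorted(decode(REORDERED, migrated)))  # ['C', 'D']
```

The stored integer never changes on its own, so after a reorder it silently decodes to the wrong categories until each row is rewritten.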

If you need to filter on this field, you can use bitwise operators. Django currently (to my knowledge) doesn't support this natively, but most DBs have some bitwise operator support, and you can use the extra() queryset method for now. Note that the query parameter has to be the category's bit mask (1 << index), not its bare list index. For example, this query selects CheeseShop models that have 'Gouda' in their cheeses field:

CheeseShop.objects.all().extra(where=['(cheeses | %s) = cheeses'], params=[1 << CHEESES.index('Gouda')])
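The predicate can be checked in plain Python before trusting it in SQL. This is a sketch only: `mask_for` and `matches` are hypothetical helpers, and `matches` mirrors the `(cheeses | mask) = cheeses` comparison rather than running any query.

```python
CHEESES = ['Red Leicester', 'Tilsit', 'Caerphilly', 'Bel Paese',
           'Illchester', 'Gouda', 'Venezuelan Beaver']


def mask_for(*names):
    """Build the bitmask for one or more cheese names."""
    mask = 0
    for name in names:
        mask |= 1 << CHEESES.index(name)
    return mask


def matches(stored, mask):
    """Python equivalent of the SQL predicate (stored | mask) = stored."""
    return (stored | mask) == stored


gouda = mask_for('Gouda')                       # 1 << 5 == 32
with_gouda = mask_for('Gouda', 'Tilsit')
without_gouda = mask_for('Tilsit', 'Illchester')

print(matches(with_gouda, gouda))               # True
print(matches(without_gouda, gouda))            # False

# Passing the bare index (5 == 0b101) tests the wrong bits: it matches a
# shop stocking Red Leicester and Caerphilly, and misses a Gouda-only shop.
bare_index = CHEESES.index('Gouda')
print(matches(mask_for('Red Leicester', 'Caerphilly'), bare_index))  # True
print(matches(gouda, bare_index))                                    # False
```

Because the mask can have several bits set, the same predicate also works for "has all of these cheeses" queries.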

from django.db import models

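# Upper bound on the category count. The assert below is strict, so at most
# MAX_CATEGORIES - 1 (63) categories fit, one bit each in a signed 64-bit integer.
# Caveat: get_internal_type() returns "IntegerField", which many backends create
# as a 32-bit column, so check your database's integer width.
MAX_CATEGORIES = 64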

class CategoriesField(models.Field):
    __metaclass__ = models.SubfieldBase
    
    def __init__(self, categories=None, *args, **kwds):
        self.categories = categories or []
        assert len(self.categories) < MAX_CATEGORIES, "Too many categories!"
        super(CategoriesField, self).__init__(*args, **kwds)
    
    def get_internal_type(self):
        return "IntegerField"
    
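    def to_python(self, value):
        # Empty values (None, 0, empty collections) mean "no categories".
        if not value:
            return set()
        # An integer is the raw bitmask from the database: decode it bit by bit.
        if isinstance(value, (int, long)):
            cats = set()
            index = 0
            while value:
                if value % 2:
                    cats.add(self.categories[index])
                index += 1
                value = value >> 1
            return cats
        # Anything else is assumed to already be a collection of categories.
        return value
    
    def get_db_prep_value(self, value):
        if not value:
            return 0
        # Already a bitmask (Python 2 may hand us an int or a long).
        if isinstance(value, (int, long)):
            return value
        value = set(value)
        db_value = 0
        # Set bit `index` for every selected category.
        for index, category in enumerate(self.categories):
            if category in value:
                db_value = db_value | (1 << index)
        return db_value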


############# Tests ############
from django.test import TestCase

CHEESES = ['Red Leicester', 'Tilsit', 'Caerphilly', 'Bel Paese',
           'Illchester', 'Gouda', 'Venezuelan Beaver']

class CheeseShop(models.Model):
    cheeses = CategoriesField(CHEESES, blank=True)

class CategoriesFieldTests(TestCase):
    def setUp(self):
        self.cheeses = ['Venezuelan Beaver', 'Tilsit', 'Illchester']
        self.cheese_shop = CheeseShop(cheeses=self.cheeses)
        self.cheese_shop.save()
      
    def testDataIntegrity(self):
        """
        Tests that data remains the same when saved to database
        """
        self.assertEqual(set(self.cheese_shop.cheeses), set(self.cheeses))
    
    def testDataIntegrityFetch(self):
        """
        Tests that data remains the same when fetched from database
        """
        cheese_shop = CheeseShop.objects.get(pk=self.cheese_shop.pk)
        self.assertEqual(set(cheese_shop.cheeses), set(self.cheeses))


Comments

gregb (on July 30, 2009):

I'm a bit baffled as to why you'd do this rather than just a simple Category model with ManyToMany relationship. Seems to be overly complicated and not really adding anything...

Have I missed something?


m0nonoke (on July 31, 2009):

What's wrong with using the standard field option choices?

