
Sophisticated order_by sorting

Author: mawi
Posted: December 10, 2012
Language: Python
Version: 1.4
Tags: sort orderby sorting ordering
Score: 0 (after 0 ratings)

I wanted to sort a CharField consisting of digits in a non-standard order. The field holds a matriculation number (a kind of registration number for students). Matriculation numbers have the format YYxxxxx, where "YY" is the last two digits of the year the student started studying.

So I wanted them to appear in this order, with numbers starting with 50 to 99 first and those starting with 00 to 49 after them: 5000000, 5000001, ... , 9999998, 9999999, 0000000, 0000001, ... , 1200000, ... , 4999999
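The intended order can be sketched in plain Python. This is only an illustration of the rule, not part of the snippet: the helper name mnr_sort_key is made up here, and it assumes seven-digit numbers as in the examples above.

```python
def mnr_sort_key(mnr):
    """Return the integer used to order a seven-digit matriculation number.

    Hypothetical helper; it mirrors the CASE expression of the snippet.
    """
    prefixed = int("1" + mnr)           # same idea as ('1'||mnr)::integer
    if 15000000 <= prefixed <= 19999999:
        return int(mnr)                 # numbers starting with 50..99 sort first
    return prefixed                     # numbers starting with 00..49 follow


numbers = ["1200000", "9999999", "0000000", "5000000", "4999999"]
ordered = sorted(numbers, key=mnr_sort_key)
print(ordered)
# ['5000000', '9999999', '0000000', '1200000', '4999999']
```

Prefixing a "1" keeps the leading zeros significant, so "0000000" becomes 10000000 and lands after 9999999.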

Took me some time to find out how to do this efficiently in PostgreSQL, and so I thought I'd share it here.

The important part is that the model "Candidate" uses a custom "objects" manager, which in turn returns a custom QuerySet. Here lies the "magic": if an ordering is requested that contains "mnr", a field calculated on the fly, called "mnr_specialsorted", is added to the queryset and used for the ordering instead.

Now it is possible to do things like Candidate.objects.filter( firstname__contains="Pony" ).exclude( lastname__contains="Java" ).order_by("lastname", "-mnr")
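What happens to the field names in such a call can be shown without Django. The function below is a stand-alone sketch of the renaming step only; the name rewrite_ordering is hypothetical and does not exist in the snippet or in Django.

```python
def rewrite_ordering(field_names):
    """Map 'mnr' / '-mnr' to the computed column and keep other names as-is.

    Hypothetical stand-alone version of the renaming done in order_by().
    """
    rewritten = []
    for name in field_names:
        if name in ("mnr", "-mnr"):
            # A leading '-' is carried over, so descending order still works.
            name = "%s_specialsorted" % name
        rewritten.append(name)
    return rewritten


result = rewrite_ordering(["lastname", "-mnr"])
print(result)
# ['lastname', '-mnr_specialsorted']
```

So the call above ends up ordering by "lastname" and then, descending, by the calculated "mnr_specialsorted" column.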

For other database engines you might want to change the MNR_SORTER variable to fit your needs.
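As one possible adaptation, here is a sketch for SQLite that replaces the PostgreSQL "::integer" cast with the standard CAST(... AS INTEGER). The name MNR_SORTER_SQLITE and the throwaway table are assumptions made for this example; it is tried out directly with the sqlite3 module rather than through Django.

```python
import sqlite3

# Hypothetical SQLite variant of MNR_SORTER (not part of the snippet):
# CAST(... AS INTEGER) stands in for PostgreSQL's ::integer shorthand.
MNR_SORTER_SQLITE = '''CASE WHEN CAST('1'||mnr AS INTEGER)
                    BETWEEN 15000000 AND 19999999 THEN CAST(mnr AS INTEGER)
                ELSE CAST('1'||mnr AS INTEGER)
                END'''

# Throwaway in-memory table, used only to try out the expression.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE candidate (mnr TEXT)")
conn.executemany(
    "INSERT INTO candidate (mnr) VALUES (?)",
    [("1200000",), ("9999999",), ("0000000",), ("5000000",)],
)

rows = conn.execute(
    "SELECT mnr FROM candidate ORDER BY %s" % MNR_SORTER_SQLITE
).fetchall()
ordered = [row[0] for row in rows]
print(ordered)
# ['5000000', '9999999', '0000000', '1200000']
```

Other engines may need a different cast or concatenation syntax, so treat this as a starting point only.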

# models.py
# (see the description above)

from django.db import models

MNR_SORTER = '''CASE WHEN ('1'||mnr)::integer
                    BETWEEN 15000000 AND 19999999 THEN mnr::integer
                ELSE ('1'||mnr)::integer
                END'''


class CandidateQuerySet(models.query.QuerySet):

    def order_by(self, *field_names):
        """Like QuerySet.order_by(), but sorts "mnr" via MNR_SORTER."""
        assert self.query.can_filter(), \
                "Cannot reorder a query once a slice has been taken."
        if 'mnr' in field_names or '-mnr' in field_names:
            # Swap 'mnr' / '-mnr' for the calculated column; a leading '-'
            # is carried over, so descending order keeps working.
            new_field_names = []
            for field_name in field_names:
                if field_name in ('mnr', '-mnr'):
                    field_name = '%s_specialsorted' % field_name
                new_field_names.append(field_name)
            # extra() adds the calculated column to the SELECT clause.
            obj = self.extra(select={'mnr_specialsorted': MNR_SORTER})._clone()
        else:
            new_field_names = field_names
            obj = self._clone()
        obj.query.clear_ordering()
        obj.query.add_ordering(*new_field_names)
        return obj


class CandidateManager(models.Manager):

    def get_query_set(self):
        """Returns a new QuerySet object.  Subclasses can override this method
        to easily customize the behavior of the Manager.
        """
        return CandidateQuerySet(self.model, using=self._db)


class Candidate(models.Model):
    mnr = models.CharField(max_length=256)  # matriculation number (German: Matrikelnummer)
    firstname = models.CharField(max_length=4096)
    lastname = models.CharField(max_length=4096)

    objects = CandidateManager()

######
# To make the sorting really fast, you need to create a functional index
# on the same expression that MNR_SORTER uses.
# For PostgreSQL this would look like this:
#
# create a file "MYAPP/sql/candidate.postgresql_psycopg2.sql" with this content:

DROP INDEX IF EXISTS candidates_mnr_specialsorted;
CREATE INDEX candidates_mnr_specialsorted ON candidates_candidate ( ( CASE WHEN ('1'||mnr)::integer
                    BETWEEN 15000000 AND 19999999 THEN mnr::integer
                ELSE ('1'||mnr)::integer
                END ) );

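# syncdb runs this file and thereby creates the index; the DROP statement
# makes it safe to run the file again. It is also a good idea to run
# "VACUUM ANALYZE" in a psql shell afterwards, so that PostgreSQL has
# up-to-date statistics for deciding whether to use the index.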
