Lightweight querysets

Author:
sardarnl
Posted:
June 17, 2013
Language:
Python
Version:
Not specified

Suppose you have two models, A and B. Both have lots of large text/JSON fields and other heavy data. Your view shows a list built from B.select_related(A), but it uses only a few lightweight fields. You want to defer all the heavyweight fields to take load off your DB and limit its traffic, but doing that by hand is tedious, error-prone work that smells a bit of over-optimization.

A real example: you have Product and Category objects. Both have large description fields with lots of text data. However, these fields are only needed on the product detail page or the category detail page. In other cases (e.g. a side menu or a product suggestion block) you need only a few lightweight fields.

Solution: add LightweightMixin to your model's manager and specify its heavyweight_fields and always_related_models.

# all visible products with necessary relations prefetched and heavyweight
# fields deferred.
qs = Product.objects.lightweight()

# custom queryset with the default defers applied. The description and ingredients
# fields would normally be deferred, but here they are explicitly selected
qs = Product.objects.filter(my_filter="that", makes="sense")
qs = Product.objects.as_lightweight(qs, select=('description', 'ingredients'))

The best way to work with this snippet is to add the mixin to all your managers, so that lightweight() and public() are available everywhere. Override as_public() to apply your visibility filter and your select_related() and batch_select() queryset settings. When you need the full-blown object, use the public() queryset. When you need a lightweight list of objects, use the lightweight() queryset.

When your application is complete, check the database query log for frequent, unnecessarily large and ugly queries. Group all queries that were made by lightweight() querysets, make a list of the heavy fields they do not need, and add them to the manager's heavyweight_fields. If your as_public() uses select_related() to join heavy objects, you can also specify always_related_models to defer some fields of these relations too. It is a dict mapping each relation's field name to the related model, and that model's default manager must use the mixin as well.
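
One rough way to do the grouping step is to count how often each column shows up in the SELECT clauses of the logged queries. The sketch below is only an illustration. The log lines and the shop_product table are made up, and it assumes the log quotes identifiers as "table"."column", which is how Django usually writes them. It tells you which columns are fetched often, not which ones are heavy or unneeded; that call is still yours.

```python
import re
from collections import Counter

# Hypothetical log lines; in practice these would come from your query log.
QUERIES = [
    'SELECT "shop_product"."id", "shop_product"."name", '
    '"shop_product"."description" FROM "shop_product"',
    'SELECT "shop_product"."id", "shop_product"."description" '
    'FROM "shop_product" WHERE "shop_product"."visible" = 1',
    'UPDATE "shop_product" SET "name" = \'x\'',
]

# Capture everything between SELECT and the first FROM.
SELECT_RE = re.compile(r'^\s*SELECT\s+(.*?)\s+FROM\s', re.IGNORECASE | re.DOTALL)
# Match quoted "table"."column" pairs inside that clause.
COLUMN_RE = re.compile(r'"(\w+)"\."(\w+)"')


def count_selected_columns(queries):
    """Count how many queries select each table.column pair."""
    counts = Counter()
    for sql in queries:
        match = SELECT_RE.match(sql)
        if not match:
            continue  # not a SELECT statement, nothing to count
        for table, column in COLUMN_RE.findall(match.group(1)):
            counts["%s.%s" % (table, column)] += 1
    return counts


counts = count_selected_columns(QUERIES)
for name, n in counts.most_common():
    print(name, n)
```

Columns near the top of the list that your lightweight views never touch are the candidates for heavyweight_fields.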

Why? Because database I/O will become your major bottleneck someday if you fetch 2 MB of data just to render some tiny menu. With proper caching this is not a major issue, but proper queries may be a sign of proper coding.

class LightweightMixin(object):
    """ Manager mixin for lightweight functionality. """
    # heavy fields to defer in lightweight querysets
    heavyweight_fields = ()
    # relations that are always fetched with lightweight queries, as a dict of
    # {relation field name: related model}; that model's manager needs this mixin too
    always_related_models = {}

    def as_public(self, qs, **kwargs):
        """ Apply visibility filter and relation prefetch on given QuerySet/Batch. """
        return qs

    def public(self, **kwargs):
        """ Returns default queryset with `as_public()` applied. """
        return self.as_public(self.all(), **kwargs)

    def lightweight(self, select=None, relations=None, **kwargs):
        """ Returns public() queryset with `as_lightweight()` applied. """
        return self.as_lightweight(self.public(**kwargs), select=select, relations=relations)

    def as_lightweight(self, qs, prefix='', select=None, relations=None):
        """
            Defer heavy-weight fields in given queryset.

            By default this method defers heavyweight fields in all relations, which
            are specified in `always_related_models`. Set `relations` to `False`
            to disable this behavior. You may override default relations by
            passing `dict` to `relations` argument.

            :Arguments:
                - `qs` (`QuerySet`): queryset to defer
                - `prefix` (str): object prefix, used in recursive defers
                - `select` (list of str): heavy fields that will not be deferred
                - `relations` (dict or bool): recursively apply itself to relations.
        """
        defer = []
        select = set(select) if select else ()
        for fld in self.heavyweight_fields:
            pat = "%s%s" % (prefix, fld)
            if pat not in select:
                defer.append(pat)

        qs = qs.defer(*defer)
        if relations is not False: # by default always applies with default relations
            qs = self.as_lightweight_relations(qs, prefix=prefix, select=select, relations=relations)

        return qs

    def as_lightweight_relations(self, qs, prefix='', select=None, relations=None):
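        """ Propagate defer calls to relations. """
        # None, True or an empty dict all mean "use the default relations"
        if not relations or relations is True:
            relations = self.always_related_models
        # items() works on Python 2 and 3; iteritems() exists on Python 2 only
        for fld, model in relations.items():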
            relpref = "%s%s__" % (prefix, fld)
            qs = model.objects.as_lightweight(qs, relpref, select=select)

        return qs
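
To see which field names actually reach defer(), here is a small Django-free sketch of the same recursion. Everything in it is a stand-in: RecordingQuerySet only remembers the arguments passed to defer(), and the HEAVY and RELATED tables are a hypothetical configuration for the Product/Category example, keyed by model name instead of living on managers. It is a simplified restatement for illustration, not the mixin itself.

```python
class RecordingQuerySet(object):
    """Stand-in for a Django QuerySet; it only remembers defer() arguments."""

    def __init__(self, deferred=()):
        self.deferred = tuple(deferred)

    def defer(self, *fields):
        # Like QuerySet.defer(), return a new object and accumulate names.
        return RecordingQuerySet(self.deferred + fields)


# Hypothetical configuration: heavy fields and related models per model name.
HEAVY = {
    "Product": ("description", "ingredients"),
    "Category": ("description",),
}
RELATED = {
    "Product": {"category": "Category"},
    "Category": {},
}


def as_lightweight(model, qs, prefix="", select=()):
    """Defer the model's heavy fields, then recurse into its relations."""
    names = ["%s%s" % (prefix, fld) for fld in HEAVY[model]]
    qs = qs.defer(*[name for name in names if name not in select])
    for fld, related in RELATED[model].items():
        # Related fields are addressed as "<relation>__<field>".
        qs = as_lightweight(related, qs, "%s%s__" % (prefix, fld), select)
    return qs


plain = as_lightweight("Product", RecordingQuerySet())
print(plain.deferred)
# ('description', 'ingredients', 'category__description')

kept = as_lightweight("Product", RecordingQuerySet(), select={"description"})
print(kept.deferred)
# ('ingredients', 'category__description')
```

Note that names in select are compared with the prefix included, so keeping the category's description loaded would mean passing 'category__description', not 'description'.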
