Login

CompressedTextField

Author:
Digitalxero
Posted:
May 8, 2010
Language:
Python
Version:
1.1
Score:
1 (after 1 ratings)

I found my self doing data migration for a client, and also found that their old system (CSV) had 4 text fields that were 2 to 4MB each and after about 2 minutes of hammering the mysql server as I was parsing this and trying to insert the data the server would drop all connections. In my testing I found that if I didnt send those 4 fields the mysql server was happy to let me migrate all my data all (240GB of it). So I started thinking, "I should just store these fields compress anyways, a little over head to render the data, but thats fine by me."

So thus was born a CompressedTextField. It bz2 compresses the contents then does a base64 encode to play nice with the server storage of text fields. Once I started using this my data migration, though a bit slower then without the field data, ran along all happy.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
class CompressedTextField(models.TextField):
    """
    model Fields for storing text in a compressed format (bz2 by default)
    """
    __metaclass__ = models.SubfieldBase

    def to_python(self, value):
        if not value:
            return value

        try:
            return value.decode('base64').decode('bz2').decode('utf-8')
        except Exception:
            return value

    def get_prep_value(self, value):
        if not value:
            return value

        try:
            value.decode('base64')
            return value
        except Exception:
            try:
                tmp = value.encode('utf-8').encode('bz2').encode('base64')
            except Exception:
                return value
            else:
                if len(tmp) > len(value):
                    return value

                return tmp

More like this

  1. Template tag - list punctuation for a list of items by shapiromatron 11 months, 2 weeks ago
  2. JSONRequestMiddleware adds a .json() method to your HttpRequests by cdcarter 11 months, 3 weeks ago
  3. Serializer factory with Django Rest Framework by julio 1 year, 6 months ago
  4. Image compression before saving the new model / work with JPG, PNG by Schleidens 1 year, 7 months ago
  5. Help text hyperlinks by sa2812 1 year, 7 months ago

Comments

eracle (on July 25, 2017):

Hello! What python/django versions was it tested with?

#

Please login first before commenting.