When you want to save integers to the db, you usually have the choice between 16-, 32- and 64-bit integers (also 8- and 24-bit for MySQL). If that doesn't fit your needs and you want to use your db memory more efficiently, this field might come in handy. Imagine you have 3 numbers, but need only 10 bits to encode each (e.g. values from 0 to 1000). Instead of creating 3 smallint fields (48 bits), you can create one 'ByteSplitterField' which implements 3 'subfields' and automatically encodes them inside a 32-bit integer. You don't have to care how each 10-bit chunk is encoded into the 32-bit integer; it's all handled by the field (see also the field's description).
Additionally, the field lets you set decimal places for each of your subfields ('subfield_decimal_places' parameter). These are 'binary decimal places', meaning the stored integer is automatically divided by 2, 4, 8, etc. when you fetch the value from the field. You can also specify how values are rounded ('round' parameter) and what happens when you try to save a value that is out of range ('overflow' parameter).
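To illustrate the packing idea on its own, here is a minimal standalone sketch in plain Python (no Django). The helper names `pack` and `unpack` are made up for this illustration and are not part of the snippet; the field does the equivalent work in its `get_prep_value` and `to_python` methods.

```python
def pack(values, lengths):
    """Pack non-negative ints into one int, first value in the highest bits."""
    packed = 0
    for value, length in zip(values, lengths):
        if not 0 <= value < (1 << length):
            raise ValueError("%d does not fit into %d bits" % (value, length))
        # make room for the next chunk, then drop it into the low bits
        packed = (packed << length) | value
    return packed


def unpack(packed, lengths):
    """Inverse of pack(): read the chunks back, last subfield first."""
    values = []
    for length in reversed(lengths):
        values.append(packed & ((1 << length) - 1))  # mask out the low bits
        packed >>= length
    return list(reversed(values))


lengths = [10, 10, 10]                 # three 10-bit subfields, 30 bits total
packed = pack([5, 1000, 42], lengths)  # fits into a 32-bit integer
restored = unpack(packed, lengths)
```

Three 10-bit chunks need 30 bits, so the result fits into one 32-bit column instead of three 16-bit ones.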
Not implemented (maybe in the future, if I ever need it):
- signed values. All values are non-negative right now!
- real (base-10) decimal places (though you could probably use DecimalFields directly for that)
- further space optimization, e.g. saving into a CharField whose length can be chosen byte-wise
from __future__ import division
import decimal
from django.core import exceptions
from django.db import models
class ByteSplitterField(models.IntegerField):
    description = """
    A field that stores multiple (positive) number values inside virtual 'subfields'
    of an IntegerField. These numbers can be integers or decimals with a defined
    precision of n binary digits, giving a precision of 2 ** (-n).

    Remember that you will NOT be able to select or filter by any of the subfields!
    You can also only save the whole bunch of fields, not single subfields to the db.

    Configure Field with keywords:
    - 'subfield_names' (default=['value']): iterable of names for your subfields.
    - 'subfield_lengths' (default=[64]): iterable of lengths in bits for each of the fields
    - 'subfield_decimal_places' (default=[0, ..., 0]): decimal-precision
        for each of the fields, omit to set 0 for each field (=int-field)
    - 'round' (default=int): choose how to save floats: floor-round <int> or <round>
    - 'overflow' (default='error'): choose what happens with out-of-range-values:
        raise 'error', wrap value in case of 'overflow' or 'truncate'

    Usage example of a field that stores 3 numbers from [0, 8[ with precision 1, 0.5 and 0.25
    and another field that stores a single value with 3 binary digits (precision 0.125):

    >>> class MyModel(models.Model):
    ...     multifield = ByteSplitterField(subfield_names=[0, 1, 'x'], subfield_lengths=[3, 4, 5],
    ...         subfield_decimal_places=[0, 1, 2], round=round) # will be represented as 'smallint'
    ...     otherfield = ByteSplitterField(subfield_lengths=13, subfield_decimal_places=3,
    ...         overflow='truncate') # single field that saves number 0, 0.125, ... 1023.875

    >>> # set the fields on 'instance', which is an instance of MyModel()
    >>> instance.multifield = {0: 3, 1: 2.3, 'x': 6.78} # if you omit a subfield, it will be zero
    >>> instance.otherfield = 2000 # you shouldn't do that, because...
    >>> instance.otherfield # accessing the attribute again will return {'value': 250.0}
    >>> instance.otherfield['value'] = 2000 # this is the correct way
    >>> instance.save()

    >>> # after you fetch it again from the db, you can access the fields
    >>> instance.multifield # will return {0: 3.0, 1: 2.5, 'x': 6.75}
    >>> # notice that key 1 would be 2.0 for round=int
    >>> instance.multifield[0] # will return 3.0
    >>> instance.otherfield['value'] # will return 1023.875 because of truncation
    """
    __metaclass__ = models.SubfieldBase

    def __init__(self, *args, **kwargs):
        # every keyword accepts either a single value or an iterable
        self.subfield_names = kwargs.pop('subfield_names', ['value'])
        if not hasattr(self.subfield_names, '__iter__'):
            self.subfield_names = [self.subfield_names]
        self.subfield_lengths = kwargs.pop('subfield_lengths', 64)
        if not hasattr(self.subfield_lengths, '__iter__'):
            self.subfield_lengths = [self.subfield_lengths]
        for length in self.subfield_lengths:
            # the 1-tuple keeps %-formatting safe if a tuple was passed in
            assert type(length) is int, "Please provide 'subfield_lengths': an iterable of type(int): %s" % (self.subfield_lengths,)
            assert length >= 1, "Length of each subfield must be >= 1 bit, but is %d bits" % length
        self.subfield_decimal_places = kwargs.pop('subfield_decimal_places', None)
        if self.subfield_decimal_places is None:
            self.subfield_decimal_places = [0] * len(self.subfield_lengths)
        if not hasattr(self.subfield_decimal_places, '__iter__'):
            self.subfield_decimal_places = [self.subfield_decimal_places]
        for dec in self.subfield_decimal_places:
            assert type(dec) is int, "'subfield_decimal_places' must be an iterable of type(int): %s" % (self.subfield_decimal_places,)
        assert len(self.subfield_names) == len(self.subfield_lengths) == len(self.subfield_decimal_places),\
            "'subfield_lengths' must have the same length as 'subfield_names' (and also 'subfield_decimal_places' if you pass this keyword)"
        self.n_bits = sum(self.subfield_lengths) # required length in bits
        assert self.n_bits <= 64,\
            "Sorry, but currently a maximum of 64 bit is supported (stored as 'bigint' on the db-backend), you requested a total of %d bits" % self.n_bits
        self.round = kwargs.pop('round', int)
        assert self.round in (int, round), "Please provide the built-in function <int> or <round> with the keyword 'round', not: %s" % self.round
        if self.round is round:
            # round() returns a float on Python 2, but an int is needed for the bit operations
            self.round = lambda x: int(round(x))
        self.overflow = kwargs.pop('overflow', 'error')
        assert self.overflow in ('error', 'overflow', 'trunc', 'truncate'),\
            "'overflow' must be set to 'error', 'overflow' or 'truncate', not '%s'" % self.overflow
        super(ByteSplitterField, self).__init__(*args, **kwargs)

    def db_type(self, connection):
        # pick the smallest unsigned integer column that holds n_bits
        engine = connection.settings_dict['ENGINE']
        is_mysql = engine == 'django.db.backends.mysql'
        if self.n_bits <= 8 and is_mysql:
            return "tinyint unsigned"
        elif self.n_bits <= 16:
            return connection.creation.data_types['PositiveSmallIntegerField']
        elif self.n_bits <= 24 and is_mysql:
            return "mediumint unsigned"
        elif self.n_bits <= 32:
            return connection.creation.data_types['PositiveIntegerField']
        # since django doesn't know a PositiveBigIntegerField, the db-representation is made manually (only tested for MySQL!)
        elif self.n_bits <= 64 and is_mysql:
            return "bigint unsigned"
        elif self.n_bits <= 64 and engine == 'django.db.backends.oracle':
            return "NUMBER(19) CHECK (%(qn_column)s >= 0)"
        elif self.n_bits <= 64 and engine == 'django.db.backends.postgresql':
            return 'bigint CHECK ("%(column)s" >= 0)'
        elif self.n_bits <= 64 and engine == 'django.db.backends.sqlite3':
            return "bigint unsigned"

    def to_python(self, value):
        """
        Returns a dict of {'subfield_name': subfield_value, ...}
        """
        if value is None:
            return None
        # if the value is a string representation, try to parse it as a Python literal
        if isinstance(value, basestring):
            import ast # local import; literal_eval only parses literals, unlike eval
            try:
                value = ast.literal_eval(value)
            except (ValueError, SyntaxError):
                raise exceptions.ValidationError(self.error_messages['invalid'])
        # if value is a dict, check that it fits the model (only known subfield_names, values of a numeric type)
        if isinstance(value, dict):
            for k, v in value.items():
                if k not in self.subfield_names:
                    raise exceptions.ValidationError("This is not a valid subfield_name: '%s'" % k)
                if type(v) not in (int, float, decimal.Decimal):
                    raise exceptions.ValidationError("subfield_name '%s' must be of type int, float or Decimal, but is %s" % (k, type(v)))
            return value # valid dict is directly returned
        # now the value must be convertible to int (comes from db or from python via model's __setattr__,
        # which is something that shouldn't be done (see field's description) but sadly can't be distinguished here)
        try:
            value = int(value)
        except (TypeError, ValueError):
            raise exceptions.ValidationError(self.error_messages['invalid'])
        # standard case: value is an int from db; the last subfield sits in the lowest bits
        result = {}
        for i in reversed(range(len(self.subfield_names))):
            result[self.subfield_names[i]] = (value % (2 ** self.subfield_lengths[i])) / (2 ** self.subfield_decimal_places[i])
            value >>= self.subfield_lengths[i]
        return result

    def get_prep_value(self, value):
        if not isinstance(value, dict):
            return None
        for k in value.keys():
            if k not in self.subfield_names:
                raise exceptions.ValidationError("This subfield_name doesn't exist: %s. Choices are %s" % (k, self.subfield_names))
        result = 0
        for i in range(len(self.subfield_names)):
            # shift the subfields packed so far to make room for the next one
            result <<= self.subfield_lengths[i]
            number = value.get(self.subfield_names[i], None)
            if number is not None:
                value_db = self.round(number * (2 ** self.subfield_decimal_places[i]))
                value_db_max = (2 ** self.subfield_lengths[i] - 1)
                if value_db > value_db_max or value_db < 0:
                    if self.overflow == 'error':
                        raise exceptions.ValidationError(
                            "The value %(value)f for field '%(field)s' (rounded to %(value_round)f) is out of specified range: [0, %(range)f]"
                            % {'value': value[self.subfield_names[i]], 'value_round': value_db / (2 ** self.subfield_decimal_places[i]),
                               'field': self.subfield_names[i], 'range': value_db_max / (2 ** self.subfield_decimal_places[i])})
                    elif self.overflow in ('truncate', 'trunc'):
                        # clamp to the nearest representable value
                        result += max(0, min(value_db_max, value_db))
                    elif self.overflow == 'overflow':
                        # wrap around: keep only the lowest bits of the subfield
                        value_db &= value_db_max
                        result += value_db
                else:
                    result += value_db
        return result
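The per-subfield rounding and overflow rules used in `get_prep_value` can be tried in isolation. The following is a minimal sketch under my own assumptions, not part of the snippet: `encode` and `decode` are hypothetical helper names, and `ValueError` stands in for Django's `ValidationError` so that the example runs without Django.

```python
def encode(number, length, places=0, rounder=int, overflow='error'):
    """Convert a number to the integer stored in a subfield of `length` bits."""
    # scale by 2 ** places, then round; int() guards against float results
    stored = int(rounder(number * 2 ** places))
    maximum = 2 ** length - 1
    if 0 <= stored <= maximum:
        return stored
    if overflow == 'truncate':
        return max(0, min(maximum, stored))  # clamp into [0, maximum]
    if overflow == 'overflow':
        return stored & maximum              # wrap around, keep the low bits
    raise ValueError("%r is out of range [0, %r]" % (number, maximum / 2.0 ** places))


def decode(stored, places=0):
    """Convert the stored integer back, dividing by 2 ** places."""
    return stored / 2.0 ** places


floor_stored = encode(2.3, 4, places=1)                    # int() cuts 4.6 down to 4
round_stored = encode(2.3, 4, places=1, rounder=round)     # round() turns 4.6 into 5
clamped = encode(2000, 13, places=3, overflow='truncate')  # 16000 is clamped to 8191
wrapped = encode(2000, 13, places=3, overflow='overflow')  # 16000 & 8191
```

With one binary decimal place, 2.3 is stored as 4 (read back as 2.0) for `int` and as 5 (read back as 2.5) for `round`, which matches the example in the field's description.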