While checking up on some cronjobs at [YouTellMe](http://www.youtellme.nl/) we had some problems with large cronjobs that took way too much memory. Since Django normally loads all objects into it's memory when iterating over a queryset (even with .iterator, although in that case it's not Django holding it in it's memory, but your database client) I needed a solution that chunks the querysets so they're only keeping a small subset in memory.
Example on how to use it:
`my_queryset = queryset_iterator(MyItem.objects.all())
for item in my_queryset:
item.do_something()`
[More info on my blog](http://www.mellowmorning.com/2010/03/03/django-query-set-iterator-for-really-large-querysets/)
The popular [django-tagging](http://code.google.com/p/django-tagging/) app has, in its implementation and semantics, a highly usable and transparent elegance -- but then you have to call methods on a Tag instances' items collection. These classes let you inline the tag name in the chain of queryset filter methods instead.
TO USE:
### models.py
...
from tagging.fields import TagField
from tagging.models import Tag as Tag
class YourModel(models.Model):
...
yourtags = TagField()
objects = TaggedManager()
...
### and then elsewhere, something like--
...
ym = YourModel.objects.order_by("-modifydate")[0]
anotherym = YourModel.objects.get(id=7) ## distinct from ym
ym.yourtags = "tag1 tag2"
anotherym.yourtags = "tag1 othertag"
ym.save()
anotherym.save()
with_tag1 = YourModel.objects.tagged('tag1')
with_tag2 = YourModel.objects.tagged('tag2').order_by('-modifydate')
print ym in with_tag1 ## True
print anotherym in with_tag1 ## True
print ym in with_tag2 ## False
... since these are QuerySets, you can easily create unions (e.g. `with_tag1 | with_tag2` and othersuch) as you need and filter them to your hearts' content, without having to instantiate Tag all the time (which you can of course do as well).
This is an upgrade of snippet [1103](http://www.djangosnippets.org/snippets/1103/).
Exemplary usage:
class Blog(models.Model):
name = models.CharField(max_length=100)
def __unicode__(self):
return self.name
class Post(models.Model):
title = models.CharField(max_length=50)
blog = models.ForeignKey(Blog)
def __unicode__(self):
return self.title
class Meta:
abstract=True
class Article(Post):
text = models.TextField()
class Link(Post):
url = models.URLField()
blog = Blog(name="Exemplary blog")
blog.save()
Article(title="#1", text="Exemplary article 1", blog=blog).save()
Article(title="#2", text="Exemplary article 2", blog=blog).save()
Link(title="#3", url="http://exemplary.link.com/", blog=blog).save()
qs1 = Article.objects.all()
qs2 = Link.objects.all()
qsseq = QuerySetSequence(qs1, qs2)
# those all work also on IableSequence
len(qsseq)
len(QuerySetSequence(qs2, qs2))
qsseq[len(qs1)].title
# this is QuerySetSequence specific
qsseq.order_by('blog.name','-title')
excluded_homo = qsseq.exclude(title__contains="3")
# homogenic results - returns QuerySet
type(excluded_homo)
excluded_hetero = qsseq.exclude(title="#2")
# heterogenic results - returns QuerySetSequence
type(excluded_hetero)
excluded_hetero.exists()
You can implement more `QuerySet` API methods if needed. If full API is implemented it makes sense to also subclass the `QuerySet` class.
We needed to override the default QuerySet delete function to deal with a client problem that we were facing
Yes This is monkey-patching, and probably bad practice but if anyone needs to conditionally override the cascading delete that django does at the application level from a queryset, this is how to do it
If you have a model with foreign key to User, you can use this manager to show (i.e. in admin interface) only objects, that are related to currently logged-in user. Superuser sees all objects, not only his.
Requires: [ThreadlocalsMiddleware](http://code.djangoproject.com/wiki/CookBookThreadlocalsAndUser)
Install [sqlparse](http://code.google.com/p/python-sqlparse/) with `easy_install sqlparse` and then you can easily debug the SQL like this:
def view(request):
data = MyModel.objects.filter(something__very=complex)
print_sql(data)
...
Inspired by [Simon Willison](http://simonwillison.net/2009/Apr/28/)
Call a function for each element in a queryset (actually, any list).
Features:
* stable memory usage (thanks to Django paginators)
* progress indicators
* wraps batches in transactions
* can take managers or even models (e.g., `Assertion.objects`)
* warns about `DEBUG`.
* handles failures of single items without dying in general.
* stable even if items are added or removed during processing (gets a list of ids at the start)
Returns a `Status` object, with the following interesting attributes
* `total`: number of items in the queryset
* `num_successful`: count of successful items
* `failed_ids`: list of ids of items that failed
Example view code:
lazy_field_options = {
'field_name_that_is_m2m': {
'queryset': YourRelatedModel.objects.filter(groups=request.user.groups.all()),
},
'field_name_that_is_fk': {
'queryset': YourOtherRelatedModel.objects.filter(slug=request_slug),
},
}
modelform = YourModelForm(jpic_field_options=lazy_field_options)
# after the modelform has called for parent __init__, it will set
# options for each field if possible.
My modified version of the [MultiQuerySet by mattdw](http://www.djangosnippets.org/snippets/1103/) (see the link for further information).
My purpose for this was to enable me to combine multiple different types of querysets together, which could then be iterated on as one object (i.e. like a tumblelog).