Effective content caching for mass-load site using redirect feature

Author:: nnseva
Posted:: July 7, 2011
Language:: Python
Version:: 1.3
Score:: 2 (after 2 ratings)

Hi All,

I would like to share my idea and experience of using the HTTP redirect feature to organize effective content caching and to keep a site from blocking under mass requests for content that is not yet cached.

First of all, I should explain that I don't mean the usual template-based pages, for which Django's built-in caching is enough in almost all cases. I am talking about content such as dynamically created images, external files and other content that may take an almost unpredictably long time to become ready, up to about 10 seconds.

Content which is already in the cache can be returned immediately. The question is: what should happen if the requested content is not yet present in the cache?

HTML content is not a problem in this case: we can return an intermediate page with meta information asking the client to retry after some period of time. But non-HTML content, such as an image or JSON, is a problem.
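For the HTML case, a minimal sketch of such an intermediate page, assuming Python 3 and a meta refresh tag as the "meta information". The function name build_retry_page and the page wording are hypothetical and only illustrate the idea:

```python
from html import escape


def build_retry_page(url, delay_seconds):
    """Return an HTML page asking the browser to re-request url after a delay.

    Hypothetical helper: the name and the page text are illustrative only.
    """
    # Escape the URL so that '&' and quotes cannot break the attribute value.
    target = escape(url, quote=True)
    return (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        '<meta http-equiv="refresh" content="%d;url=%s">\n'
        "<title>Preparing content</title>\n"
        "</head><body>\n"
        "<p>The content is being prepared. This page reloads in %d seconds.</p>\n"
        "</body></html>\n"
    ) % (delay_seconds, target, delay_seconds)
```

A browser honors this tag without any script, which is why the HTML case needs no server-side waiting. Nothing comparable exists for an image or a JSON response.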

The usual blocking behavior of the Django caching subsystem is not appropriate when content preparation takes more than a fraction of a second. For the whole period while the content is being prepared (painted, fetched from an external resource and so on), hundreds of users keep requesting the not yet cached content until all available application server connections are occupied by waiting requests. A new user then has to wait even if the content he requested is actually ready to be returned.

The celery package solves this problem partially: it allows scheduling a cache update asynchronously before the cache expires (I am using the threadpool package for this purpose instead), which avoids blocking on the content preparation procedure. But the problem of not yet available content remains unresolved. We might schedule periodic cache updates to avoid such a situation, but this leads to unnecessary resource load and may be entirely inappropriate in some cases, e.g. for a service like CloudMade or Google "staticmap", which paints map content based on the request parameters.

My solution is to use the HTTP redirect feature.

An ideal solution would be available if most HTTP clients supported the Retry-After header in an HTTP redirect response. Unfortunately, none of the most popular clients (neither Mozilla, nor IE, nor Chrome) does. In practice they request the redirected URL immediately after receiving the redirection response and ignore the Retry-After header altogether. I hope they will support this feature eventually, but for now we cannot rely on it.

So, instead of delegating the waiting to the client side, we need to organize waiting on the server. For this purpose I have created a small separate application server with its own connection pool and delegated the waiting task to it (see the snippet).

This server needs one URL (/retry_after in the example) pointing to the retry_after view from the snippet, installed in urls.py, and an additional DEFAULT_RETRY_TIMEOUT parameter in settings.py. The view uses request GET parameters, so Django will not try to cache its results.

The only task for this server is to hold redirected requests until the requested time has actually arrived, and then redirect the request to the original URL. To be able to process many requests simultaneously, it cuts each wait off at the default timeout and redirects requests that need to wait longer back to itself.
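The waiting rule can be sketched as a pure function, assuming Python 3. The name plan_wait is hypothetical, and the sketch only plans one pass through the view, whereas the snippet itself sleeps and then re-checks the real clock:

```python
def plan_wait(now, retry_at, default_timeout):
    """Plan one pass through the 'retry' view.

    Returns (sleep_seconds, redirect_to_self). Hypothetical helper that
    mirrors the rule described in the text; it does not sleep itself.
    """
    remaining = retry_at - now
    if remaining <= 0:
        # The requested moment has already arrived: no sleep, and the
        # request goes straight back to the original URL.
        return 0.0, False
    # Never hold a connection longer than the default timeout.
    sleep_seconds = min(remaining, default_timeout)
    # If the wait was cut short, another round through the view is needed.
    redirect_to_self = remaining > default_timeout
    return float(sleep_seconds), redirect_to_self
```

With a default timeout of 5 seconds, a request that has to wait 12 seconds sleeps for 5 and is redirected back to the view, while a request that has to wait 3 seconds sleeps for 3 and is then sent to the original URL.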

Using this server is simple. We run the 'retry' server on a separate URL as a separate application server and redirect all requests that need to be delayed to it, passing the original URL and the moment when the requested content should be ready as parameters.
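A minimal sketch of how the two parameters travel in the redirect URL, assuming Python 3 and the standard urllib.parse module (the code in this post uses Django's own urlencode). The function names are hypothetical; the parameter names redirect and retry_after are the ones the 'retry' view reads:

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_retry_url(retry_server, original_url, ready_at):
    """Build the URL on the 'retry' server for a delayed request.

    Hypothetical helper; ready_at is a Unix timestamp in seconds.
    """
    query = urlencode({"redirect": original_url, "retry_after": int(ready_at)})
    return retry_server + "?" + query


def parse_retry_url(url):
    """Recover (original_url, ready_at) from a URL built above."""
    params = parse_qs(urlsplit(url).query)
    return params["redirect"][0], int(params["retry_after"][0])
```

Because the original URL is percent-encoded as a single parameter value, any query string it carries survives the round trip unchanged.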

The main server, which prepares and returns the content, never waits. It returns prepared content from the cache if it is found there, or starts the content preparation process asynchronously using threads, the celery package, or any other appropriate technique, and immediately redirects the request to the 'retry' server.

The following is a fragment of the code using the 'retry' server:

...
# Create your views here.
def jams(request,z,x,y):
    jams_timeout = settings.JAMS_TIMEOUT
    retry_after_server = settings.RETRY_AFTER_SERVER
    try:
        jams_ts,jams_png = request_jams(z,x,y)
    except Http404:
        return HttpResponseNotFound()
    if not jams_png:
        if settings.DEBUG: logger.debug("JAMS NOT YET READY FOR: %s" % str((z,x,y)))
        # Tell a client to retrieve jams later
        time_after = time.time() + settings.DEFAULT_RETRY_TIMEOUT
        if settings.DEBUG: logger.debug("SEND RETRY AFTER FOR: %s" % str((z,x,y)))
        loc = reverse(jams,args=(z,x,y))
        if request.get_host():
            loc = request.build_absolute_uri(loc)
        r = HttpResponseRedirect( retry_after_server + '?' + urlencode({'redirect': loc,'retry_after':int(time_after)}) )
        r['Retry-After'] = settings.DEFAULT_RETRY_TIMEOUT
        return r
    # return a png got from the cache
    if settings.DEBUG: logger.debug("RETURNING JAMS FOR: %s" % str((z,x,y,http_date(jams_ts),http_date(time.time()))))
    r = HttpResponse(jams_png)
    r['Content-Type'] = "image/png"
    r['Content-Length'] = len(jams_png)
    r['Content-Location'] = request.path

    # extra data for cache negotiation
    # Common and ConditionalGet Django Middleware will do the rest
    r['ETag'] = '"%s"' % md5_constructor(jams_png).hexdigest()
    r['Last-Modified'] = http_date(jams_ts)
    r['Expires'] = http_date(jams_ts + jams_timeout)
    return r
...
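The cache negotiation headers set at the end of the fragment above can also be computed with the standard library alone. This is a minimal sketch assuming Python 3, with hashlib and email.utils.formatdate standing in for Django's md5_constructor and http_date. The function name cache_headers is hypothetical:

```python
import hashlib
from email.utils import formatdate


def cache_headers(content, created_ts, max_age):
    """Return the ETag, Last-Modified and Expires values for cached content.

    Hypothetical helper: content is bytes, created_ts is a Unix timestamp,
    max_age is the cache lifetime in seconds.
    """
    return {
        # A quoted MD5 digest of the body serves as a strong validator.
        "ETag": '"%s"' % hashlib.md5(content).hexdigest(),
        # usegmt=True yields the HTTP date format ending in "GMT".
        "Last-Modified": formatdate(created_ts, usegmt=True),
        "Expires": formatdate(created_ts + max_age, usegmt=True),
    }
```

Deriving Expires from the creation timestamp rather than from the current time means every client sees the same expiry moment for a given cached tile.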

At the application level, the request_jams() call tries to get the content from the cache and starts the content update process asynchronously if required. It returns the timestamp when the content was last created and the content taken from the cache, or a (None, None) pair if the requested content is not immediately available.
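The real request_jams() is not shown in this post. The following is only a sketch of the same contract, assuming Python 3, an in-memory dict in place of the real cache backend, and concurrent.futures in place of the threadpool package. The class name AsyncCache and its parameters are hypothetical:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class AsyncCache:
    """In-memory cache that never blocks the caller on content preparation."""

    def __init__(self, producer, max_age, workers=4):
        self._producer = producer  # callable(key) -> bytes, may be slow
        self._max_age = max_age    # seconds before an entry gets refreshed
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._lock = threading.Lock()
        self._entries = {}         # key -> (timestamp, content)
        self._pending = {}         # key -> Future of a running refresh

    def _refresh(self, key):
        # Runs in a worker thread; the slow call happens outside the lock.
        try:
            content = self._producer(key)
            with self._lock:
                self._entries[key] = (time.time(), content)
        finally:
            with self._lock:
                self._pending.pop(key, None)

    def request(self, key):
        """Return (timestamp, content), or (None, None) if nothing is cached."""
        with self._lock:
            entry = self._entries.get(key)
            stale = entry is None or time.time() - entry[0] > self._max_age
            # Start at most one refresh per key, however many requests arrive.
            if stale and key not in self._pending:
                self._pending[key] = self._pool.submit(self._refresh, key)
        return entry if entry is not None else (None, None)
```

The first request for a key returns (None, None) at once and triggers a single background job; later requests get the cached pair. In this sketch a stale entry is still returned while its refresh runs, which is one possible reading of "if required".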

The line starting with 'r = HttpResponseRedirect' redirects the request to the 'retry' server if the content is not immediately available.

I am using nginx as a front-end proxy server which passes requests to two separate fastcgi daemons, the 'retry' and the main Django application servers, separating requests by URL prefix. The following is a fragment of the nginx.conf configuration file:

server {
...
    location /jams {
        fastcgi_pass                        unix:/var/tmp/jams.sock;
    ...
    }
    location /retry_after {
        fastcgi_pass                        unix:/var/tmp/retry.sock;
    ...
    }
...
}

As you can see, the two locations are passed to different application server instances in the production environment, so they don't block each other. In fact both are implemented inside one Django project and started from the same directory as fastcgi daemon instances with different start parameters (sockets to listen on, log files etc.).

I've used this structure to make my server easy to configure and debug: I can either start a single development server for both locations, or start the production server in the described 'split' mode. Both servers (jams and retry) use the same DEFAULT_RETRY_TIMEOUT setting from the settings.py file.

DEFAULT_RETRY_TIMEOUT plays the role of a load balancing parameter for requests which wait for an asynchronous job to finish. After the retry timeout has passed, control returns to the client via the redirection response (or, for smart clients, the client itself waits for the retry timeout before requesting the resource again), so the server can clean up the resources temporarily acquired by these requests.

An additional bonus of this technique is the stability of the server processor load. Even at peak time, the server load is stable and controlled: the number of application instances and the number of threads in the thread pool (executing asynchronous jobs) for each instance are fixed by configuration, the nginx load per request is minimal, and concurrent requests are balanced (between server, network, and clients) by the retry timeout and redirect responses.

You can see the result of this work on our site, e.g. on this page, where all jam tiles (transparent version) are painted in this manner. The only difference in the production environment is that some top zoom levels of the most popular regions are refreshed by a special daemon (also written with Django, as a 'management command'), which forces tiles to be refreshed in the cache even when no client is requesting them right now, to decrease the response time of the jams map after a long 'sleep' period.

#retry/views.py file
from django.http import HttpResponse, HttpResponseRedirect, HttpResponseNotFound, HttpResponseBadRequest, Http404
from django.utils.http import urlencode
from django.conf import settings
from django.utils.encoding import iri_to_uri
import time
import logging
logger = logging.getLogger('retry')
# Create your views here.
def retry_after(request):
    default_timeout = settings.DEFAULT_RETRY_TIMEOUT
    if 'redirect' not in request.GET:
        return HttpResponseBadRequest('redirect parameter required')
    redirect = request.GET['redirect']
    time_retry = time.time() + default_timeout
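    if 'retry_after' in request.GET:
        try:
            time_retry = int(float(request.GET['retry_after']))
        except (ValueError, OverflowError):
            # a malformed timestamp is a client error, not a server error
            return HttpResponseBadRequest('retry_after must be a number')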
    if settings.DEBUG: logger.debug("RETRY AFTER GOT FOR: %s" % redirect)
    time_now = time.time()
    if time_now < time_retry:
        if settings.DEBUG: logger.debug("RETRY AFTER IS FROM DUMB FOR: %s" % redirect)
        sleep_timeout = time_retry - time_now
        if sleep_timeout > default_timeout: sleep_timeout =  default_timeout
        # the client is dumb, we need to wait on the server side
        time.sleep(sleep_timeout)
    time_now = time.time()
    if time_now < time_retry:
        if settings.DEBUG: logger.debug("RESEND RETRY AFTER FOR: %s" % redirect)
        r = HttpResponseRedirect( iri_to_uri(request.path) + '?' + urlencode( {'retry_after':int(time_retry), 'redirect':redirect} ) )
        r['Retry-After'] = int(time_retry - time_now) + 1
        return r
    # Tell a client to retrieve original URL
    if settings.DEBUG: logger.debug("REDIRECT TO ORIGINAL URL FOR: %s" % redirect)
    r = HttpResponseRedirect( redirect )
    return r
