Tag "nginx"

Snippet List

Django nginx sendfile example

Use the nginx sendfile (X-Accel-Redirect) feature to serve files while still passing each request through Django. Can be used to serve media files only to logged-in users.
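
The idea can be sketched without any framework as a plain WSGI callable. This is only an illustration, not the snippet's code: the `/protected/` prefix (assumed to be an nginx `internal` location) and the `is_logged_in` cookie check are hypothetical stand-ins. In Django the same header would be set on an `HttpResponse` after the usual authentication check.

```python
# Assumption: nginx has an `internal` location with this (hypothetical)
# prefix that maps onto the real media directory.
INTERNAL_PREFIX = "/protected/"


def is_logged_in(environ):
    """Hypothetical stand-in for a real session check."""
    return environ.get("HTTP_COOKIE", "") == "session=valid"


def media_app(environ, start_response):
    """Approve the request in the backend, then let nginx send the file."""
    if not is_logged_in(environ):
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"login required"]
    # Keep only the last path component so a client cannot climb directories.
    filename = environ.get("PATH_INFO", "/").rsplit("/", 1)[-1]
    # Empty body: nginx serves the file named in the X-Accel-Redirect header.
    start_response("200 OK", [("X-Accel-Redirect", INTERNAL_PREFIX + filename)])
    return [b""]
```

The backend only decides who may see the file; the bytes themselves never pass through the Python process.
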

  • django
  • media
  • sendfile
  • nginx

Effective content caching for mass-load site using redirect feature

Hi all, I would like to share my idea and experience of using the HTTP redirect feature to organize effective content caching and to avoid blocking the site on mass requests for content that is not yet cached.

First of all, I should explain that I don't mean the usual template-based pages, for which Django's caching is enough in almost all cases. I am talking about content such as dynamically created images, external files and other content which may take an almost unpredictably long time to prepare, up to about 10 seconds.

Content which is already in the cache can be returned immediately. The question is: what *should* happen if the requested content is *not yet present* in the cache? HTML content is not a problem in this case: we can return an intermediate page with meta information asking the client to retry after some period of time. But non-HTML content, such as an image or JSON, is a problem.

The usual *blocking* behavior of the Django caching subsystem is not appropriate when content preparation takes *more than a fraction* of a second. For the whole period while the content is being prepared (painted, fetched from an external resource and so on), *hundreds* of users will *request the not yet cached* content until all available application server connections are *occupied* by waiting requests, and a new user will have to wait even if the content he requested is actually ready to be returned.

The celery package solves this problem *partially*: it allows a cache update to be scheduled asynchronously before the cache expires (I use the threadpool package for this purpose instead), which avoids blocking on content preparation, but the problem of not yet available content remains unresolved. We might schedule periodic cache updates to avoid such a situation, but this leads to unnecessary resource load and may be entirely inappropriate in some cases, e.g. for a service like CloudMade or Google "staticmap", which paints map content based on the request parameters.

My solution is *using the HTTP redirect feature*. An ideal solution would be available if most HTTP clients supported the Retry-After header in an HTTP redirect response. Unfortunately, none of the most popular clients (neither Mozilla, nor IE, nor Chrome) does. In practice they request the redirected URL immediately after receiving the redirect response and ignore the Retry-After header entirely. I hope that they *will* support this feature, but that does not help right now.

So, instead of delegating the waiting to the client side, we need to organize the waiting on the server. For this purpose I have created a small separate application server with its *own connection pool* and delegated the waiting task to it (see the snippet). This server needs one URL (/retry_after in the example) pointing to the view above, installed in url.py, and an additional DEFAULT_RETRY_TIMEOUT parameter in settings.py. The view uses the request's GET parameters, so Django will not try to cache its results.

The only task of this server is to *suspend* redirected requests until the requested time has *really arrived*, and then redirect the request to the original URL. To be able to process many requests simultaneously, it cuts the wait short for requests that would wait too long and redirects them back to itself, based on the default timeout.

Using this server is simple. We instantiate the 'retry' server on a separate URL as a *separate application server* and redirect all requests that need to be delayed to this server, passing the original URL and the moment in time when the requested content should be ready as parameters. The main server, which prepares and returns the content, *never waits*.
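
The waiting rule of the 'retry' server can be sketched as a small pure function. This is an illustration under assumptions, not the actual view: the function name `plan_retry`, the 5 second timeout and the default `/retry_after` location are hypothetical, while the `redirect` and `retry_after` parameter names follow the description.

```python
from urllib.parse import urlencode

DEFAULT_RETRY_TIMEOUT = 5  # seconds; hypothetical value of the setting


def plan_retry(redirect_url, retry_after, now,
               retry_server="/retry_after", max_wait=DEFAULT_RETRY_TIMEOUT):
    """Return (seconds_to_sleep, location_to_redirect_to).

    redirect_url -- the original URL the client asked for
    retry_after  -- Unix time at which the content should be ready
    now          -- current Unix time
    """
    remaining = max(0, retry_after - now)
    if remaining <= max_wait:
        # Short wait: sleep it out, then send the client back to the content.
        return remaining, redirect_url
    # Too long to hold one connection: wait one timeout, then redirect the
    # client back to the retry server with the same parameters.
    query = urlencode({"redirect": redirect_url, "retry_after": int(retry_after)})
    return max_wait, retry_server + "?" + query
```

A view built on this would sleep for the returned number of seconds and then answer with a redirect to the returned location, so no single request holds a connection longer than the default timeout.
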

The main server returns prepared content from the cache if it is found there, or *starts* the content preparation process in an *asynchronous* manner using threads, the celery package, or any other appropriate technique, and *immediately redirects* the request to the 'retry' server. The following is a fragment of the code using the 'retry' server:

    ...
    # Create your views here.
    def jams(request, z, x, y):
        jams_timeout = settings.JAMS_TIMEOUT
        retry_after_server = settings.RETRY_AFTER_SERVER
        try:
            jams_ts, jams_png = request_jams(z, x, y)
        except Http404:
            return HttpResponseNotFound()
        if not jams_png:
            if settings.DEBUG:
                logger.debug("JAMS NOT YET READY FOR: %s" % str((z, x, y)))
            # Tell a client to retrieve jams later
            time_after = time.time() + settings.DEFAULT_RETRY_TIMEOUT
            if settings.DEBUG:
                logger.debug("SEND RETRY AFTER FOR: %s" % str((z, x, y)))
            loc = reverse(jams, args=(z, x, y))
            if request.get_host():
                loc = request.build_absolute_uri(loc)
            r = HttpResponseRedirect(
                retry_after_server + '?' +
                urlencode({'redirect': loc, 'retry_after': int(time_after)})
            )
            r['Retry-After'] = settings.DEFAULT_RETRY_TIMEOUT
            return r
        # return a png got from the cache
        if settings.DEBUG:
            logger.debug("RETURNING JAMS FOR: %s" % str((z, x, y, http_date(jams_ts), http_date(time.time()))))
        r = HttpResponse(jams_png)
        r['Content-Type'] = "image/png"
        r['Content-Length'] = len(jams_png)
        r['Content-Location'] = request.path
        # extra data for cache negotiation
        # Common and ConditionalGet Django Middleware will do the rest
        r['ETag'] = '"%s"' % md5_constructor(jams_png).hexdigest()
        r['Last-Modified'] = http_date(jams_ts)
        r['Expires'] = http_date(jams_ts + jams_timeout)
        return r
    ...

At the application level, the request_jams() call tries to get the content from the cache and starts the content update process asynchronously if required. It returns the content from the cache together with the timestamp of when the content was last created, or a pair of None, None values if the requested content is not immediately available.

The line starting with 'r = HttpResponseRedirect' in the jams view redirects the request to the 'retry' server if the content is not immediately available.

I am using nginx as a 'front-end' proxy server which passes requests to two separate FastCGI daemons, the 'retry' and main Django application servers, separating requests by prefix. The following is a fragment of the nginx.conf configuration file for the nginx server:

    server {
        ...
        location /jams {
            fastcgi_pass unix:/var/tmp/jams.sock;
            ...
        }
        location /retry_after {
            fastcgi_pass unix:/var/tmp/retry.sock;
            ...
        }
        ...
    }

As you can see, the two locations are passed to *different* application server instances in the production environment, so they *don't block each other*. In fact both are implemented inside one Django project and started from the same directory as FastCGI daemon instances with different startup parameters (sockets to listen on, log files etc.). I've used this structure so that I can configure and debug my server easily: I can either start a single development server for both locations, or start the production server in the described 'split' mode.

Both servers (jams and retry) use the same DEFAULT_RETRY_TIMEOUT setting from the settings.py file. DEFAULT_RETRY_TIMEOUT plays the role of a load balancing parameter for requests which wait for an asynchronous job to finish. After the retry timeout has passed, control returns to the client via the redirect response (or, for smart clients, the client itself waits for the retry timeout before requesting the resource again), so the server can clean up the resources temporarily acquired by these requests.

An additional bonus of this technique is the stability of the server's processor load.
Even at peak time, the server load is stable and controlled: the number of application instances and the number of threads in the thread pool (running asynchronous jobs) for each instance are fixed by configuration, the nginx load per request is minimal, and concurrent requests are balanced (between server, network, and clients) by the retry timeout and redirect responses.

You can see the result of this work on our site, e.g. [on this page](http://www.doroga.tv/nnov/apps/map/), where all jam tiles (transparent version) are painted in this manner. The only difference in the production server environment is that some top zoom levels of the most popular regions are refreshed by a special daemon (also written with Django, as a 'management command'), which forces tiles to be refreshed in the cache even when no client is requesting them right now, to decrease the response time of the jams map after a long 'sleep' period.

  • cache
  • load
  • redirect
  • nginx
  • content
  • mass

django on tornado

There have been many posts on running Django on Tornado with static media served by nginx. But for dumb people like me, the whole thing needs to be spelt out. So here is how I succeeded in serving Django from a virtual host using nginx and Tornado. The key thing to note is that 'root' refers to the **parent** directory of the root and not the full path. Also remember to put in ';' as a line end. Procedure: start the Tornado server with the Python script on localhost:8888, then start nginx. Relax and enjoy your Django at the speed of light. Nginx can be installed with apt-get or yum, but you need the latest git clone of Tornado, because the default tarball does not support Django. By the way, this install is for FC11 on my laptop; I have also done it in production on lenny.

  • nginx
  • tornado

minimal nginx conf to split get/post requests

After a point the SQL server becomes the bottleneck in many web applications, and to scale, master-slave replication with a single master and multiple slaves is recommended. This nginx setup can be used to distribute traffic between master and slave based on the request method.
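
The routing rule itself is simple enough to state in a few lines of Python. The snippet does this in the nginx configuration; the class below is only an illustration of the decision, with hypothetical names, and it assumes that HEAD is treated like GET and that slaves are used in round-robin order.

```python
import itertools

# Assumption: only these methods are treated as read-only.
READ_METHODS = frozenset(["GET", "HEAD"])


class MethodRouter:
    """Pick a backend by HTTP method: reads go to slaves, writes to master."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)  # simple round-robin

    def backend_for(self, method):
        if method.upper() in READ_METHODS:
            return next(self._slaves)
        return self.master
```

For example, `MethodRouter("master:8000", ["slave1:8000", "slave2:8000"])` would send consecutive GET requests to the two slaves in turn and every POST to the master.
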

  • middleware
  • nginx
  • loadbalancing
  • master-slave

nginx x-accel-redirect protection of static files

This snippet requires nginx as your front-end server (for serving static files) and any Django-enabled server as a backend that only gets the dynamic requests (I use Apache with mod_python). If you have no idea what I'm talking about, you probably won't need this snippet.

I previously tried something [similar](http://www.djangosnippets.org/snippets/62/) just using mod_python, but this was too unstable for my needs (the PythonAuthenHandler seems to be called multiple times for no apparent reason). The patch from that snippet was also used as a base for [this ticket](http://code.djangoproject.com/ticket/3583).

This is part of an authentication mechanism I use for protecting static files. Nginx has the so-called x-accel-redirect feature, which tells nginx to serve an internal (read 'protected') file if the backend response has the ['X-Accel-Redirect'] header set. No other headers are touched, but by deleting all relevant headers in the default Django response, nginx will create those headers for us.

The usage is pretty simple:

  • set up nginx as a proxy for Apache + mod_python + Django (google for this if you don't know how)
  • configure nginx as shown in the code snippet (for testing, leave out the internal part to see if the files are accessible)
  • configure your urls.py to point to the validation view
  • make your site's hrefs point to the view instead of the file directly (the real URLs will be completely hidden from your visitors)
  • Done!
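
The header handling described above can be illustrated with a small helper. It is a sketch, not the snippet's view: `accel_headers` is a hypothetical name, and which headers count as 'relevant' is an assumption (here only Content-Type is dropped so that nginx can set it from the served file).

```python
def accel_headers(default_headers, internal_uri, drop=("content-type",)):
    """Build the header list for a response that hands the file to nginx.

    default_headers -- (name, value) pairs the backend would normally send
    internal_uri    -- URI inside an nginx `internal` location
    drop            -- lower-case names of headers nginx should generate itself
    """
    kept = [(name, value) for name, value in default_headers
            if name.lower() not in drop]
    kept.append(("X-Accel-Redirect", internal_uri))
    return kept
```

In a Django view the equivalent would be deleting the unwanted headers from the response object and then setting `response['X-Accel-Redirect']` once the validation has passed.
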

  • authentication
  • nginx

5 snippets posted so far.