Sorl Thumbnail + Amazon S3

Author: skoczen
Posted: June 10, 2009
Language: Python
Version: 1.0
Score: 3 (after 3 ratings)

General notes:

  • Set MEDIA_URL (or whatever you use for uploaded content) to point to S3, e.g. MEDIA_URL = "http://s3.amazonaws.com/MyBucket/".

  • Put django-storage in project_root/libraries, or change the paths to make you happy.

  • This uses the functionality of django-storage, but not as DEFAULT_FILE_STORAGE.

The functionality works like so:

Getting stuff to S3

  • When a file is uploaded to a model that has been hooked up for this (see Screenshot.save() in models.py below), a copy of the uploaded file is saved to S3.

  • On any thumbnail generation, a copy is also saved to S3.

On a page load:

  1. We check to see if the thumbnail exists locally. If so, we assume it's been sent to S3 and move on.

  2. If it's missing, we check to see if S3 has a copy. If so, we download it and move on.

  3. If S3 doesn't have the thumb either, we check to see if the source image exists locally. If so, we make a new thumb (which uploads itself to S3), and move on.

  4. If the source is also missing locally, we see if it's on S3, and if so, get it, thumb it, push the thumb back up, and move on.

  5. If all of that fails, somebody deleted the image, or things have gone fubar'd.
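The five steps above can be sketched as a small decision function. This is only an illustration, not the patched Sorl Thumbnail code: resolve_thumbnail is a hypothetical name, and two plain sets stand in for the local disk and the S3 bucket. It also skips the file-date comparison that the real generate() performs.

```python
def resolve_thumbnail(thumb, source, local, remote):
    """Return the list of actions taken to make `thumb` available locally.

    `local` and `remote` are sets of file names standing in for the
    server's disk and the S3 bucket (both hypothetical stand-ins).
    """
    actions = []
    # Step 1: the thumb exists locally, so assume S3 already has it.
    if thumb in local:
        return actions
    # Step 2: the thumb is on S3, so download it.
    if thumb in remote:
        local.add(thumb)
        actions.append("pull thumb")
        return actions
    # Step 4: the source is missing locally, so try to fetch it from S3.
    if source not in local:
        if source not in remote:
            # Step 5: nothing to work with anywhere.
            actions.append("give up")
            return actions
        local.add(source)
        actions.append("pull source")
    # Step 3: make the thumb from the local source and push it to S3.
    local.add(thumb)
    remote.add(thumb)
    actions.extend(["generate thumb", "push thumb"])
    return actions


# The source only exists on S3, so it is pulled, thumbed, and pushed back.
local, remote = set(), {"screenshots/a.png"}
print(resolve_thumbnail("screenshots/a_thumb.jpg", "screenshots/a.png", local, remote))
# → ['pull source', 'generate thumb', 'push thumb']
```

A second call with the same arguments returns an empty list, which is the fast path described under Advantages.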

Advantages:

  • Thumbs are checked locally, so everything after the initial creation is very fast.

  • You can clear out local files to save disk space on the server (one assumes you needed S3 for a reason), and trust that only the thumbs should ever be downloaded.

  • If you want to be really clever, you can delete the original source files, and zero-byte the thumbs. This means very little space cost, and everything still works.

  • If you're not actually low on disk space, Sorl Thumbnail keeps working just like it did, except your content is served by S3.
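The zero-byte trick mentioned above could look something like this. It is a rough sketch, not part of the original snippet: zero_byte_files is a hypothetical helper, and it assumes the directory you pass in holds only thumbs that have already been pushed to S3. Truncating a file also resets its modification time to now, so the emptied thumb still counts as newer than its source.

```python
import os


def zero_byte_files(directory):
    """Truncate every regular file under `directory` to zero bytes.

    File names are kept, so a local existence check still succeeds.
    Returns the list of paths that were truncated.
    """
    truncated = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            # Opening in 'wb' mode truncates the file to zero length.
            open(path, "wb").close()
            truncated.append(path)
    return truncated
```

Point it only at the thumbnail directory; run against anything else, it will happily empty files you wanted to keep.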

Problems:

  • My python-fu is not as strong as those who wrote Sorl Thumbnail. I did tweak their code. Something may be wonky. YMMV.

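  • The relative_source property is a hack: it looks for the first 7 characters of relative_dest inside the source path and returns everything from the first match onward. If those 7 characters also appear earlier in the path, step 4 above will fail.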

  • Upload is slow, and the first thumbnailing is slow, because we wait for the transfers to S3 to complete. This isn't django-storage, so things do genuinely take longer.
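To make the relative_source problem concrete, here is a small sketch. relative_source_hack mirrors the slicing logic of the property (without its try/except), and relative_source_relpath is one possible alternative that is my assumption, not something the snippet or Sorl Thumbnail provides. The paths and the thumb name are made up for illustration.

```python
import posixpath


def relative_source_hack(source, relative_dest):
    # Same idea as the relative_source property: slice the source path
    # from the first occurrence of relative_dest's first 7 characters.
    # Note that find() returns -1 on no match, which slices off only the
    # last character of the path.
    start_str = relative_dest[:7]
    return source[source.find(start_str):]


def relative_source_relpath(source, media_root):
    # Hypothetical alternative: compute the path relative to MEDIA_ROOT.
    # posixpath keeps forward slashes, which is what S3 keys use.
    return posixpath.relpath(source, media_root)


thumb = "screenshots/a_thumb.jpg"

# Works: "screens" first appears at the upload directory.
print(relative_source_hack("/var/www/media/screenshots/a.png", thumb))
# → screenshots/a.png

# Fails: "screens" appears earlier in the path, so the slice starts too soon.
print(relative_source_hack("/srv/screens/media/screenshots/a.png", thumb))
# → screens/media/screenshots/a.png

print(relative_source_relpath("/srv/screens/media/screenshots/a.png",
                              "/srv/screens/media"))
# → screenshots/a.png
```

The relpath version depends on knowing MEDIA_ROOT inside the thumbnail class, which the original hack avoids, so treat it as a direction rather than a drop-in fix.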

# sorl/thumbnail/base.py:
def generate(self):
    """
    Generates the thumbnail if it doesn't exist or if the file date of the
    source file is newer than that of the thumbnail.
    """
    # Ensure dest(ination) attribute is set
    if not self.dest:
        raise ThumbnailException("No destination filename set.")
    
    # Imported up front so the name is bound on every path that reaches
    # the push_to_s3() call at the end of this method.
    import events.s3 as s3_events

    new_generated = False
    if not isinstance(self.dest, basestring):
        # We'll assume dest is a file-like instance if it exists but isn't
        # a string.
        self._do_generate()
        new_generated = True

    elif not isfile(self.dest) or (self.source_exists and
        getmtime(self.source) > getmtime(self.dest)):

        if s3_events.is_on_s3(self.relative_dest):
            # The thumb is on S3: download it.
            s3_events.pull_from_s3(self.relative_dest)
            self._source_exists = True
        else:
            # The thumb is not on S3.
            if not self.source_exists:
                # The source file is missing locally.
                if s3_events.is_on_s3(self.relative_source):
                    s3_events.pull_from_s3(self.relative_source)
                    self._source_exists = True
                else:
                    # The source is not on S3 either.
                    self._source_exists = False

            if self.source_exists:
                # Ensure the directory exists
                directory = dirname(self.dest)
                if not isdir(directory):
                    os.makedirs(directory)

                self._do_generate()
                new_generated = True

    if new_generated:
        s3_events.push_to_s3(self.relative_dest)

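def _get_relative_source(self):
    # Hack: find where the first 7 characters of relative_dest occur in the
    # source path and return everything from there on.
    try:
        start_str = self.relative_dest[:7]
        index = self.source.find(start_str)
    except (AttributeError, TypeError):
        # source or relative_dest is not a string (e.g. a file-like object).
        return self.source
    # find() returns -1 when there is no match; fall back to the full path.
    return self.source[index:] if index != -1 else self.source
relative_source = property(_get_relative_source)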




# events/s3.py
from django.conf import settings
import libraries.backends.s3 as s3

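def push_to_s3(file_path):
    # Copy a local file (path relative to MEDIA_ROOT) up to S3.
    # Note: MEDIA_ROOT must end with a slash for the path below to be valid.
    s3_storage = s3.S3Storage()
    # Images are binary data, so open the local file in binary mode.
    img_file = open("%s%s" % (settings.MEDIA_ROOT, file_path), 'rb')
    s3_img_file = s3_storage.open("%s" % (file_path), 'w')
    s3_img_file.write(img_file.read())
    img_file.close()
    s3_img_file.close()

def is_on_s3(file_path):
    # True if S3 already holds a copy of the file.
    s3_storage = s3.S3Storage()
    return s3_storage.exists(file_path)

def pull_from_s3(file_path):
    # Download a file from S3 to the same relative path under MEDIA_ROOT.
    s3_storage = s3.S3Storage()
    img_file = open("%s%s" % (settings.MEDIA_ROOT, file_path), 'wb')
    s3_img_file = s3_storage.open(file_path, 'r')
    img_file.write(s3_img_file.read())
    s3_img_file.close()
    img_file.close()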


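# models.py
from django.db import models

# SixLinksModel is the author's own base model class (not shown here).
class Screenshot(SixLinksModel):
    shot = models.ImageField("Screenshot", upload_to="screenshots")

    def save(self, *args, **kwargs):
        # Save locally first, then push a copy of the uploaded file to S3.
        super(Screenshot, self).save(*args, **kwargs)
        import events.s3 as s3_events
        # shot.name is the file's path relative to MEDIA_ROOT.
        s3_events.push_to_s3(self.shot.name)

    def __unicode__(self):
        return "%s" % (self.shot)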


# assumes django-storage is sitting in libraries, e.g. libraries/backends/s3.py is a file

# settings.py
AWS_ACCESS_KEY_ID = "YOUR-KEY"
AWS_SECRET_ACCESS_KEY = "YOUR-SECRET-KEY"
AWS_STORAGE_BUCKET_NAME = "YOUR-BUCKET"
from S3 import CallingFormat
AWS_CALLING_FORMAT = CallingFormat.PATH
