Overwriting file storage

Author: rbanffy
Posted: May 31, 2010
Language: Python
Version: 1.2
Score: 0 (after 2 ratings)

Useful when you can guarantee that two files will never share a path unless they have the same contents.

Pass an instance of this class as the storage argument when you declare an ImageField, and use the hashed_path class method (it seemed sensible to bundle it with the class) to build a custom upload_to callable that generates the path from the file contents.

Combining the two deduplicates files that are uploaded more than once.
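
The deduplication idea can be sketched without Django. This is a minimal illustration, not the snippet's code: `store_once` is a hypothetical helper, it takes the contents as bytes, and it uses the plain hex digest instead of a base-36 encoding. The file path is derived from the MD5 of the contents, and the `O_EXCL` flag makes a second write of the same contents a no-op.

```python
import errno
import hashlib
import os


def store_once(root, data):
    """Write data under root at a path derived from its MD5 digest.

    Returns (path, created). created is False when a file with the same
    contents was already stored. Hypothetical helper for illustration.
    """
    digest = hashlib.md5(data).hexdigest()
    directory = os.path.join(root, digest[:2], digest[2:4])
    path = os.path.join(directory, digest)
    if not os.path.isdir(directory):
        os.makedirs(directory)
    try:
        # O_EXCL makes os.open fail with EEXIST if the file already exists.
        flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL | getattr(os, 'O_BINARY', 0)
        fd = os.open(path, flags)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise
        # Same digest, so (barring a hash collision) same contents: skip.
        return path, False
    with os.fdopen(fd, 'wb') as f:
        f.write(data)
    return path, True
```

Storing the same bytes twice returns the same path both times, and only the first call creates a file.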

from django.conf import settings
from django.core.files import locks
from django.core.files.move import file_move_safe
from django.core.files.storage import FileSystemStorage

import hashlib
import os
import errno

# A handy base-conversion thingie
from base_n_utils import base_n_encode

class OverwritingStorage(FileSystemStorage):
    """
    FileSystemStorage that allows overwriting files
    """
    def get_available_name(self, name):
        return name

    def _save(self, name, content):
        """
        lifted from django/core/files/storage.py
        """
        full_path = self.path(name)

        directory = os.path.dirname(full_path)
        if not os.path.exists(directory):
            os.makedirs(directory)
        elif not os.path.isdir(directory):
            raise IOError("%s exists and is not a directory." % directory)

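        # This file has a file path that we can move.
        if hasattr(content, 'temporary_file_path'):
            # allow_overwrite=True: without it, file_move_safe raises when
            # full_path already exists (e.g. a repeated upload).
            file_move_safe(content.temporary_file_path(), full_path, allow_overwrite=True)
            content.close()
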
        # This is a normal uploadedfile that we can stream.
        else:
            # This fun binary flag incantation makes os.open throw an
            # OSError if the file already exists before we open it.
            try:
                fd = os.open(full_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL | getattr(os, 'O_BINARY', 0))
                locks.lock(fd, locks.LOCK_EX)
                for chunk in content.chunks():
                    os.write(fd, chunk)
                locks.unlock(fd)
                os.close(fd)
            except OSError as e:
                if e.errno != errno.EEXIST:
                    raise
                else:
                    # TODO: Should check if files are identical and shout if they are not
                    pass

        if settings.FILE_UPLOAD_PERMISSIONS is not None:
            os.chmod(full_path, settings.FILE_UPLOAD_PERMISSIONS)

        return name

    def delete(self, name):
        """
        Never deletes anything

        TODO: Could conceivably check if any ImageFields in the application are pointing to it
        """
        pass

    @classmethod
    def hashed_path(cls, image):
        """
        Builds a path under MEDIA_ROOT from the MD5 of the file contents.

        Reads the whole file into memory and always uses a .jpg extension.
        """
        m = hashlib.md5()
        m.update(image.read())
        num = int(m.hexdigest(), 16)
        md5_36 = base_n_encode(num, '0123456789abcdefghijklmnopqrstuvwxyz')
        pathname = settings.MEDIA_ROOT + '/images/' + md5_36[:2] + '/' + md5_36[2:4] + '/' + md5_36[4:6] + '/' + md5_36 + '.jpg'
        return pathname
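
The `base_n_utils` module imported above is not shown in the snippet. A minimal sketch of what `base_n_encode` might look like follows, assuming it takes a non-negative integer and a digit alphabet, as the call in `hashed_path` suggests. `relative_hashed_path` is a hypothetical helper that reproduces the same directory layout without the `MEDIA_ROOT` prefix, so it can be run outside Django.

```python
import hashlib

BASE36 = '0123456789abcdefghijklmnopqrstuvwxyz'


def base_n_encode(num, alphabet):
    """Encode a non-negative integer using the given digit alphabet.

    Assumed behavior; the original base_n_utils module is not shown.
    """
    if num < 0:
        raise ValueError("num must be non-negative")
    if num == 0:
        return alphabet[0]
    base = len(alphabet)
    digits = []
    while num:
        num, remainder = divmod(num, base)
        digits.append(alphabet[remainder])
    # Digits were collected least significant first.
    return ''.join(reversed(digits))


def relative_hashed_path(data):
    """Return the snippet's sharded path layout for the given bytes."""
    md5_36 = base_n_encode(int(hashlib.md5(data).hexdigest(), 16), BASE36)
    return '/'.join(['images', md5_36[:2], md5_36[2:4], md5_36[4:6], md5_36 + '.jpg'])
```

The three two-character directory levels keep any single directory from collecting every upload, and identical contents always map to the same path.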
