Login

Overwriting file storage

Author:
rbanffy
Posted:
May 31, 2010
Language:
Python
Version:
1.2
Tags:
file storage
Score:
0 (after 2 ratings)

Useful if you can make sure your files will never collide unless they share the same contents.

You point to the storage parameter when you declare an ImageField and use the hashed_path class method (seemed to make sense to bundle it with the class) to build a custom upload_to method that has to generate the path based on the file contents.

Combining both will allow you to deduplicate files uploaded repeatedly.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
from django.core.files.storage import FileSystemStorage
from django.core.files.storage import Storage
from django.conf import settings
from django.core.files import locks, File

import hashlib
import os
import errno

# A handy base-conversion thingie
from base_n_utils import base_n_encode

class OverwritingStorage(FileSystemStorage):
    """
    FileSystemStorage that allows overwriting files
    """
    def get_available_name(self, name):
        return name

    def _save(self, name, content):
        """
        lifted from django/core/files/storage.py
        """
        full_path = self.path(name)

        directory = os.path.dirname(full_path)
        if not os.path.exists(directory):
            os.makedirs(directory)
        elif not os.path.isdir(directory):
            raise IOError("%s exists and is not a directory." % directory)

        # This file has a file path that we can move.
        if hasattr(content, 'temporary_file_path'):
            file_move_safe(content.temporary_file_path(), full_path)
            content.close()
            
        # This is a normal uploadedfile that we can stream.
        else:
            # This fun binary flag incantation makes os.open throw an
            # OSError if the file already exists before we open it.
            try:
                fd = os.open(full_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL | getattr(os, 'O_BINARY', 0))
                locks.lock(fd, locks.LOCK_EX)
                for chunk in content.chunks():
                    os.write(fd, chunk)
                locks.unlock(fd)
                os.close(fd)
            except OSError, e:
                if e.errno != errno.EEXIST:
                    raise
                else:
                    # TODO: Should check if files are identical and shout if they are not
                    pass

        if settings.FILE_UPLOAD_PERMISSIONS is not None:
            os.chmod(full_path, settings.FILE_UPLOAD_PERMISSIONS)

        return name

    def delete(self, name):
        """
        Never deletes anything

        TODO: Could conceivably check if any ImageFields in the application are pointing to it
        """
        pass

    @classmethod
    def hashed_path(cls, image):
        m = hashlib.md5()
        m.update(image.read())
        num = int(m.hexdigest(), 16)
        md5_36 = base_n_encode(num, '0123456789abcdefghijklmnopqrstuvwxyz')
        pathname = settings.MEDIA_ROOT + '/images/' + md5_36[:2] + '/' + md5_36[2:4] + '/' + md5_36[4:6] + '/' + md5_36 + '.jpg'
        return pathname

More like this

  1. File storage with a better rename method by SmileyChris 6 years, 1 month ago
  2. Database file storage by powerfox 6 years, 7 months ago
  3. TemplatedForm by alexkoval 8 years, 5 months ago
  4. Coffeescript compilation by delfick 2 years, 7 months ago
  5. Custom collectstatic that uses etag and md5 digests to determine whether files on S3 have changed by millarm 2 years, 6 months ago

Comments

Please login first before commenting.