Flickr Sync

Author: bretwalker
Posted: June 29, 2007
Language: Python
Version: .96
Tags: flickr photos photo flicker
Score: 7 (after 7 ratings)

This code provides a Django model for photos based on Flickr, as well as a script to perform a one-way sync from Flickr to a Django installation. Please note that the snippet contains code for two files, flickrupdate.py and a Django model. The two chunks are separated by:

"""
END OF FLICKRUPDATE
"""

"""
START DJANGO PHOTO MODEL
Requires django-tagging (http://code.google.com/p/django-tagging/)
"""

My model implements tagging in the form of the wonderful django-tagging app by Jonathan Buchanan, so be sure to install it before trying to use my model.

The flickrupdate.py code uses a modified version of flickrlib.py (http://code.google.com/p/flickrlib/). Flickr occasionally returns invalid XML, which Python's parser won't stand for. I got around this by wrapping the returned XML in <flickr_root> tags.

To modify flickrlib to work with my code, simply change this line:

return self.parseData(getattr(self._serverProxy, '.'.join(n))(kwargs))

to:

return self.parseData('<flickr_root>' + getattr(self._serverProxy, '.'.join(n))(kwargs) + '</flickr_root>')

I hate this workaround, but I can't control what Flickr returns.
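Why the wrapper helps can be seen in a small standalone sketch. It assumes the invalid responses are fragments with more than one top-level element; that is a guess, since the exact malformed output is not shown here. The payload is made up, and the code is written for a current Python interpreter rather than the Python 2 used by the snippet itself.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical payload: two top-level elements, so it is not a
# well-formed XML document on its own.
fragment = '<photo id="1" /><photo id="2" />'


def parses(text):
    """Return True if text is a well-formed XML document."""
    try:
        minidom.parseString(text)
        return True
    except ExpatError:
        return False


# Same wrapping as the flickrlib change: one synthetic root element.
wrapped = '<flickr_root>' + fragment + '</flickr_root>'

bare_ok = parses(fragment)
wrapped_ok = parses(wrapped)
print(bare_ok, wrapped_ok)
```

The trade-off is that every parsed response gains one extra level of nesting, which the calling code has to account for.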

flickrupdate will handle the addition and deletion of photos, sets and tags. It also keeps track of photos' pools, although right now it doesn't delete unused pools. This is mostly because I don't care about unused pools hanging around. It's a simple enough addition, so I'll probably add it when I have a need.
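The add/delete bookkeeping amounts to comparing the ids stored locally with the ids Flickr reports. A minimal sketch of that comparison using sets follows. The function name plan_sync and the sample ids are hypothetical and not part of flickrupdate.py, which builds a "prune" list and removes ids from it as they are matched.

```python
def plan_sync(local_ids, remote_ids):
    """Split ids into those to create, re-check and delete locally.

    local_ids: ids currently stored in the local database.
    remote_ids: ids reported by the remote service.
    """
    local = set(local_ids)
    remote = set(remote_ids)
    to_create = sorted(remote - local)   # on the remote side only
    to_check = sorted(remote & local)    # on both sides, may need an update
    to_delete = sorted(local - remote)   # gone from the remote side
    return to_create, to_check, to_delete


# Made-up ids purely for illustration.
to_create, to_check, to_delete = plan_sync([1, 2, 3], [2, 3, 4])
print(to_create, to_check, to_delete)
```

Because deletions are inferred from absence, this only works when remote_ids is the complete list, which is why the script fetches every photo.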

Be sure to set the appropriate information on these lines:

api_key = "YOUR API KEY"
api_secret = "YOUR FLICKR SECRET"
flickr_uid = 'YOUR FLICKR USER ID'

I hadn't seen a Django model and syncing script, so I threw these together. I hope they will be useful to those wanting to start syncing their photos.

"""
flickrupdate.py

Requires flickrlib.py (http://monotonous.org/2005/11/26/flickrlib-05/)

Syncs Django flickr model with flickr.
Sync is one way (flickr->django), as I didn't see any need to edit photos in Django.

Copyright (c) 2007, Bret Walker

All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
Neither the name of the Bret Walker nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
"""

from math import ceil
from datetime import datetime
import sys

from flickrlib import FlickrAgent

from app.photos.models import Photo
from app.photos.models import Set
from app.photos.models import Pool

api_key = "YOUR API KEY"
api_secret = "YOUR FLICKR SECRET"
flickr_uid = 'YOUR FLICKR USER ID'

agent = FlickrAgent(api_key, api_secret)

def update_sets():
    
    """Get sets from Flickr"""
    photo_sets = agent.flickr.photosets.getList()['photosets'][0]
    
    """Create a list of all set ids (flickr ids).  Items will be removed from the list if they're found to be in the local database"""
    sets_to_prune = []
    current_sets = Set.objects.values('set_id')
    for current_set in current_sets:
        sets_to_prune.append(int(current_set['set_id']))    
    
    """Iterate through photo sets returned by flickr"""
    for photo_set in photo_sets['photoset']:
        
        """Create a new set from the info given to us by flickr and save it to the database"""
        s = Set(set_id=int(photo_set['id']),
        primary=photo_set['primary'],
        secret=photo_set['secret'],
        server=int(photo_set['server']),
        farm=int(photo_set['farm']),
        title=photo_set['title'][0]['text'],
        description=photo_set['description'][0]['text'],
        photos=int(photo_set['photos']),
        )
        
        """Try to find a set in the database which has the same set id as given to us by flickr"""
        try:
            matching_set = Set.objects.get(set_id=photo_set['id'])
            """Found matching set, remove it from the list of sets to be removed from local database"""
            sets_to_prune.remove(int(photo_set['id']))
            
            """Iterate through each of the fields, looking for differences.  Skip the first (auto id) field, since it will always differ"""
            for k in s._meta.fields[1:]:
                """If a difference is found, go ahead and commit the s Set to the database and break out of this loop"""
                if str(getattr(s, k.name)) != str(getattr(matching_set, k.name)):
                    s.id = matching_set.id
                    s.save()
                    print 'Updated set ' + s.title + ' k: ' + k.name
                    break
                                
        except Set.DoesNotExist:
            """Exception thrown by above try meaning there is no record of an object with the set id given by flickr"""
            s.save()
            print 'Created set: ' + s.title
        
    
    """Iterate through the sets_to_prune list, deleting any remaining sets from the local database"""            
    for photo_set in sets_to_prune:
        s = Set.objects.get(set_id=photo_set)
        print 'Deleted set: ' + s.title
        s.delete()
    
def update_photos():
    
    """Empty dict to hold photo ids and update times"""
    photo_ids = {}
    """Set number to fetch, and whether to get additional photos if there are more than the number you're fetching.
    Max number to fetch is 500. """
    num_to_fetch = 500
    """This has to be True because Flickr doesn't provide a way to get a list of deleted photos.
    You have to get all photos to know which ones in your Django database are no longer on Flickr."""
    get_all = True
    
    photos_to_prune = []
    for p in Photo.objects.all():
        photos_to_prune.append(int(p.photo_id))
    
    """Search photos, getting all belonging to you"""
    photos = agent.flickr.photos.search(user_id=flickr_uid, per_page=num_to_fetch, extras="last_update")['photos'][0]
    
    """Store photo ids returned by search"""
    for photo in photos['photo']:
        photo_ids[photo['id']] = photo['lastupdate']
    
    """If get_all is true, then loop through all photos"""
    if get_all:
        """Get the total number of photos that flickr reports"""
        total_photos = int(photos['total'])
        
        """Make sure there are more photos than the number we've already fetched"""
        if total_photos > num_to_fetch:
            """Loop, searching flickr for remaining photos"""
            while total_photos > len(photo_ids):
                photos = agent.flickr.photos.search(user_id=flickr_uid, per_page=num_to_fetch, page=(len(photo_ids)+num_to_fetch)/num_to_fetch, extras="last_update")['photos'][0]

                """Store photo ids returned by search"""
                for photo in photos['photo']:
                    photo_ids[photo['id']] = photo['lastupdate']
                
    for photo_id in photo_ids.keys():
        try:
            matching_photo = Photo.objects.get(photo_id=photo_id)
            """Match found, so remove it from list of bad photos"""
            photos_to_prune.remove(int(photo_id))
            if datetime.fromtimestamp(float(photo_ids[photo_id])) > matching_photo.last_updated:
                p = Photo(**get_photo_info(photo_id))
                p.id = matching_photo.id
                p.save()
                print 'Updated photo: ' + p.title
                update_pools(photo_id)
                
        except Photo.DoesNotExist:
            p = Photo(**get_photo_info(photo_id))
            p.save()
            print 'Created photo: ' + p.title
            update_pools(photo_id) 
    
    """Iterate through the photos_to_prune list, deleting any remaining photos from the local database"""            
    for photo_id in photos_to_prune:
        p = Photo.objects.get(photo_id=photo_id)
        print "Deleted photo: " + p.title
        p.delete()    
    
def get_photo_info(photo_id):
    photo_info = agent.flickr.photos.getInfo(photo_id=photo_id)['photo'][0]
    
    return_data = {}
    return_data['photo_id'] = photo_id
    return_data['secret'] = photo_info['secret']
    return_data['server'] = photo_info['server']
    return_data['is_favorite'] = photo_info['isfavorite']
    return_data['farm'] = photo_info['farm']
    return_data['original_secret'] = photo_info['originalsecret']
    return_data['views'] = photo_info['views']
    return_data['original_format'] = photo_info['originalformat']
    return_data['license'] = photo_info['license']
    return_data['title'] = photo_info['title'][0]['text']
    return_data['description'] = photo_info['description'][0]['text']
    return_data['is_public'] = photo_info['visibility'][0]['ispublic']
    return_data['is_family'] = photo_info['visibility'][0]['isfamily']
    return_data['is_friend'] = photo_info['visibility'][0]['isfriend']
    return_data['date_taken'] = photo_info['dates'][0]['taken']
    return_data['date_uploaded'] = datetime.fromtimestamp(float(photo_info['dateuploaded']))
    return_data['last_updated'] = datetime.fromtimestamp(float(photo_info['dates'][0]['lastupdate']))
    return_data['comments'] = photo_info['comments'][0]['text']
    return_data['photo_page'] = photo_info['urls'][0]['url'][0]['text']

    try:
        return_data['tags'] = ",".join(["%s" % (d['text']) for d in photo_info['tags'][0]['tag']])
    except:
        """No tags"""
        pass
        
    try:
        return_data['latitude'] = photo_info['location'][0]['latitude']
        return_data['longitude'] = photo_info['location'][0]['longitude']
        return_data['accuracy'] = photo_info['location'][0]['accuracy']
        return_data['locality'] = photo_info['location'][0]['locality'][0]['text']
        return_data['county'] = photo_info['location'][0]['county'][0]['text']
        return_data['region'] = photo_info['location'][0]['region'][0]['text']
        return_data['country'] = photo_info['location'][0]['country'][0]['text']
        return_data['geo_is_public'] = photo_info['geoperms'][0]['ispublic']
        return_data['geo_is_contact'] = photo_info['geoperms'][0]['iscontact']
        return_data['geo_is_friend'] = photo_info['geoperms'][0]['isfriend']
        return_data['geo_is_family'] = photo_info['geoperms'][0]['isfamily']
    except:
        pass
    
    try:
        """Try to get the EXIF data"""
        """EXIF comes back a little funky, so it needs to be reformatted"""
    
        """Get EXIF data"""
        exif_info = agent.flickr.photos.getExif(photo_id=photo_id)['photo'][0]
    
        """Define a sort key, in this case the tag number"""
        def sortkey(item):
            return item.get("tag")
        
        """Sort the list"""
        exif_info['exif'].sort(key=sortkey)
        exif_info['exif'].reverse() # often two apertures, the second one normally includes only raw.  reverse so the second one contains clean too
        
        """Create a dict to be used to hold sorted, relevant data"""
        exif_info_sorted = {}
    
        """Pull out raw and clean tags.  Every attribute has raw, some have clean"""
        for d in exif_info['exif']:
            try:
                exif_info_sorted[d['label']] = {'raw': d['raw'], 'clean': d['clean']}
            except:
                try:
                    exif_info_sorted[d['label']] = {'clean': d['clean']}
                except:
                    exif_info_sorted[d['label']] = {'raw': d['raw']}
    
        
        def get_exif(attribute):
            """Return the selected attribute.
            This function will always try to return the clean attribute, but if it can't, it will return the raw version"""
            try:
                return exif_info_sorted[attribute]['clean'][0]['text']
            except:
                return exif_info_sorted[attribute]['raw'][0]['text']
        
        def add_data(django_attr, flickr_attr):
            
            try:
                return_data[django_attr] = get_exif(flickr_attr)
            except:
                pass
                
        add_data('exif_make', 'Make')
        add_data('exif_model','Model')
        add_data('exif_orientation','Orientation')
        add_data('exif_exposure','Exposure')
        add_data('exif_software','Software')
        add_data('exif_aperture','Aperture')
        add_data('exif_exposure_program','Exposure Program')
        add_data('exif_iso','ISO Speed')
        add_data('exif_metering_mode','Metering Mode')
        add_data('exif_flash','Flash')
        add_data('exif_focal_length','Focal Length')
        add_data('exif_color_space','Color Space')
        
    
    except:
        """No EXIF data, so just move along"""
        pass   
    
    return return_data

def update_photos_sets():
    sets = Set.objects.all()
    for s in sets:
        set_photos = agent.flickr.photosets.getPhotos(photoset_id=str(s.set_id))['photoset'][0]
        
        photos_to_prune = []
        current_photos = s.photo_set.all()
        for current_photo in current_photos:
            photos_to_prune.append(current_photo.photo_id)
                
        for p_id in set_photos['photo']:
            try:
                photos_to_prune.remove(p_id['id'])
            except:
                # not in set
                pass
                
            p = Photo.objects.get(photo_id=p_id['id'])
            if not (p.sets.filter(set_id=s.set_id)):
                p.save()
                p.sets.add(Set.objects.get(set_id=s.set_id))   
                print 'Added ' + p.title + ' to Set ' + s.title
        
        """Iterate through the photos_to_prune list, removing any remaining photos from this set"""            
        for photo in photos_to_prune:
            p = Photo.objects.get(photo_id=photo)
            p.sets.remove(Set.objects.get(set_id=s.set_id))
            print "Removed photo: " + p.title + " from Set " + s.title

def update_pools(photo_id):
    photo_context = agent.flickr.photos.getAllContexts(photo_id=photo_id)
    
    try:
        for pool in photo_context['pool']:
            """Try to find a pool in the database which has the same pool id as given to us by flickr"""
            try:
                matching_pool = Pool.objects.get(pool_id=pool['id'])
            
                """Found matching pool, associate it with this photo"""
                p = Photo.objects.get(photo_id=photo_id)
                p.save()
                p.pools.add(matching_pool)
                print "Added " + p.title + " to Pool " + matching_pool.title
                                
            except Pool.DoesNotExist:
                """Exception thrown by above try meaning there is no record of an object with the pool id given by flickr"""
                """Create a new pool from the info given to us by flickr and save it to the database"""
                # Use a separate name so the flickr pool dict is not shadowed
                new_pool = Pool(pool_id=pool['id'],
                title=pool['title'],
                )
                new_pool.save()
                print 'Created pool: ' + new_pool.title
                """Add photo to pool"""
                p = Photo.objects.get(photo_id=photo_id)
                p.save()
                p.pools.add(new_pool)
                print "Added " + p.title + " to Pool " + new_pool.title
    except:
        """No pool"""
        pass
    
if __name__ == '__main__':
    update_sets()
    update_photos()
    update_photos_sets()

"""
END OF FLICKRUPDATE
"""

"""
START DJANGO PHOTO MODEL
Requires django-tagging (http://code.google.com/p/django-tagging/)
"""
from django.db import models
from tagging.fields import TagField

FLICKR_LICENSES = (
    ('0', 'All Rights Reserved'),
    ('1', 'Attribution-NonCommercial-ShareAlike License'),
    ('2', 'Attribution-NonCommercial License'),
    ('3', 'Attribution-NonCommercial-NoDerivs License'),
    ('4', 'Attribution License'),
    ('5', 'Attribution-ShareAlike License'),
    ('6', 'Attribution-NoDerivs License'),
)

class Set(models.Model):
    title = models.CharField(maxlength=512)
    description = models.TextField(blank=True)
    set_id = models.CharField(maxlength=100) 
    primary = models.CharField(maxlength=512)
    secret = models.CharField(maxlength=512)
    server = models.IntegerField()
    farm = models.IntegerField()
    photos = models.IntegerField()
    
    class Admin:
        list_display = ('title', 'description', 'set_id')
        search_fields = ['title', 'description']
        
    def __str__(self):
        return '%s' % (self.title)
        
class Pool(models.Model):
    title = models.CharField(maxlength=512)
    pool_id = models.CharField(maxlength=100)
    
    class Admin:
        list_display = ('title', 'pool_id')
        search_fields = ['title']  # Pool has no description field
        
    def __str__(self):
        return '%s' % (self.title)

class Photo(models.Model):
    photo_id = models.CharField(maxlength=100) 
    secret = models.CharField(maxlength=512)
    server = models.IntegerField()
    is_favorite = models.BooleanField()
    farm = models.IntegerField()
    original_secret = models.CharField(maxlength=512)
    views = models.IntegerField()
    original_format = models.CharField(maxlength=512)
    license = models.CharField(maxlength=10, choices=FLICKR_LICENSES)
    title = models.CharField(maxlength=512)
    description = models.TextField(blank=True)
    is_public = models.BooleanField()
    is_friend = models.BooleanField()
    is_family = models.BooleanField()
    date_taken = models.DateTimeField()
    date_uploaded = models.DateTimeField()
    last_updated = models.DateTimeField()
    comments = models.IntegerField()
    photo_page = models.URLField()
    tags = TagField(blank=True)
    sets = models.ManyToManyField(Set, blank=True)
    pools = models.ManyToManyField(Pool, blank=True)
    latitude = models.CharField(maxlength=512, blank=True)
    longitude = models.CharField(maxlength=512, blank=True)
    accuracy = models.IntegerField(blank=True, null=True)
    locality = models.CharField(maxlength=512, blank=True)
    county = models.CharField(maxlength=512, blank=True)
    region = models.CharField(maxlength=512, blank=True)
    country = models.CharField(maxlength=512, blank=True)
    geo_is_public = models.BooleanField()
    geo_is_contact = models.BooleanField()
    geo_is_friend = models.BooleanField()
    geo_is_family = models.BooleanField()
    exif_make = models.CharField(maxlength=512, blank=True)
    exif_model = models.CharField(maxlength=512, blank=True)
    exif_orientation = models.CharField(maxlength=512, blank=True)
    exif_exposure = models.CharField(maxlength=512, blank=True)
    exif_software = models.CharField(maxlength=512, blank=True)
    exif_aperture = models.CharField(maxlength=512, blank=True)
    exif_exposure_program = models.CharField(maxlength=512, blank=True)
    exif_iso = models.CharField(maxlength=512, blank=True)
    exif_metering_mode = models.CharField(maxlength=512, blank=True)
    exif_flash = models.CharField(maxlength=512, blank=True)
    exif_focal_length = models.CharField(maxlength=512, blank=True)
    exif_color_space = models.CharField(maxlength=512, blank=True)
    
    class Admin:
        list_display = ('title', 'is_public', 'is_family', 'is_friend', 'date_taken', 'last_updated')
        list_filter = ('date_taken', 'last_updated', 'is_public', 'is_family', 'is_friend')
        search_fields = ['title', 'description']
    
    def __str__(self):
        return '%s' % (self.title)
    

Comments

reid (on October 7, 2007):

I've been working to get this up and running on my site and I've noticed a few things that might be useful to others.

  • In some database engines (MySQL for me) the int field type isn't large enough to hold photoset ids. Changing it to a bigint works.
  • The photo pruning code here really only works if you are importing all of your photos. Otherwise, it prunes away any images that it doesn't download from flickr.
  • I had some issues with the page calculation code. Since the first page was being fetched using recentlyUpdated and the subsequent pages were being fetched with search, there were photo ids that happened to be in both the page 1 and page 2 sets. This meant that (len(photo_ids)+num_to_fetch)/num_to_fetch never reached 3, and there was an infinite loop.
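One way to sidestep the loop condition described in the last bullet is to drive pagination from an explicit page counter instead of the number of ids collected so far. This is a minimal sketch, not the snippet's code. The fetch_page(page, per_page) callable is hypothetical and stands in for the flickr.photos.search call; it is assumed to return the ids on that page plus the reported total. The fake data source deliberately returns an overlapping id between pages.

```python
from math import ceil


def fetch_all_ids(fetch_page, per_page=500):
    """Collect ids from every page, tolerating duplicates across pages.

    fetch_page(page, per_page) must return (list_of_ids, total_count).
    """
    ids, total = fetch_page(1, per_page)
    seen = set(ids)
    # The page count comes from the reported total, so the loop always
    # ends, even when pages overlap and len(seen) stays below total.
    pages = int(ceil(total / float(per_page)))
    for page in range(2, pages + 1):
        page_ids, _ = fetch_page(page, per_page)
        seen.update(page_ids)
    return seen


def fake_fetch_page(page, per_page):
    """Hypothetical data source whose later pages overlap the previous one."""
    data = list(range(1, 8))  # seven made-up ids
    start = max(0, (page - 1) * per_page - 1)
    return data[start:start + per_page], len(data)


result = fetch_all_ids(fake_fetch_page, per_page=3)
print(sorted(result))
```

The loop runs a fixed number of times, so duplicate ids across pages can no longer stall it.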

#

reid (on October 7, 2007):

importing tags seems to be broken (maybe flickr fixed their XML...)

return_data['tags'] = ",".join(["%s" % (d['text']) for d in photo_info['tags'][0]['tag'][0]])

should be

return_data['tags'] = ",".join(["%s" % (d['text']) for d in photo_info['tags'][0]['tag']])

#

bretwalker (on November 3, 2007):

I changed int to largeint, so it should work fine with MySQL now.

I also changed the first page load to use the search API rather than the recentlyUpdated API.

I also updated the tag code. Thanks for the catch.

Unfortunately, if you delete a photo, there's no way to be notified via the API, so you have to do a full comparison between the Django database and Flickr.

#

kbpeterson (on February 26, 2008):

Is there a typo in your version of flickrlib.py on line 109? I'm having a tough time getting this to work. I keep getting a fault in xmlrpclib: <Fault 1: 'User not found'>. Any help would be greatly appreciated.

#

kbpeterson (on February 26, 2008):

Sorry my last comment left out the error: xmlrpclib.Fault: Fault 1: 'User not found'

Am I missing some kind of configuration?

#

stranger1 (on March 7, 2008):

Hi, thank you for the great piece of code. I am new to Django and I have installed this app. I am able to see the photo model in the admin panel, but is there a way to sync with Flickr automatically, like a cron job? I think flickrupdate.py is meant for that, but how do I run that file? I tried $ python flickrupdate.py but it gives me errors like: Traceback (most recent call last): File "apps/photos/flickrupdate.py", line 8, in <module> from apps.photos.models import Photo ImportError: No module named apps.photos.models

After setting PYTHONPATH and the Django settings as well, I get this: Could not import settings 'mysite.settings' (Is it on sys.path? Does it have syntax errors?): No module named mysite.settings

#
