This is a memo from when I wrote a Python script to fetch the Flickr photos shown on the "Photos | PyCon JP 2014 in TOKYO" page. The Flickr API methods involved are listed in the table below.
What you want to do | Flickr API method | Flickr API argument |
---|---|---|
Get a list of albums | flickr.photosets.getList | user_id |
Get the photos in the album | flickr.photosets.getPhotos | photoset_id |
Get the URL of the photo | flickr.photos.getSizes | photo_id |
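The script really does just two things: it gets the photos in an album, then the URL of each photo (the album list is not needed for this purpose). I was a little surprised that flickr.photos.getSizes returns the photo's URL for every available size.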
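To make that concrete, here is a minimal sketch of reading a flickr.photos.getSizes response. The XML is a hand-written stand-in shaped like the real response, the example.com URLs are placeholders, and sizes_by_label is a helper name made up for illustration, so check the attribute names against an actual response.

```python
import xml.dom.minidom as md

# Hand-written sample shaped like a flickr.photos.getSizes response.
# The URLs are placeholders, not real Flickr photo URLs.
SAMPLE_XML = """<?xml version="1.0" encoding="utf-8" ?>
<rsp stat="ok">
  <sizes>
    <size label="Square" width="75" height="75" source="https://example.com/photo_s.jpg" />
    <size label="Medium" width="500" height="333" source="https://example.com/photo.jpg" />
    <size label="Original" width="4000" height="2667" source="https://example.com/photo_o.jpg" />
  </sizes>
</rsp>"""


def sizes_by_label(xml_text):
    """Map each size label to the URL held in its 'source' attribute."""
    dom = md.parseString(xml_text.encode('utf-8'))
    return {elem.getAttribute('label'): elem.getAttribute('source')
            for elem in dom.getElementsByTagName('size')}


sizes = sizes_by_label(SAMPLE_XML)
print(sizes.get('Original'))
```

Each size element carries its own URL, which is why the script only has to pick the one labelled Original.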
Some existing articles already do something similar. Unlike this article, they fetch the results as JSON and build the photo URL directly instead of calling flickr.photos.getSizes, which is a useful reference. If the URL construction rule is public, it is wiser to build the URL yourself than to hit the API once per photo.
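As a rough sketch of that alternative: when flickr.photosets.getPhotos is requested with format=json and nojsoncallback=1, each photo record includes its id, server, and secret, and the URL can be assembled locally. The URL pattern below is my reading of Flickr's published photo source URL scheme and the sample record is invented, so verify both against the current Flickr documentation. build_photo_url is a hypothetical helper name.

```python
import json

# Assumed pattern, based on Flickr's documented photo source URLs:
#   https://live.staticflickr.com/{server-id}/{id}_{secret}_{size-suffix}.jpg
URL_TEMPLATE = 'https://live.staticflickr.com/{server}/{id}_{secret}_{suffix}.jpg'


def build_photo_url(photo, suffix='b'):
    """Build a photo URL from one photo record without calling getSizes.

    'suffix' selects the size (for example 'b' for a large image).
    """
    return URL_TEMPLATE.format(server=photo['server'], id=photo['id'],
                               secret=photo['secret'], suffix=suffix)


# Invented JSON shaped like a flickr.photosets.getPhotos response
# requested with format=json and nojsoncallback=1.
SAMPLE_JSON = '''{"photoset": {"id": "1", "photo": [
    {"id": "100", "secret": "abc123", "server": "65535", "title": "sample"}
]}, "stat": "ok"}'''

photos = json.loads(SAMPLE_JSON)['photoset']['photo']
urls = [build_photo_url(p) for p in photos]
print(urls)
```

This saves one API call per photo. As far as I know, the original-size image uses a separate secret, so this sketch covers resized images only.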
The prerequisites are as follows:
In addition, the following article is helpful for installing and using requests.
The following article will be helpful for getting the API key for Flickr.
At first the script didn't work with Python 2, so I applied a quick workaround. It runs on Python 2 for the time being, but I may clean it up later.
flickr.py
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    from __future__ import print_function
    import requests
    import xml.dom.minidom as md


    class Flickr(object):
        u"""
        Basic usage

            # Replace with your own API key
            api_key = 'ENTER_YOUR_API_KEY'
            f = Flickr(api_key)
            # user_id of pyconjp
            print(f.get_photoset_ids_from_user_id('102063383@N02'))
            # photoset_id of the PyCon JP 2014 day4 Sprint photoset
            print(f.get_photos_from_photoset_id('72157647739640505'))
            # One of the photos in the photoset above
            print(f.get_url_from_photo_id('15274792845'))
        """

        def __init__(self, api_key):
            self.api_url = 'https://api.flickr.com/services/rest/'
            self.api_key = api_key

        def get_photoset_ids_from_user_id(self, user_id):
            u"""
            Return the list of album ids (photoset_ids) owned by the user (user_id).
            Not necessary for this purpose.
            """
            # Send the request
            r = requests.post(self.api_url, {'api_key': self.api_key,
                                             'method': 'flickr.photosets.getList',
                                             'user_id': user_id
                                             })
            # Parse the XML into a DOM object
            dom = md.parseString(r.text.encode('utf-8'))
            # Collect the photoset ids from the DOM object
            result = []
            for elem in dom.getElementsByTagName('photoset'):
                result.append(elem.getAttribute('id'))
            return result

        def get_photos_from_photoset_id(self, photoset_id):
            u"""
            Return the list of photo ids (photo_ids) in the album (photoset_id).
            """
            # Send the request
            r = requests.post(self.api_url, {'api_key': self.api_key,
                                             'method': 'flickr.photosets.getPhotos',
                                             'photoset_id': photoset_id
                                             })
            # Parse the XML into a DOM object
            dom = md.parseString(r.text.encode('utf-8'))
            # Collect the photo ids from the DOM object
            result = []
            for elem in dom.getElementsByTagName('photo'):
                result.append(elem.getAttribute('id'))
            return result

        def get_url_from_photo_id(self, photo_id):
            u"""
            Return the URL where the photo (photo_id) is actually stored.
            """
            # Send the request
            r = requests.post(self.api_url, {'api_key': self.api_key,
                                             'method': 'flickr.photos.getSizes',
                                             'photo_id': photo_id
                                             })
            # Parse the XML into a DOM object
            dom = md.parseString(r.text.encode('utf-8'))
            # Find the URL in the DOM object
            result = None
            for elem in dom.getElementsByTagName('size'):
                # Use only the original size
                if elem.getAttribute('label') == 'Original':
                    result = elem.getAttribute('source')
                    # Assume there is only one original and skip the rest
                    break
            # result stays None if no original size was found
            return result


    if __name__ == '__main__':
        # Operation check
        # Replace with your own API key
        api_key = 'ENTER_YOUR_API_KEY'
        f = Flickr(api_key)
        # user_id of pyconjp
        print(f.get_photoset_ids_from_user_id('102063383@N02'))
        # photoset_id of the PyCon JP 2014 day4 Sprint photoset
        print(f.get_photos_from_photoset_id('72157647739640505'))
        # One of the photos in the photoset above
        print(f.get_url_from_photo_id('15274792845'))
Replace 'ENTER_YOUR_API_KEY' in the main block with your own API key, then run

    python flickr.py

This prints the album ids, the photo ids in the sample album, and the URL of the sample photo.
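If the key is left unchanged, the methods in flickr.py quietly return empty lists or None, because an error response contains no photoset, photo, or size elements. A small check like the following sketch can surface the problem instead. It assumes the usual REST envelope, where the root rsp element carries stat="ok" or stat="fail" and a failure includes an err element with a msg attribute. The sample response text and the names check_response and FlickrError are made up for illustration.

```python
import xml.dom.minidom as md


class FlickrError(Exception):
    """Raised when the API response reports stat="fail"."""


def check_response(xml_text):
    """Parse a REST response and raise FlickrError if it reports a failure."""
    dom = md.parseString(xml_text.encode('utf-8'))
    rsp = dom.documentElement
    if rsp.getAttribute('stat') != 'ok':
        errors = rsp.getElementsByTagName('err')
        msg = errors[0].getAttribute('msg') if errors else 'unknown error'
        raise FlickrError(msg)
    return dom


# Hand-written failure response for illustration.
SAMPLE_FAIL = '<rsp stat="fail"><err code="100" msg="Invalid API Key" /></rsp>'

try:
    check_response(SAMPLE_FAIL)
except FlickrError as e:
    message = str(e)
    print('API error:', message)
```

Each method could call such a check on r.text in place of md.parseString and keep the rest of its logic unchanged.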
To get the URLs in bulk, either import or copy flickr.py and do the following. Here too, change ENTER_YOUR_API_KEY to your own API key. The break statements are there so that a test run finishes quickly; remove them for a real run.
If you don't want to list the targets by hand, you can use get_photoset_ids_from_user_id to get all the albums of a specific user and feed those in instead. That makes the script behave more like a crawler.
    >>> import os
    >>> from collections import defaultdict
    >>> import requests
    >>> from flickr import Flickr  # not needed if you pasted flickr.py instead
    >>> api_key = 'ENTER_YOUR_API_KEY'  # replace with your own API key
    >>> f = Flickr(api_key)
    >>> targets = ['72157641092641983',  # PyCon JP 2014 preview - an album on Flickr
    ...            '72157647111767068',  # PyCon JP 2014 day1 Tutorial - an album on Flickr
    ...            '72157647184237569',  # PyCon JP 2014 day2 Conference - an album on Flickr
    ...            '72157647216509890',  # PyCon JP 2014 day3 Conference - an album on Flickr
    ...            '72157647739640505'   # PyCon JP 2014 day4 Sprint - an album on Flickr
    ...            ]
    >>> d = {}
    >>> for elem in targets:  # build the list of photos in each photoset
    ...     d[elem] = f.get_photos_from_photoset_id(elem)
    ...     break  # remove for a real run
    ...
    >>> d2 = defaultdict(list)
    >>> for k, v in d.items():  # build the list of URLs
    ...     for elem in v:
    ...         d2[k].append(f.get_url_from_photo_id(elem))
    ...         break  # remove for a real run
    ...     break  # remove for a real run
    ...
    >>> for k, v in d2.items():  # download the files from the URL list
    ...     if not os.path.exists(k):
    ...         os.mkdir(k)
    ...     for elem in v:
    ...         r = requests.get(elem)  # fetch the data from the URL
    ...         # save as photoset_id/<file name>.jpg
    ...         with open("{0}/{1}".format(k, elem.split("/")[-1]), 'wb') as g:
    ...             g.write(r.content)
    ...
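The crawl-like variant mentioned earlier could look roughly like this. collect_urls is a hypothetical helper that accepts any object offering the three methods of the Flickr class in flickr.py. The FakeFlickr stub and its ids exist only so the sketch runs without an API key or network access.

```python
from collections import defaultdict


def collect_urls(client, user_id, max_albums=None, max_photos=None):
    """Return {photoset_id: [photo URL, ...]} for every album of user_id.

    max_albums / max_photos cap the work for a quick test run,
    playing the same role as the break statements in the session above.
    """
    result = defaultdict(list)
    for photoset_id in client.get_photoset_ids_from_user_id(user_id)[:max_albums]:
        photo_ids = client.get_photos_from_photoset_id(photoset_id)
        for photo_id in photo_ids[:max_photos]:
            url = client.get_url_from_photo_id(photo_id)
            if url is not None:  # None means no 'Original' size was found
                result[photoset_id].append(url)
    return dict(result)


class FakeFlickr(object):
    """Stand-in with made-up ids so the example runs offline."""

    def get_photoset_ids_from_user_id(self, user_id):
        return ['set1', 'set2']

    def get_photos_from_photoset_id(self, photoset_id):
        return [photoset_id + '-p1', photoset_id + '-p2']

    def get_url_from_photo_id(self, photo_id):
        return 'https://example.com/{0}_o.jpg'.format(photo_id)


urls = collect_urls(FakeFlickr(), 'some-user', max_albums=1, max_photos=1)
print(urls)
```

With a real Flickr instance in place of FakeFlickr, the returned dictionary has the same shape as d2 and can be passed to the download loop as is.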