Introduction

Overview

This is a memo from when I wrote a Python script to fetch the Flickr photos linked from Photos | PyCon JP 2014 in TOKYO. It uses the Flickr API methods in the table below.

What you want to do	Flickr API method	Flickr API arguments
Get a list of albums	flickr.photosets.getList	user_id
Get the photos in the album	flickr.photosets.getPhotos	photoset_id
Get the URL of the photo	flickr.photos.getSizes	photo_id

Send an HTTP request with requests
Parse the returned XML with xml.dom.minidom

The script does just these two things. I was a little surprised that the flickr.photos.getSizes response contains the photo's URL for every size.
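
As a minimal sketch of those two steps, the snippet below builds the request parameters for one row of the table and parses a response with xml.dom.minidom. The XML is a hand-written sample that only imitates the shape of a flickr.photos.getSizes response (its URLs and dimensions are made up), and build_params and parse_sizes are hypothetical helper names that do not appear in flickr.py.

```python
import xml.dom.minidom as md

API_URL = 'https://api.flickr.com/services/rest/'


def build_params(api_key, method, **arguments):
    """Combine the API key, the method name, and the method's own arguments."""
    params = {'api_key': api_key, 'method': method}
    params.update(arguments)
    return params


def parse_sizes(xml_text):
    """Return {label: source URL} for every <size> element in the response."""
    dom = md.parseString(xml_text.encode('utf-8'))
    return {elem.getAttribute('label'): elem.getAttribute('source')
            for elem in dom.getElementsByTagName('size')}


# Hand-written sample (assumption): real responses carry more attributes.
SAMPLE_XML = u"""<?xml version="1.0" encoding="utf-8" ?>
<rsp stat="ok">
  <sizes>
    <size label="Square" width="75" height="75"
          source="https://example.com/photo_s.jpg" />
    <size label="Original" width="4000" height="3000"
          source="https://example.com/photo_o.jpg" />
  </sizes>
</rsp>"""

params = build_params('ENTER_YOUR_API_KEY', 'flickr.photos.getSizes',
                      photo_id='15274792845')
# Step 1 would be: r = requests.post(API_URL, params); xml_text = r.text
# Step 2, shown here on the sample instead of a live response:
sizes = parse_sizes(SAMPLE_XML)
print(sizes['Original'])
```

Keeping the parsing in a function that takes text, rather than a response object, makes it easy to try out without an API key.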

Some existing articles already do something similar. Unlike this article, they fetch the result as JSON and construct the photo URL directly instead of calling flickr.photos.getSizes, which is a helpful approach. If the URL construction rule is public, it is wiser to build the URL yourself than to hit the API once more.
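
A minimal sketch of that alternative, under stated assumptions: it presumes that a flickr.photosets.getPhotos response requested with format=json and nojsoncallback=1 lists each photo with id, server, and secret fields, and that image URLs follow the pattern in URL_TEMPLATE. Both should be checked against the current Flickr documentation before relying on them. The sample JSON (including the secret and server values) is hand-written, and build_photo_url is a hypothetical helper. As far as I know, original-size URLs use a separate secret, so this sketch builds a resized variant instead.

```python
import json

# Assumed URL pattern; verify against the Flickr documentation.
URL_TEMPLATE = 'https://live.staticflickr.com/{server}/{id}_{secret}_{suffix}.jpg'


def build_photo_url(photo, suffix='b'):
    """Build an image URL from one photo entry without calling getSizes."""
    return URL_TEMPLATE.format(server=photo['server'], id=photo['id'],
                               secret=photo['secret'], suffix=suffix)


# Hand-written sample (assumption) imitating a flickr.photosets.getPhotos
# response requested with format=json and nojsoncallback=1.
SAMPLE_JSON = '''
{"photoset": {"id": "72157647739640505",
              "photo": [{"id": "15274792845", "secret": "0123456789",
                         "server": "1234", "title": "sample"}]},
 "stat": "ok"}
'''

data = json.loads(SAMPLE_JSON)
urls = [build_photo_url(p) for p in data['photoset']['photo']]
print(urls[0])
```

This saves one API call per photo, at the cost of depending on a URL rule that may change.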

Premise

The prerequisites are as follows:

requests must be installed
Have a Flickr account and API key

For installing and using requests, the following article is helpful.

How to use Requests (Python Library)

For getting a Flickr API key, the following article is helpful.

Let's use Flickr API (1. Get API key)

Code

At first it did not work with Python 2, so I applied a quick fix. It works with Python 2 for the time being, but I may clean it up later.

`flickr.py`


#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function
import requests
import xml.dom.minidom as md

class Flickr(object):
    u"""
    Basic usage

    # Replace with your own API key
    api_key = 'ENTER_YOUR_API_KEY'

    f = Flickr(api_key)

    # user_id of pyconjp
    print(f.get_photoset_ids_from_user_id('102063383@N02'))
    # photoset_id of the PyCon JP 2014 day4 Sprint photoset
    print(f.get_photos_from_photoset_id('72157647739640505'))
    # photo_id of one of the photos in the photoset above
    print(f.get_url_from_photo_id('15274792845'))
    """

    def __init__(self, api_key):
        self.api_url = 'https://api.flickr.com/services/rest/'
        self.api_key = api_key

    def get_photoset_ids_from_user_id(self, user_id):
        u"""
        Return the list of album ids (photoset_ids) owned by the user (user_id).
        Not strictly needed for this purpose.
        """
        # Send the request
        r = requests.post(self.api_url, {'api_key': self.api_key,
                                         'method': 'flickr.photosets.getList',
                                         'user_id': user_id
                                         })

        # Parse the XML into a DOM object
        dom = md.parseString(r.text.encode('utf-8'))

        # Collect the photoset ids from the DOM object
        result = []
        for elem in dom.getElementsByTagName('photoset'):
            result.append(elem.getAttribute('id'))
        return result

    def get_photos_from_photoset_id(self, photoset_id):
        u"""
        Return the list of photos (photo_ids) in the album (photoset_id).
        """
        # Send the request
        r = requests.post(self.api_url, {'api_key': self.api_key,
                                         'method': 'flickr.photosets.getPhotos',
                                         'photoset_id': photoset_id
                                         })

        # Parse the XML into a DOM object
        dom = md.parseString(r.text.encode('utf-8'))

        # Collect the photo ids from the DOM object
        result = []
        for elem in dom.getElementsByTagName('photo'):
            result.append(elem.getAttribute('id'))
        return result

    def get_url_from_photo_id(self, photo_id):
        u"""
        Return the URL where the photo (photo_id) is actually stored.
        """
        # Send the request
        r = requests.post(self.api_url, {'api_key': self.api_key,
                                         'method': 'flickr.photos.getSizes',
                                         'photo_id': photo_id
                                         })

        # Parse the XML into a DOM object
        dom = md.parseString(r.text.encode('utf-8'))

        # Find the URL in the DOM object
        result = None
        for elem in dom.getElementsByTagName('size'):
            # Only the original size is wanted
            if elem.getAttribute('label') == 'Original':
                result = elem.getAttribute('source')
                # Assume there is only one original and skip the rest
                break
        # result stays None if no original size was found
        return result

if __name__ == '__main__':
    # Quick check that everything works

    # Replace with your own API key
    api_key = 'ENTER_YOUR_API_KEY'

    f = Flickr(api_key)

    # user_id of pyconjp
    print(f.get_photoset_ids_from_user_id('102063383@N02'))
    # photoset_id of the PyCon JP 2014 day4 Sprint photoset
    print(f.get_photos_from_photoset_id('72157647739640505'))
    # photo_id of one of the photos in the photoset above
    print(f.get_url_from_photo_id('15274792845'))

Execution result

Code execution result

Replace 'ENTER_YOUR_API_KEY' in the main block with your own API key and run

python flickr.py

to get the URL of the following photo.

https://www.flickr.com/photos/pyconjp/15274792845/in/set-72157647739640505
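
One thing to watch for: if the placeholder key is left in place, the methods in flickr.py would most likely return an empty list or None rather than raise an error, because the response then contains no photoset, photo, or size elements. Below is a minimal sketch of surfacing such failures. It assumes that the REST response is wrapped in an rsp element whose stat attribute is "fail" on errors and that an err child carries code and msg attributes. check_response and FlickrError are hypothetical names, and the XML is a hand-written sample.

```python
import xml.dom.minidom as md


class FlickrError(Exception):
    """Raised when the API reports a failure (hypothetical helper class)."""


def check_response(xml_text):
    """Parse the response and raise FlickrError if stat is not 'ok'."""
    dom = md.parseString(xml_text.encode('utf-8'))
    rsp = dom.documentElement
    if rsp.getAttribute('stat') != 'ok':
        errors = rsp.getElementsByTagName('err')
        if errors:
            raise FlickrError('{0}: {1}'.format(errors[0].getAttribute('code'),
                                                errors[0].getAttribute('msg')))
        raise FlickrError('unknown failure')
    return dom


# Hand-written sample of a failed call (assumption).
FAIL_XML = u'<rsp stat="fail"><err code="100" msg="Invalid API Key" /></rsp>'

try:
    check_response(FAIL_XML)
except FlickrError as e:
    print('API call failed:', e)
```

Each method could call such a helper in place of md.parseString, since it returns the parsed DOM on success.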

Example of use

To fetch the photos in bulk, import flickr.py (or copy its contents) and do the following. Here too, replace ENTER_YOUR_API_KEY with your own API key. The break statements are there only so that a test run finishes quickly; remove them for a real run.

If you don't want to specify the targets by hand, you can use get_photoset_ids_from_user_id to fetch all the albums of a specific user and use that list instead. That makes the script behave more like a crawler.

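>>> import os
>>> import requests
>>> from collections import defaultdict
>>> from flickr import Flickr          # not needed if you copied flickr.py instead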

>>> api_key = 'ENTER_YOUR_API_KEY'     # Replace with your own API key

>>> f = Flickr(api_key)

>>> targets = ['72157641092641983',    # PyCon JP 2014 preview - an album on Flickr
...            '72157647111767068',    # PyCon JP 2014 day1 Tutorial - an album on Flickr
...            '72157647184237569',    # PyCon JP 2014 day2 Conference - an album on Flickr
...            '72157647216509890',    # PyCon JP 2014 day3 Conference - an album on Flickr
...            '72157647739640505'     # PyCon JP 2014 day4 Sprint - an album on Flickr
...           ]

>>> d = {}
>>> for elem in targets:               # Build the list of photos in each photoset
...     d[elem] = f.get_photos_from_photoset_id(elem)
...     break    # remove for a real run

>>> d2 = defaultdict(list)
>>> for k,v in d.items():              # Build the list of URLs for each photoset
...     for elem in v:
...         d2[k].append(f.get_url_from_photo_id(elem))
...         break    # remove for a real run
...     break    # remove for a real run

>>> for k,v in d2.items():             # Download the files using the URL lists
...     if not os.path.exists(k):
...         os.mkdir(k)
...     for elem in v:
...         if elem is None:           # no Original size was found for this photo
...             continue
...         r = requests.get(elem)     # Fetch the data from the URL
...         # Save as photoset_id/<file name>.jpg
...         with open("{0}/{1}".format(k, elem.split("/")[-1]), 'wb') as g:
...             g.write(r.content)
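
For the crawling variant mentioned earlier, the fixed targets list can be replaced by the result of get_photoset_ids_from_user_id. A minimal sketch follows. collect_photo_urls is a hypothetical helper, and FakeFlickr is a stand-in that returns canned values so that the example runs without an API key; with the real class you would pass Flickr(api_key) instead.

```python
from collections import defaultdict


def collect_photo_urls(client, user_id, limit_per_album=None):
    """Return {photoset_id: [photo URL, ...]} for every album of user_id.

    client is any object with the three methods of the Flickr class.
    limit_per_album caps the photos fetched per album (None means all),
    which serves the same purpose as the break statements in the example.
    """
    urls = defaultdict(list)
    for photoset_id in client.get_photoset_ids_from_user_id(user_id):
        photo_ids = client.get_photos_from_photoset_id(photoset_id)
        for photo_id in photo_ids[:limit_per_album]:
            url = client.get_url_from_photo_id(photo_id)
            if url is not None:  # photos without an Original size are skipped
                urls[photoset_id].append(url)
    return dict(urls)


class FakeFlickr(object):
    """Stand-in with canned data (assumption), used only for this demo."""

    def get_photoset_ids_from_user_id(self, user_id):
        return ['album1', 'album2']

    def get_photos_from_photoset_id(self, photoset_id):
        return [photoset_id + '-p1', photoset_id + '-p2']

    def get_url_from_photo_id(self, photo_id):
        return 'https://example.com/{0}.jpg'.format(photo_id)


result = collect_photo_urls(FakeFlickr(), 'some-user', limit_per_album=1)
print(result)
```

Since every photo costs one getSizes request, a limit like this is worth keeping while testing against a user with many albums.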

Future plans

There is not much content here yet, but I may add descriptions of the individual features later.
The code itself has plenty of room for improvement, so any advice is welcome.

Use the Flickr API from Python