This is just a personal note. I put together a quick Python program that downloads images using a library called requests. `urllib.request` looks convenient in Python 3, but it does not seem to be available in Python 2 (I have not researched this thoroughly), so I used requests instead. requests lets you configure things such as cookies, but here I only write a simple program that accesses each URL and downloads the result.

Official: python-requests

Installation

$ pip install requests

Trying it out

$ python
>>> import requests
>>> url = "http://docs.python-requests.org/en/master/#"
>>> res = requests.get(url)
>>> res.status_code
200
>>> res.headers["content-type"]
'text/html'
>>> res.content
'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n\n\n<html xmlns="http://www.w3.org/1999/xhtml">\n  <head>\n...
>>> res.text  
u'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n\n\n<html xmlns="http://www.w3.org/1999/xhtml">\n  <head>\n ...

How to use (excerpt)

See the User Guide for more information.

1. How to send a request

To set URL parameters, pass them as a dictionary to the params argument.

res = requests.get('http://httpbin.org/get', params={'key':'value'})

print(res.url)  #=> http://httpbin.org/get?key=value
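
If you only want to see how params ends up in the URL, one option is to build the request without sending it. The following is a minimal sketch that uses requests.Request and its prepare() method; the variable names req and prepared are my own, and nothing is sent over the network.

```python
import requests

# Build the request locally; nothing is sent over the network here.
req = requests.Request("GET", "http://httpbin.org/get", params={"key": "value"})
prepared = req.prepare()

# The params dictionary has been encoded into the query string.
print(prepared.url)
```

This is handy for checking how a dictionary is encoded before actually calling requests.get.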

With post and put, form data can be sent through the data argument.

res = requests.post('http://httpbin.org/post', data = {'key':'value'})
res = requests.put('http://httpbin.org/put', data = {'key':'value'})
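
To see what the data argument actually produces, the same kind of prepared request can be inspected locally. This is only a sketch: it assumes a plain dictionary is passed as data, and the names req and prepared are illustrative.

```python
import requests

# Prepare a POST request without sending it.
req = requests.Request("POST", "http://httpbin.org/post", data={"key": "value"})
prepared = req.prepare()

# A dictionary passed as data is form-encoded into the request body.
print(prepared.body)
print(prepared.headers["Content-Type"])
```

In other words, a dictionary given as data is sent as an ordinary HTML form submission.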

A function is provided for each type of request.

res = requests.get('http://httpbin.org/get')
res = requests.post('http://httpbin.org/post', data = {'key':'value'})
res = requests.put('http://httpbin.org/put', data = {'key':'value'})
res = requests.delete('http://httpbin.org/delete')
res = requests.head('http://httpbin.org/get')
res = requests.options('http://httpbin.org/get')

2. Response processing

The response object exposes the following attributes.

res = requests.get('http://httpbin.org/get')

# HTTP status code
print(res.status_code)

# Check the Content-Type response header
print(res.headers["content-type"])

# Retrieved data (binary)
print(res.content)

# Retrieved data (decoded text) and the encoding used
print(res.text)
print(res.encoding)
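
The difference between content and text can be illustrated without making a request. In the sketch below, the variables content and encoding are stand-ins for res.content and res.encoding. It assumes the encoding is known; when the server does not declare one, requests guesses it, so treat this as an approximation of what res.text does.

```python
# Stand-ins for res.content and res.encoding; no request is made here.
content = u"caf\u00e9".encode("utf-8")  # raw bytes, as res.content would hold
encoding = "utf-8"                       # as res.encoding would report

# res.text is roughly the bytes decoded with that encoding.
text = content.decode(encoding)

print(len(content))  # length in bytes
print(len(text))     # length in characters
```

For images you want the raw bytes, which is why the download program uses content rather than text.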

Let's actually download some images

The input is a text file `input.txt` containing a list of URLs, one per line, and the images are written to the output directory `images/` as 0.jpg, 1.jpg, 2.jpg, and so on. The code was written casually, so it is a little rough in places.
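
Two small things are easy to trip over with this kind of input and output: blank lines in the URL list, and an output directory that does not exist yet. The helpers below are a hypothetical sketch, not part of the original program; the names read_urls and ensure_dir are my own.

```python
import os


def read_urls(path):
    """Return the non-empty lines of a URL list file, stripped of whitespace."""
    with open(path, "r") as fin:
        return [line.strip() for line in fin if line.strip()]


def ensure_dir(path):
    """Create the output directory if it does not exist yet."""
    if not os.path.isdir(path):
        os.makedirs(path)
```

A typical use would be to call ensure_dir("images") once before the download loop and to iterate over read_urls("input.txt") instead of the raw file lines.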

import requests
import os
import sys

#Download image
def download_image(url, timeout = 10):
    response = requests.get(url, allow_redirects=False, timeout=timeout)
    if response.status_code != 200:
        e = Exception("HTTP status: " + str(response.status_code))
        raise e

    content_type = response.headers["content-type"]
    if 'image' not in content_type:
        e = Exception("Content-Type: " + content_type)
        raise e

    return response.content

#Decide the file name of the image
def make_filename(base_dir, number, url):
    ext = os.path.splitext(url)[1]  # Get the extension from the URL
    filename = str(number) + ext    # Append the extension to the number to form the file name

    fullpath = os.path.join(base_dir, filename)
    return fullpath

#Save the image
def save_image(filename, image):
    with open(filename, "wb") as fout:
        fout.write(image)

#Main
if __name__ == "__main__":
    urls_txt = "input.txt"
    images_dir = "images"
    idx = 0

    with open(urls_txt, "r") as fin:
        for line in fin:
            url = line.strip()
            filename = make_filename(images_dir, idx, url)

            print(url)
            try:
                image = download_image(url)
                save_image(filename, image)
                idx += 1
            except KeyboardInterrupt:
                break
            except Exception as err:
                print(err)
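
One rough spot in make_filename is that it takes the extension from the whole URL, so a URL ending in something like cat.jpg?size=large gets the extension ".jpg?size=large". A possible variant, sketched here under the assumption that the extension should come only from the path part of the URL, parses the URL first. The function name make_filename_from_path and the example.com URL are made up for illustration.

```python
import os

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2


def make_filename_from_path(base_dir, number, url):
    """Hypothetical variant of make_filename that ignores the query string."""
    path = urlparse(url).path        # path component only, e.g. "/img/cat.jpg"
    ext = os.path.splitext(path)[1]  # e.g. ".jpg", or "" if there is none
    return os.path.join(base_dir, str(number) + ext)


print(make_filename_from_path("images", 0, "http://example.com/img/cat.jpg?size=large"))
```

If the URL has no extension at all, the file is simply saved under the bare number.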