Bulk download images from specific site URLs with Python

This script does not check whether an acquired image URL is a relative path or an absolute path, and it does not convert relative paths to absolute ones this time. Please note that it was written on the assumption that the target site uses only absolute paths in its img tags, so trying to acquire images from a site that uses relative paths will cause an error. ~~I will write a detailed explanation (?) of the code on the blog linked below.~~ (The blog has been released) (Scheduled as of August 11, 2014)
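
The script below does not resolve relative paths. As a minimal sketch of how that step could be added, the standard library's `urljoin` combines the page URL with each `src` value, and absolute URLs pass through unchanged. The sketch is written for Python 3 (`urllib.parse`), whereas the script itself targets Python 2, where the same function lives in the `urlparse` module. The helper name `to_absolute` and the example URLs are hypothetical.

```python
from urllib.parse import urljoin


def to_absolute(page_url, src):
    """Resolve an img src against the URL of the page it was found on."""
    # urljoin leaves src untouched when it is already an absolute URL
    return urljoin(page_url, src)


if __name__ == "__main__":
    page = "http://example.com/gallery/index.html"  # hypothetical page URL
    for src in ("photo1.jpg", "/static/photo2.jpg", "http://cdn.example.com/photo3.jpg"):
        print(to_absolute(page, src))
```

Passing each `src` through a helper like this before downloading would let the same loop handle sites that mix relative and absolute paths.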

`downloadImg.py`


# -*- coding: utf-8 -*- 

import urllib
import urllib2
import os.path
import sys
from HTMLParser import HTMLParser

def download(url):
    #Save the image in the current directory under the last component of its URL
    img = urllib.urlopen(url)
    try:
        with open(os.path.basename(url),'wb') as localfile:
            localfile.write(img.read())
    finally:
        img.close()

class imgParser(HTMLParser):

    def __init__(self):
        HTMLParser.__init__(self)

    def handle_starttag(self,tagname,attribute):
        if tagname.lower() == "img":
            for i in attribute:
                if i[0].lower() == "src":
                    img_url=i[1]
                    #Append the acquired image URL to a file that collects them (tab-separated)
                    f = open("collection_url.txt","a")
                    f.write("%s\t"%img_url)
                    f.close()
        
if __name__ == "__main__":

    print('Enter the URL of the site where you want to get the photo.')
    input_url = raw_input('>>>  ')
    search_url = input_url
    htmldata = urllib2.urlopen(search_url)
    
    print('Currently getting image files...')

    parser = imgParser()
    parser.feed(htmldata.read())

    parser.close()
    htmldata.close()

    #Read the generated file (it is only created when at least one img tag was found)
    number_url = []
    if os.path.exists("collection_url.txt"):
        f = open("collection_url.txt","r")
        #The URLs are tab-separated; skip the empty entry left by the trailing tab
        number_url = [u for u in f.read().split('\t') if u]
        f.close()
        #Delete file
        os.remove("collection_url.txt")

    for url in number_url:
        download(url)

    print('The image download is complete.')
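
The temporary file `collection_url.txt` is only used to pass the URLs from the parser to the download loop. As an alternative sketch, the parser could keep them in a list instead. This is written for Python 3 (`html.parser`), not the Python 2 used above. The class name `ImgCollector` and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collect the src attribute of every img tag in a list."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag and attribute names already lower-cased
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.urls.append(value)


if __name__ == "__main__":
    sample = '<p><img src="http://example.com/a.png"><IMG SRC="http://example.com/b.jpg"></p>'
    collector = ImgCollector()
    collector.feed(sample)
    collector.close()
    print(collector.urls)
```

With the URLs held in `collector.urls`, there is no file to read back or delete afterwards.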

Twitter: @fantmsite ~~Blog: Fantm Site-BLOG~~
