Get an image from a web page and resize it

I wrote this article as a personal memo. I hope it is useful to you as well.

The source code in this article is based on two web pages: "Technical book of TomoProg" and "Resize images at once using Python, Pillow (How to enlarge / reduce)". Please see those pages for more detailed explanations.

Below I describe the environment I used and my impressions.

Development environment

Windows 7, Python 3.5, PyCharm

Image acquisition source code

import urllib.request
from urllib.parse import urljoin

import bs4

# URL of the web page to fetch
url = "https://www.google.co.jp/"
with urllib.request.urlopen(url) as response:
    html = response.read()

# Candidate character encodings, tried in order
encoding_list = ["cp932", "utf-8", "utf_8", "euc_jp",
                 "euc_jis_2004", "euc_jisx0213", "shift_jis",
                 "shift_jis_2004", "shift_jisx0213", "iso2022jp",
                 "iso2022_jp_1", "iso2022_jp_2", "iso2022_jp_3",
                 "iso2022_jp_ext", "latin_1", "ascii"]

# Remember the first encoding that decodes the page without an error
detected_encoding = None
for enc in encoding_list:
    try:
        html.decode(enc)
        detected_encoding = enc
        break
    except UnicodeDecodeError:
        continue

# Create a BeautifulSoup object (name the parser explicitly)
soup = bs4.BeautifulSoup(html, "html.parser", from_encoding=detected_encoding)

# Get the src attribute of every img tag as an absolute URL
image_urls = []
for img_tag in soup.find_all("img"):
    src_str = img_tag.get("src")
    # Skip img tags that have no src attribute
    if src_str:
        # src is often a relative path, so resolve it against the page URL
        image_urls.append(urljoin(url, src_str))

# Download each image and write it to a file in binary mode
# File names are serial numbers (Example: 0.jpg, 1.jpg, ...)
# Note: the .jpg extension is used even if the image is in another format
for count, image_url in enumerate(image_urls):
    with urllib.request.urlopen(image_url) as response:
        with open("%d.jpg" % count, "wb") as f:
            f.write(response.read())
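
The src attribute of an img tag is often a relative path such as /images/logo.png, which urlopen cannot open by itself. The following is a minimal sketch of how such values can be resolved against the page URL with urljoin from the standard library. The helper name resolve_image_urls and the example.com URLs are made up for illustration and do not come from the referenced pages.

```python
from urllib.parse import urljoin


def resolve_image_urls(page_url, src_values):
    """Turn raw src attribute values into absolute URLs (hypothetical helper)."""
    absolute_urls = []
    for src in src_values:
        # img tags without a src attribute yield None; skip them
        if not src:
            continue
        absolute_urls.append(urljoin(page_url, src))
    return absolute_urls


# Example values are made up for illustration
page_url = "https://www.example.com/gallery/index.html"
src_values = ["/images/logo.png", "photo.jpg", None,
              "https://cdn.example.com/a.jpg", "//cdn.example.com/b.jpg"]
resolved = resolve_image_urls(page_url, src_values)
for image_url in resolved:
    print(image_url)
```

Each resolved URL can then be passed to urllib.request.urlopen in the same way as in the download loop.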

Image resizing source code

# coding: utf-8

import os

from PIL import Image

input_path = "C:\\Users\\image"
output_path = "C:\\Users\\image_480x300"

# Create the output folder if it does not exist yet
os.makedirs(output_path, exist_ok=True)

# Process every file in the image folder
for file_name in os.listdir(input_path):
    # Open the image file
    img = Image.open(os.path.join(input_path, file_name))

    # resize() takes the target size (width, height) and the resampling filter
    img_resize_lanczos = img.resize((480, 300), Image.LANCZOS)
    # Convert to RGB so that the result can be saved as JPEG
    img_resize_lanczos = img_resize_lanczos.convert("RGB")
    # Save the resized image under the same file name
    img_resize_lanczos.save(os.path.join(output_path, file_name), quality=100)
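
Resizing every image to a fixed 480x300 stretches images whose aspect ratio is different. If that matters for your data, one possible approach is to compute a target size that fits inside 480x300 while keeping the original proportions. The following is only a sketch: fit_within is a hypothetical helper, not part of Pillow and not taken from the referenced pages.

```python
def fit_within(size, max_size):
    """Return the largest (width, height) that fits inside max_size
    while keeping the aspect ratio of size (hypothetical helper)."""
    width, height = size
    max_width, max_height = max_size
    # The smaller ratio is the one that keeps both sides inside the box
    scale = min(max_width / width, max_height / height)
    # Never return a zero dimension for very thin images
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within((1920, 1080), (480, 300)))  # a 16:9 landscape image
print(fit_within((600, 800), (480, 300)))    # a 3:4 portrait image
```

The result could be passed to resize in place of the fixed size, for example img.resize(fit_within(img.size, (480, 300)), Image.LANCZOS).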

Impressions

The web pages referenced above were very well organized: the explanations were carefully written, with no redundancy.

Machine learning needs data. With the techniques in this article, you can collect images quickly and get started right away. I collected a lot of images with this source code myself, and I would like to use them for machine learning. Please be careful about the copyright of the images you download.
