I wrote this article as a memo for myself. If you find it helpful, please feel free to use it too.
The source code in this article is based on "Technical book of TomoProg" and "Resize images at once using Python, Pillow (How to enlarge / reduce)". Please check those web pages for more detailed explanations.
Below, I describe the environment I used, the source code, and my impressions.
Windows 7, Python 3.5, PyCharm
import urllib.request
from urllib.parse import urljoin

import bs4

# URL of the web page you want to get
url = "https://www.google.co.jp/"
request = urllib.request.urlopen(url)
html = request.read()

# Create a list of candidate character encodings
encoding_list = ["cp932", "utf-8", "utf_8", "euc_jp",
                 "euc_jis_2004", "euc_jisx0213", "shift_jis",
                 "shift_jis_2004", "shift_jisx0213", "iso2022jp",
                 "iso2022_jp_1", "iso2022_jp_2", "iso2022_jp_3",
                 "iso2022_jp_ext", "latin_1", "ascii"]

# Find the first encoding that decodes the page without an error.
# After the loop, enc holds that encoding name, or None if none of them worked.
for enc in encoding_list:
    try:
        html.decode(enc)
        break
    except (UnicodeDecodeError, LookupError):
        enc = None

resources = []

# Create a BeautifulSoup object
# (the parser is named explicitly; BeautifulSoup detects the encoding of the raw bytes itself)
soup = bs4.BeautifulSoup(html, "html.parser")

# Get the contents of the src attribute of every img tag in the HTML
for img_tag in soup.find_all("img"):
    src_str = img_tag.get("src")
    # Skip img tags without src, and resolve relative paths against the page URL
    if src_str:
        resources.append(urljoin(url, src_str))

# Copy the contents of src into the download list
array_jpg = []
for resource in resources:
    array_jpg.append(resource)

# Open the URL of each image file
count = 0
for number in range(0, len(array_jpg)):
    request = urllib.request.urlopen(array_jpg[number])
    # Open the file in binary mode and write the contents of the URL
    # File names are serial numbers (example: 0.jpg, 1.jpg, ...)
    # The with statement closes the file automatically
    with open("%d.jpg" % count, "wb") as f:
        f.write(request.read())
    count += 1

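The script above saves every download as a .jpg file, even when the image is actually a PNG or GIF. As a rough sketch of one way to keep the original extension, the helper below takes it from the path of the image URL. The function name numbered_filename and the default_ext parameter are my own assumptions for illustration and are not part of the referenced pages.

```python
import os
from urllib.parse import urlparse


def numbered_filename(index, image_url, default_ext=".jpg"):
    """Build a serial file name such as 0.png from an image URL.

    Hypothetical helper: the extension is taken from the URL path
    (query strings are ignored) and falls back to default_ext
    when the path has no extension.
    """
    path = urlparse(image_url).path
    ext = os.path.splitext(path)[1].lower()
    if not ext:
        ext = default_ext
    return "%d%s" % (index, ext)


# Example URLs are placeholders, not real image locations
name_with_ext = numbered_filename(0, "https://example.com/a/logo.PNG?x=1")
name_without_ext = numbered_filename(1, "https://example.com/image")
```

In the download loop, open(numbered_filename(count, array_jpg[number]), "wb") could then replace the fixed "%d.jpg" name. The extension in a URL does not always match the real image format, so this is only a convenience.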
# coding: utf-8
from PIL import Image
import os

input_path = "C:\\Users\\image"
output_path = "C:\\Users\\image_480x300"

# Create the output folder if it does not exist yet
os.makedirs(output_path, exist_ok=True)

# Get the file names in the image folder
list_input_path = os.listdir(input_path)

for number in range(0, len(list_input_path)):
    # Open the image file
    img = Image.open(os.path.join(input_path, list_input_path[number]), 'r')
    # In img.resize((480, 300), Image.LANCZOS), the first argument is the
    # target size and the second argument is the resampling filter
    img_resize_lanczos = img.resize((480, 300), Image.LANCZOS)
    # Convert to RGB so that the image can be saved as JPEG
    img_resize_lanczos = img_resize_lanczos.convert("RGB")
    # Save the resized image
    img_resize_lanczos.save(os.path.join(output_path, list_input_path[number]), quality=100)
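Resizing everything to exactly 480x300 stretches images whose aspect ratio is different. If that matters for your data, the target size can be computed per image. The following is a minimal sketch under that assumption; fit_within is a hypothetical helper name of mine, and it only does the arithmetic, so it runs without Pillow.

```python
def fit_within(width, height, max_width=480, max_height=300):
    """Return the largest (width, height) that fits inside the box
    max_width x max_height while keeping the aspect ratio.

    Hypothetical helper for illustration. Note that images smaller
    than the box are scaled up as well.
    """
    # Use the smaller of the two scale factors so both sides fit
    scale = min(max_width / width, max_height / height)
    new_width = max(1, round(width * scale))
    new_height = max(1, round(height * scale))
    return new_width, new_height


# A wide image is limited by the width, a square one by the height
wide_size = fit_within(960, 300)
square_size = fit_within(600, 600)
```

Since img.size in Pillow is a (width, height) tuple, the result could be passed to the resize call as img.resize(fit_within(*img.size), Image.LANCZOS). Pillow also provides Image.thumbnail, which shrinks an image in place while keeping its aspect ratio.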
The explanations on the web pages above were carefully written, with nothing redundant. They were very well organized.
When you want to do machine learning, you will need data. If you know how to do this kind of scraping and resizing, you can collect data right away and get started. I collected a lot of images with this source code myself, and I would like to use them for machine learning. Please be careful about the copyright of the images you download.