Deep learning has become more accessible thanks to growing demand and a growing number of frameworks. But preparing a dataset is still a chore!
Since thousands or tens of thousands of samples are required, collecting them is very hard unless someone has already prepared them, as with handwritten character datasets. That is where data augmentation comes in: a mechanism that expands a small number of images into many.
I had to prepare a dataset myself and looked into this, so I am writing it down as a memo. If you find any mistakes or know a better way, please let me know!
I also went through the penance of photographing my own hands more than 1,000 times. After padding that out to 54,000 images with this augmentation, the directory looks a little like a horror scene. An acquaintance was worried that I might be unwell.
The author's environment is as follows.
If you want to make sure it works, create an environment with anaconda or pyenv and run it from there.
Python 3.5.3
Keras==2.0.4
numpy==1.12.1
tensorflow==1.0.0
Keras runs on the TensorFlow backend.
If you want a rough idea of Keras, please see my Past Articles.
They are really rough, so if you want more detail, please refer to the Official Document and similar sources.
I will use Keras this time, but since the only purpose is to prepare a dataset, please use the result for machine learning with your favorite framework, such as Keras, chainer, TensorFlow, or caffe.
Every framework probably has similar functionality, but if looking it up is a hassle and you just want to grow your dataset quickly, try the code below.
Simply run it inside the dataset's directory: it augments the images there and saves the results in another directory. Adjust it to your needs, such as the number of images generated per input image and the output directory name.
By default, every jpg file in the current directory is expanded into 10 images, which are written to ./extended (the directory is created if it does not exist).
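As a rough check before running the script, the number of output files can be estimated from the number of input files. The sketch below uses only the standard library. `count_expected_outputs` and its `per_image` parameter are hypothetical names for illustration, not part of the script, and the result is only an estimate of what ends up on disk.

```python
import glob
import os


def count_expected_outputs(directory=".", per_image=10):
    """Return (number of .jpg inputs, estimated number of output images).

    Hypothetical helper: it mirrors the script's "*.jpg" glob pattern and
    its default of 10 augmented images per input image.
    """
    inputs = glob.glob(os.path.join(directory, "*.jpg"))
    return len(inputs), len(inputs) * per_image


if __name__ == "__main__":
    n_in, n_out = count_expected_outputs(".")
    print("inputs:", n_in, "estimated outputs:", n_out)
```

If the estimate is far larger than you want to keep on disk, lower the number of images generated per input before running the script.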
dataset_generator.py
import os
import glob
import numpy as np
from keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array


def draw_images(generator, x, dir_name, index):
    # Output file settings
    save_name = 'extended-' + str(index)
    g = generator.flow(x, batch_size=1, save_to_dir=dir_name,
                       save_prefix=save_name, save_format='jpg')

    # Specify how many images to generate from one input image.
    # Each call to next(g) generates and saves one image.
    for i in range(10):
        next(g)


if __name__ == '__main__':
    # Output directory settings
    output_dir = "extended"
    if not os.path.exists(output_dir):
        os.mkdir(output_dir)

    # Collect the images to be augmented
    images = glob.glob(os.path.join('./', "*.jpg"))

    # Augmentation settings
    generator = ImageDataGenerator(
        rotation_range=90,          # Rotate by up to 90 degrees
        width_shift_range=0.1,      # Randomly shift horizontally
        height_shift_range=0.1,     # Randomly shift vertically
        channel_shift_range=50.0,   # Randomly change the color tone
        shear_range=0.39,           # Shear by up to about pi/8 radians
        horizontal_flip=True,       # Randomly flip horizontally
        vertical_flip=True          # Randomly flip vertically
    )

    # Augment the loaded images one by one
    for i, path in enumerate(images):
        img = load_img(path)
        # Convert the image to an array and add a batch dimension
        x = img_to_array(img)
        x = np.expand_dims(x, axis=0)
        # Generate the augmented images
        draw_images(generator, x, output_dir, i)
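The `np.expand_dims` call in the script is there because `generator.flow` expects a batch of images (a rank-4 array), while `img_to_array` returns a single image (rank 3). Here is a minimal NumPy-only sketch of the shape change, using a dummy array in place of a real image. The 48x64 size is an arbitrary assumption, and the channels-last layout is the default described later in this article.

```python
import numpy as np

# Dummy stand-in for img_to_array(img): height 48, width 64, 3 color channels.
x = np.zeros((48, 64, 3), dtype="float32")

# Add a leading batch axis so the array looks like a batch of one image.
batch = np.expand_dims(x, axis=0)

print(x.shape)      # (48, 64, 3)
print(batch.shape)  # (1, 48, 64, 3)
```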
generator = ImageDataGenerator(
    rotation_range=90,
    width_shift_range=0.1,
    height_shift_range=0.1,
    channel_shift_range=50.0,
    shear_range=0.39,
    zoom_range=0.2,
    horizontal_flip=True,
    vertical_flip=True
)
By rewriting this part, you can apply various kinds of augmentation. Please customize it for your application.
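As one example of such customization, the settings can be kept in a plain dictionary and passed to `ImageDataGenerator` as keyword arguments. The sketch below is a hypothetical configuration for a character-recognition dataset, where flips must stay off. The specific values and the `has_flips` helper are my own assumptions for illustration, not recommendations from the Keras documentation.

```python
# Hypothetical augmentation settings for character images.
# All keys are ImageDataGenerator arguments described in this article.
char_augmentation = dict(
    rotation_range=10,        # small rotations only (degrees)
    width_shift_range=0.1,    # shift by up to 10% of the width
    height_shift_range=0.1,   # shift by up to 10% of the height
    zoom_range=0.1,           # zoom factor drawn from [0.9, 1.1]
    horizontal_flip=False,    # flips are unsuitable for characters
    vertical_flip=False,
)


def has_flips(settings):
    """Return True if the settings enable any flip."""
    return bool(settings.get("horizontal_flip") or settings.get("vertical_flip"))


print(has_flips(char_augmentation))  # False

# With Keras installed, the dictionary would be used like this:
# generator = ImageDataGenerator(**char_augmentation)
```

Keeping the settings in a dictionary makes it easy to check them, or to switch between presets, before any images are generated.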
The arguments and their explanations are listed below. The explanations are almost the same as in the Document, though.
-- Easy to use and easy to understand
  -- rotation_range: Integer. Range in degrees (0-180) within which the image is randomly rotated.
  -- width_shift_range: Floating point number (fraction of the width). Randomly shifts the image horizontally within the specified range.
  -- height_shift_range: Floating point number (fraction of the height). Randomly shifts the image vertically within the specified range.
  -- shear_range: Floating point number. Shear intensity (shear angle in the counterclockwise direction, in radians). Applies a shear transformation; put simply, it randomly skews the image at an angle within the specified range.
  -- zoom_range: Floating point number or [lower, upper]. Randomly zooms at a magnification within the specified range. Given a floating point number, [lower, upper] = [1-zoom_range, 1+zoom_range].
  -- channel_shift_range: Floating point number. Randomly shifts the channel values within the range, which changes the color.
  -- horizontal_flip: Boolean. Randomly flips the input horizontally. This setting should not be used for character recognition.
  -- vertical_flip: Boolean. Randomly flips the input vertically. This setting should not be used for character recognition.
-- Ones I don't understand well (I would appreciate it if someone could explain them)
  -- featurewise_center: Boolean. Sets the mean of the input to 0 over the whole dataset.
  -- samplewise_center: Boolean. Sets the mean of each sample to 0.
  -- featurewise_std_normalization: Boolean. Normalizes the input by dividing by the standard deviation of the dataset.
  -- samplewise_std_normalization: Boolean. Normalizes each input by dividing by its own standard deviation.
  -- zca_whitening: Boolean. Applies ZCA whitening.
  -- zca_epsilon: Epsilon for ZCA whitening. The default is 1e-6.
  -- fill_mode: One of {"constant", "nearest", "reflect", "wrap"}. Points outside the boundaries of the input image are filled according to the specified mode.
  -- cval: Floating point number or integer. The value used when fill_mode = "constant".
  -- rescale: Rescaling factor. The default is None. If specified, the data is multiplied by the given value (according to the documentation for this version, before any other transformation).
  -- preprocessing_function: A function applied to each input. According to the documentation, it runs before any other modification. It must take a 3D NumPy tensor (one image) as its argument and return a tensor of the same shape.
  -- data_format: Either "channels_last" or "channels_first". With "channels_last" the input shape is (samples, height, width, channels), and with "channels_first" it is (samples, channels, height, width). The default is the value of image_data_format in the Keras configuration file ~/.keras/keras.json. If you have never changed it, it is "channels_last".
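As far as I can tell, the centering, normalization, rescale, and data_format options boil down to the arithmetic below. This is a conceptual NumPy sketch that does not call Keras. The array `batch` is random dummy data, `eps` is an assumed small constant to avoid division by zero, and the real implementation may differ in details such as which axes the statistics are computed over.

```python
import numpy as np

rng = np.random.RandomState(0)
# Dummy batch in channels_last layout: (samples, height, width, channels).
batch = rng.uniform(0, 255, size=(8, 4, 4, 3)).astype("float32")
eps = 1e-7  # assumed small constant to avoid division by zero

# featurewise_*: one mean/std per channel, computed over the whole dataset.
feat_mean = batch.mean(axis=(0, 1, 2), keepdims=True)
feat_std = batch.std(axis=(0, 1, 2), keepdims=True)
featurewise = (batch - feat_mean) / (feat_std + eps)

# samplewise_*: each image is normalized with its own mean/std.
samp_mean = batch.mean(axis=(1, 2, 3), keepdims=True)
samp_std = batch.std(axis=(1, 2, 3), keepdims=True)
samplewise = (batch - samp_mean) / (samp_std + eps)

# rescale: multiply by a factor, e.g. 1/255 maps [0, 255] to about [0, 1].
rescaled = batch * (1.0 / 255)

# data_format: the same data rearranged into channels_first layout.
channels_first = np.transpose(batch, (0, 3, 1, 2))

print(featurewise.mean(axis=(0, 1, 2)))  # close to 0 for every channel
print(samplewise.mean(axis=(1, 2, 3)))   # close to 0 for every image
print(channels_first.shape)              # (8, 3, 4, 4)
```

The difference between the featurewise and samplewise options is only where the mean and standard deviation come from: the whole dataset in the first case, each individual image in the second.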
I know I keep repeating this, but I am still inexperienced, so please teach me whatever you can!