[TF2.0 application] A case where general-purpose Data Augmentation was parallelized and sped up with TensorFlow's powerful dataset feature

Introduction

This article is a follow-up to my previous articles "The story that the dataset function that can be used with TensorFlow was strong" and "[TF2.0 application] Make Data Augmentation faster with tf.data.Dataset" (https://qiita.com/Suguru_Toyohara/items/49c2914b21615b554afa). It presents a further enhanced version of the Data Augmentation that was touched on briefly there.

I succeeded in writing code that keeps the speed of tf.data.Dataset, uses the keras.preprocessing.image functions, and runs in parallel. The mechanism and the background that led to it are explained after the code.

The code comes first.

Environment setup

First, let's prepare the experimental environment.

init


import tensorflow as tf
import tensorflow.keras as keras
import matplotlib.pyplot as plt
import sklearn
import numpy as np
from tqdm import tqdm
(tr_x,tr_y),(te_x,te_y)=keras.datasets.cifar10.load_data()
tr_x, te_x = tr_x/255.0, te_x/255.0
tr_y, te_y = tr_y.reshape(-1,1), te_y.reshape(-1,1)
model = keras.models.Sequential()
model.add(keras.layers.Convolution2D(32,3,padding="same",activation="relu",input_shape=(32,32,3)))
model.add(keras.layers.Convolution2D(32,3,padding="same",activation="relu"))
model.add(keras.layers.Convolution2D(32,3,padding="same",activation="relu"))
model.add(keras.layers.MaxPooling2D())
model.add(keras.layers.Convolution2D(128,3,padding="same",activation="relu"))
model.add(keras.layers.Convolution2D(128,3,padding="same",activation="relu"))
model.add(keras.layers.Convolution2D(128,3,padding="same",activation="relu"))
model.add(keras.layers.Convolution2D(128,3,padding="same",activation="relu"))
model.add(keras.layers.Convolution2D(128,3,padding="same",activation="relu"))
model.add(keras.layers.MaxPooling2D())
model.add(keras.layers.Convolution2D(256,3,padding="same",activation="relu"))
model.add(keras.layers.Convolution2D(256,3,padding="same",activation="relu"))
model.add(keras.layers.Convolution2D(256,3,padding="same",activation="relu"))
model.add(keras.layers.GlobalAveragePooling2D())
model.add(keras.layers.Dense(1000,activation="relu"))
model.add(keras.layers.Dense(128,activation="relu"))
model.add(keras.layers.Dense(10,activation="softmax"))
model.compile(loss="sparse_categorical_crossentropy",metrics=["accuracy"])

An example of Data Augmentation

Data confirmation

First, let's wrap keras.preprocessing.image.random_rotation so that it can be used with .map.

random_rotate


from tensorflow.keras.preprocessing.image import random_rotation
from joblib import Parallel, delayed

def r_rotate(imgs, degree):
    pics=imgs.numpy()
    degree = degree.numpy()
    
    if tf.rank(imgs)==4:
        X=Parallel(n_jobs=-1)( [delayed(random_rotation)(pic, degree, 0, 1, 2) for pic in pics] )
        X=np.asarray(X)
    elif tf.rank(imgs)==3:
        X=random_rotation(pics, degree, 0, 1, 2)
    return X
@tf.function
def random_rotate(imgs, label):
    x = tf.py_function(r_rotate,[imgs,30],[tf.float32])
    X = x[0]
    X.set_shape(imgs.shape)
    return X, label

This now works. Let's run it and look at the data.

View data


labels = np.array([
    'airplane',
    'automobile',
    'bird',
    'cat',
    'deer',
    'dog',
    'frog',
    'horse',
    'ship',
    'truck'])
tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000).batch(128).map(random_rotate)

plt.figure(figsize=(10,10),facecolor="white")
for b_img,b_label in tr_ds:
    for i, img,label in zip(range(25),b_img,b_label):
        plt.subplot(5,5,i+1)
        plt.xticks([])
        plt.yticks([])
        plt.grid(False)
        plt.imshow(img)
        plt.xlabel(labels[label])
    break
plt.show()

CIFAR10-random-rotate-sample.png

Speed test

Let's check how fast it actually is. First, the speed measured in the previous article, "[TF2.0 application] Make Data Augmentation faster with tf.data.Dataset", was as follows.

result


Train on 50000 samples
50000/50000 [==============================] - 9s 175us/sample - loss: 2.3420 - accuracy: 0.1197
Train on 50000 samples
50000/50000 [==============================] - 7s 131us/sample - loss: 2.0576 - accuracy: 0.2349
Train on 50000 samples
50000/50000 [==============================] - 7s 132us/sample - loss: 1.7687 - accuracy: 0.3435
Train on 50000 samples
50000/50000 [==============================] - 7s 132us/sample - loss: 1.5947 - accuracy: 0.4103
Train on 50000 samples
50000/50000 [==============================] - 7s 132us/sample - loss: 1.4540 - accuracy: 0.4705
CPU times: user 1min 33s, sys: 8.03 s, total: 1min 41s
Wall time: 1min 14s

Next, here are the code and results for this implementation.

dataset


%%time
tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000)
tr_ds = tr_ds.batch(tr_x.shape[0]).map(random_rotate).repeat(5)
tr_ds = tr_ds.prefetch(tf.data.experimental.AUTOTUNE)

for img,label in tr_ds:
    model.fit(x=img,y=label,batch_size=128)

result


Train on 50000 samples
50000/50000 [==============================] - 9s 176us/sample - loss: 1.3960 - accuracy: 0.5021
Train on 50000 samples
50000/50000 [==============================] - 9s 173us/sample - loss: 1.2899 - accuracy: 0.5430
Train on 50000 samples
50000/50000 [==============================] - 9s 175us/sample - loss: 1.2082 - accuracy: 0.5750
Train on 50000 samples
50000/50000 [==============================] - 9s 171us/sample - loss: 1.1050 - accuracy: 0.6133
Train on 50000 samples
50000/50000 [==============================] - 7s 132us/sample - loss: 1.0326 - accuracy: 0.6405
CPU times: user 52 s, sys: 15.4 s, total: 1min 7s
Wall time: 48.7 s
random_rotate_cpu_and_GPU_processing_rate

It looks as though the preprocessing map runs on the CPU while the GPU runs at up to 90%. The total is 48.7 seconds, about 25 seconds shorter than the 1 min 14 s measured before. The time without the map was 35.1 seconds, so the Data Augmentation itself runs fairly quickly. The same approach works for everything in keras.preprocessing.image.
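The comparison can be checked with a little arithmetic. The snippet below is only a worked calculation using the wall times reported in this article; the variable names are my own and are not part of the original code.

```python
# Wall-clock times reported in the article, in seconds.
previous_wall = 60 + 14      # previous article's approach: 1 min 14 s
parallel_map_wall = 48.7     # this article: parallel map + prefetch
no_map_wall = 35.1           # same training loop without the map

# How much faster this implementation is than the previous one.
saved = previous_wall - parallel_map_wall
# How much time the augmentation adds on top of plain training.
augmentation_overhead = parallel_map_wall - no_map_wall

print(f"time saved vs. previous approach: {saved:.1f} s")
print(f"overhead added by augmentation:   {augmentation_overhead:.1f} s")
```

So the augmentation costs roughly 14 seconds over five passes of the data, on top of the 35.1 seconds needed for training alone.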

Porting the Data Augmentation available in Keras

Preparation

show_data


def show_data(tf_dataset):
    for b_img,b_label in tf_dataset:
        for i, img,label in zip(range(25),b_img,b_label):
            plt.subplot(5,5,i+1)
            plt.xticks([])
            plt.yticks([])
            plt.grid(False)
            plt.imshow(img)
            plt.xlabel(labels[label])
        break
    plt.show()

random_shift

Shifts the image at random. You specify the maximum shift as a fraction of the width and height.
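To get a feel for what such a fraction means in pixels, here is a minimal sketch. It assumes that the shift is drawn uniformly between minus and plus the given fraction of the image size, which is my understanding of random_shift; the helper sample_shift_pixels is a hypothetical name used only for this illustration.

```python
import random

def sample_shift_pixels(fraction, size, rng):
    """Sample a shift in pixels, uniformly within +/- fraction of the image size."""
    return rng.uniform(-fraction, fraction) * size

rng = random.Random(0)  # fixed seed so the run is repeatable
# CIFAR10 images are 32 pixels wide; 0.3 is the fraction used in this article.
shifts = [sample_shift_pixels(0.3, 32, rng) for _ in range(1000)]
max_shift = max(abs(s) for s in shifts)

print(f"largest sampled shift: {max_shift:.2f} px (limit: {0.3 * 32:.1f} px)")
```

With a fraction of 0.3 on 32-pixel images, the shift never exceeds 9.6 pixels in either direction.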

random_shift


from tensorflow.keras.preprocessing.image import random_shift
from joblib import Parallel, delayed
def r_shift(imgs,wrg,hrg):
    pics=imgs.numpy()
    w = wrg.numpy()
    h = hrg.numpy()

    if tf.rank(imgs)==4:
        X=Parallel(n_jobs=-1)( [delayed(random_shift)(pic,w,h,0,1,2) for pic in pics] )
        X=np.asarray(X)
    elif tf.rank(imgs)==3:
        X=random_shift(pics, w,h, 0, 1, 2)
    return X
@tf.function
def tf_random_shift(imgs, label):
    x = tf.py_function(r_shift,[imgs,0.3,0.3],[tf.float32])
    X = x[0]
    X.set_shape(imgs.shape)
    return X, label

Data visualization


tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000).batch(128).map(tf_random_shift)

plt.figure(figsize=(10,10),facecolor="white")
show_data(tr_ds)

CIFAR10-random-shift.png

random_shear

Applies a random shear (distortion) to the image. (I don't know the details.)

random_shear


from tensorflow.keras.preprocessing.image import random_shear

def r_shear(imgs,degree):
    pics=imgs.numpy()
    degree = degree.numpy()
    if tf.rank(imgs)==4:
        X=Parallel(n_jobs=-1)( [delayed(random_shear)(pic,degree,0,1,2) for pic in pics] )
        X=np.asarray(X)
    elif tf.rank(imgs)==3:
        X=random_shear(pics,degree,0,1,2)
    return X
@tf.function
def tf_random_shear(imgs, label):
    x = tf.py_function(r_shear,[imgs,30],[tf.float32])
    X = x[0]
    X.set_shape(imgs.shape)
    return X, label

Data confirmation


tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000).batch(128).map(tf_random_shear)

plt.figure(figsize=(10,10),facecolor="white")
show_data(tr_ds)

CIFAR10-random-shear.png

random_zoom

Randomly zoom.

random_zoom


from tensorflow.keras.preprocessing.image import random_zoom

def r_zoom(imgs,range_w,range_h):
    pics=imgs.numpy()
    zoom_range = (range_w.numpy(),range_h.numpy())

    if tf.rank(imgs)==4:
        X=Parallel(n_jobs=-1)( [delayed(random_zoom)(pic,zoom_range,0,1,2) for pic in pics] )
        X=np.asarray(X)
    elif tf.rank(imgs)==3:
        X=random_zoom(pics,zoom_range,0,1,2)
    return X
@tf.function
def tf_random_zoom(imgs, label):
    x = tf.py_function(r_zoom,[imgs,0.5,0.5],[tf.float32])
    X = x[0]
    X.set_shape(imgs.shape)
    return X, label



Output result


tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000).batch(128).map(tf_random_zoom)

plt.figure(figsize=(10,10),facecolor="white")
show_data(tr_ds)

CIFAR10-random-zoom.png

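Every image seems to be zoomed by the same amount ... zoom_range is a (lower, upper) pair, and here both values are 0.5, so the zoom factor never varies. Let's improve it.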
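The reason can be sketched in plain Python. This assumes that random_zoom draws its zoom factor uniformly between the two values of zoom_range, which is how I understand the Keras implementation; sample_zoom_factors is a hypothetical stand-in, not a Keras function.

```python
import random

def sample_zoom_factors(zoom_range, n, seed=0):
    """Draw n zoom factors uniformly from (lower, upper)."""
    rng = random.Random(seed)
    lower, upper = zoom_range
    return [rng.uniform(lower, upper) for _ in range(n)]

# Both bounds equal: the "random" factor can only ever be 0.5.
fixed = sample_zoom_factors((0.5, 0.5), 5)
# Distinct bounds: the factor actually varies from image to image.
varied = sample_zoom_factors((0.5, 1.5), 5)

print("zoom_range=(0.5, 0.5):", fixed)
print("zoom_range=(0.5, 1.5):", [round(z, 3) for z in varied])
```

A range whose two ends differ, such as (0.5, 1.5), is what gives a different zoom per image.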

Improvement

enhanced


from tensorflow.keras.preprocessing.image import random_zoom
import random
def zoom_range_gen(random_state):
    while True:
        x=random.uniform(random_state[0],random_state[1])
        yield (x,x)
def r_zoom(imgs):
    pics=imgs.numpy()
    random_state = [0.5,1.5]
    if tf.rank(imgs)==4:
        X=Parallel(n_jobs=-1)( [delayed(random_zoom)(pic,(x,y),0,1,2) for pic,(x,y) in zip(pics,zoom_range_gen(random_state))])
        X=np.asarray(X)
    elif tf.rank(imgs)==3:
        zoom_range=next(zoom_range_gen(random_state))
        X=random_zoom(pics,zoom_range,0,1,2)
    return X
@tf.function
def tf_random_zoom_enhanced(imgs, label):
    x = tf.py_function(r_zoom,[imgs],[tf.float32])
    X = x[0]
    X.set_shape(imgs.shape)
    return X, label

Let's check the data

Data confirmation


tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000).batch(128).map(tf_random_zoom_enhanced)

plt.figure(figsize=(10,10),facecolor="white")
show_data(tr_ds)

CIFAR10-random-zoom-enhanced.png

Looks good!

Implementing other Augmentations

Next, let's implement the augmentations from the blog "Data Augmentation Summary of Images in NumPy". The Keras augmentations are NumPy based, so NumPy-based augmentations can now be implemented in the same way.

The cat images are quoted from the blog "Data Augmentation Summary of Images in NumPy", and the implementations are also based on it. The source is noted alongside the code as well.

random-flip

random_flip

Here

Let's implement random left-right and up-down flips. TensorFlow already provides both, so we will use them.

random-flip


@tf.function
def flip_left_right(image,label):
    return tf.image.random_flip_left_right(image),label

@tf.function
def flip_up_down(image,label):
    return tf.image.random_flip_up_down(image),label


Data confirmation


tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000).batch(128)
tr_ds = tr_ds.map(flip_left_right).map(flip_up_down)

plt.figure(figsize=(10,10),facecolor="white")
show_data(tr_ds)

CIFAR10-random-flip.png

random-clip

Here we use the Scale Augmentation from the blog.

Scale Augmentation

Here

For the implementation, I referred to the blog.

random-clip


from PIL import Image

def random_crop(pic, crop_size=(28, 28)):
    try:
        h, w, c = pic.shape
    except ValueError:
        raise ValueError("expected a single 3-D image of shape (h, w, c)")
    #Determine the upper left point of the image in the specified section
    top = np.random.randint(0, h - crop_size[0])
    left = np.random.randint(0, w - crop_size[1])

    #Determine the bottom right point to fit the size
    bottom = top + crop_size[0]
    right = left + crop_size[1]

    #Cut out only the intersection of the upper left point and the lower right point
    pic = pic[top:bottom, left:right, :]
    return pic

def scale_augmentation(pic, scale_range=(38, 80), crop_size=32):
    scale_size = np.random.randint(*scale_range)
    Ppic = Image.fromarray(pic)
    Ppic = Ppic.resize((scale_size,scale_size),resample=1)
    pic = np.asarray(Ppic)

    return random_crop(pic, (crop_size, crop_size))

def r_crop(imgs):
    pics=imgs.numpy()
    pics=np.asarray(pics * 255.0,dtype=np.uint8)

    random_state = (38,60)
    crop_size=32
    if tf.rank(imgs)==4:
        X=Parallel(n_jobs=-1)([delayed(scale_augmentation)(pic,random_state,crop_size) for pic in pics ])
        X=np.asarray(X)
    elif tf.rank(imgs)==3:
        X=scale_augmentation(pics,random_state,crop_size)
    
    X=X/255.0
    return X
@tf.function
def tf_random_crop(imgs, label):
    x = tf.py_function(r_crop,[imgs],[tf.float32])
    X = x[0]
    X.set_shape(imgs.shape)
    return X, label

Data confirmation


tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000).batch(128)
tr_ds = tr_ds.map(tf_random_crop)

plt.figure(figsize=(10,10),facecolor="white")
show_data(tr_ds)

CIFAR10-random-crop.png

random-erasing

random-erasing

Here

Let's implement this. I referred to the blog for the implementation.

random_erasing


def random_erasing(pic, p=0.5, s=(0.02, 0.4), r=(0.3, 3)):
    # Decide whether to mask at all (mask with probability p)
    if np.random.rand() > p:
        return pic

    # Pick the pixel value used for the mask at random
    mask_value = np.random.random()

    try:
        h, w, c = pic.shape
    except ValueError:
        raise ValueError("expected a single 3-D image of shape (h, w, c)")
    # Pick the mask area at random between s[0] and s[1] (0.02 to 0.4) times the image area
    mask_area = np.random.randint(h * w * s[0], h * w * s[1])

    # Pick the mask aspect ratio at random from the range r (0.3 to 3)
    mask_aspect_ratio = np.random.rand() * (r[1] - r[0]) + r[0]

    # Derive the mask height and width from the mask area and aspect ratio.
    # Either one may come out larger than the original image, so clamp it.
    mask_height = int(np.sqrt(mask_area / mask_aspect_ratio))
    if mask_height > h - 1:
        mask_height = h - 1
    mask_width = int(mask_aspect_ratio * mask_height)
    if mask_width > w - 1:
        mask_width = w - 1

    top = np.random.randint(0, h - mask_height)
    left = np.random.randint(0, w - mask_width)
    bottom = top + mask_height
    right = left + mask_width
    pic[top:bottom, left:right, :].fill(mask_value)
    return pic

def r_erase(imgs):
    pics=imgs.numpy()

    if tf.rank(imgs)==4:
        X=Parallel(n_jobs=-1)([delayed(random_erasing)(pic) for pic in pics ])
        X=np.asarray(X)
    elif tf.rank(imgs)==3:
        X=random_erasing(pics)
    
    return X
@tf.function
def tf_random_erase(imgs, label):
    x = tf.py_function(r_erase,[imgs],[tf.float32])
    X = x[0]
    X.set_shape(imgs.shape)
    return X, label

Data confirmation


tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000).batch(128)
tr_ds = tr_ds.map(tf_random_erase)

plt.figure(figsize=(10,10),facecolor="white")
show_data(tr_ds)

CIFAR10-random-erase.png

With CIFAR10 the effect is rather harsh, and some images become hard to recognize ...

Important points when implementing general-purpose Augmentation

Here are some points to keep in mind when writing the code. They are basics, so some of you may find them obvious, but I ran into a surprising number of pitfalls, so I will write them down here.

About tf.data.Dataset.map

There are quite a few pitfalls here. The function passed to .map handles its data as Tensors in Graph mode. What I want to say is that Eager Execution does not apply to the Tensors handled by .map. In other words, ordinary operations such as multiplication are neatly converted to TF operations by @tf.function, but operations that need concrete values, such as .numpy(), cannot be used. It is easiest to think of the function as merely describing a formula such as x + y = z.

The way around this is to run that part in Eager mode, and the tool for doing so is tf.py_function.

What are Eager Mode and Graph Mode in the first place?

Graph mode is like a formula. This is what we did with sess.run in the TF1.x series. You first design a formula such as x + y = z and then assign values to the variables (that is, run the Session); only then, with 2 + 3 = 5, does the Tensor z hold the value 5. This is what made the TF1.x series hard to understand.

Eager Mode is like typing in an expression and immediately getting the evaluated value back. It is as if sess.run were executed automatically while the graph is still kept, which makes it easier to understand. (I'm not at all familiar with this area, so please refer to the official guide, and please let me know if I have made a mistake.)
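The difference can be illustrated with a plain-Python analogy. This is not TensorFlow code, only a sketch of the idea: build_formula is a hypothetical helper that plays the role of defining a graph, and calling the result plays the role of running the session.

```python
# Eager style: every statement is evaluated immediately.
x, y = 2, 3
z_eager = x + y  # z_eager already holds 5 at this point

# Graph style (analogy): first describe the formula, feed the values later.
def build_formula():
    # Nothing is computed here; this only defines z = x + y.
    return lambda x, y: x + y

formula = build_formula()  # the "graph" exists, but z has no value yet
z_graph = formula(2, 3)    # comparable to running the session with x=2, y=3

print(z_eager, z_graph)
```

In the graph style, the value 5 appears only at the last step, which matches the description of x + y = z above.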

About tf.py_function

tf.py_function lets part of the computation run in Eager Mode, as described in the guide. Put differently, you hand TensorFlow a black-box function f(x) and specify only what comes out of it, and everything around it runs in Graph Mode. As a formula, TensorFlow sees something like x + f(a, b) = y, and what it needs to know is the data types of the inputs and outputs.

Pseudocode for py_function


def function(input_1, input_2):
    # This part runs in Eager Mode
    # Do some processing
    return output_1, output_2

[output_1, output_2] = tf.py_function(function, [input_1, input_2], [type_of_output_1, type_of_output_2])

Concretely, it looks like this.

py_function


def function(data1,data2):
    return data1+data2,data1*data2
@tf.function
def process(tensor1,tensor2):
    [data1,data2]=tf.py_function(function,[tensor1,tensor2],[tf.float32,tf.float32])
    return data1, data2

In other words, the function named function here is executed in Eager Mode at run time, so its arguments arrive as tf.Tensor objects that hold concrete values. A Tensor behaves differently in Eager Mode and in Graph Mode, so be careful.
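As a usage sketch, the same two functions can be called with concrete tensors. The input values 2.0 and 3.0 are my own choice for illustration and assume a TensorFlow 2.x installation.

```python
import tensorflow as tf

def function(data1, data2):
    # Runs in Eager Mode: data1 and data2 hold concrete values here.
    return data1 + data2, data1 * data2

@tf.function
def process(tensor1, tensor2):
    # Traced in Graph Mode; only the body of `function` runs eagerly.
    data1, data2 = tf.py_function(
        function, [tensor1, tensor2], [tf.float32, tf.float32])
    return data1, data2

total, product = process(tf.constant(2.0), tf.constant(3.0))
print(float(total), float(product))  # prints: 5.0 6.0
```

The caller only sees two float32 tensors; whatever happens inside function stays a black box to the graph.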

Behavior within the function specified by py_function

The function is executed in Eager Mode and receives tf.Tensor objects, so it is fine to call .numpy() first and return the result as a NumPy array. This is where many misunderstandings arise: tf.data.Dataset.map itself only works in Graph mode, and on top of that, some parts need to run in Eager mode.
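The pattern can be reduced to a few lines. The following is a minimal sketch, assuming TensorFlow 2.x and NumPy; add_one_numpy and map_fn are hypothetical names, and adding 1.0 stands in for a real NumPy-based augmentation.

```python
import numpy as np
import tensorflow as tf

def add_one_numpy(x):
    # Runs eagerly: x holds real values, so .numpy() is available.
    arr = x.numpy()
    return (arr + 1.0).astype(np.float32)

def map_fn(x):
    # Traced in Graph mode by Dataset.map: x is symbolic here.
    y = tf.py_function(add_one_numpy, [x], tf.float32)
    # py_function drops the static shape, so restore it from the input.
    y.set_shape(x.shape)
    return y

data = np.zeros((4, 2), dtype=np.float32)
ds = tf.data.Dataset.from_tensor_slices(data).batch(2).map(map_fn)

# Iterating the dataset happens eagerly, so .numpy() works again here.
result = np.concatenate([batch.numpy() for batch in ds])
print(result.shape)
```

The NumPy code lives entirely inside add_one_numpy, while map_fn only declares the output type and shape to the graph.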

The whole picture is summarized in the figures below.

dataset-graph-mode

This seems to be the behavior of tf.data.Dataset.from_tensor_slices. (I'm sorry, this is not verified information.) When the data is actually emitted, it looks as follows.

dataset-eager-mode

If you code with this in mind, you will be able to code smoothly without being confused by mysterious bugs.

In conclusion

With this, I hope I have conveyed how to develop general-purpose Data Augmentation. If you want to build some other augmentation like this, I hope you will build it in the same way. I am really relieved to have solved the mystery of py_function. Please do make use of it.

Acknowledgments

The blog "Data Augmentation Summary of Images in NumPy" was very helpful for the implementation. I would like to take this opportunity to express my thanks.
