Classification of guitar images by machine learning Part 2

Summary

For the operating environment and other details, please see the previous post, "Classification of guitar images by machine learning, Part 1".

**2017/10/23 postscript:** The approach in this post appears to be similar to the methods known as Cutout [^1] and Random Erasing [^2].

**2018/02/25 postscript:** Inducing misclassification by adding noise is known as adversarial perturbation, and it seems to be an active research topic. The subject of this article is simply "low generalization performance is causing me trouble", whereas research on adversarial perturbation mainly deals with more sophisticated ways of fooling a model and how to defend against them (robustness), and it is discussed from a security point of view. A very good summary of adversarial perturbation has been published, so I am linking to it.

Problem

In the experiment from the previous post, the predicted class changed completely when a small amount of graffiti was drawn on the input image. This means that if any part of the target object is hidden behind something, an unpredictable misclassification can result.

Human cognition is more robust: in many cases a person can identify an object by overall judgment even when part of it is not visible. Can't we make the discriminating ability of a machine learning model more robust in the same way?

Adding noise to the input image

Most of the training data used in the previous post consisted of "clean" photos, such as product images from e-commerce sites. There are almost no images in which the recognition target is partially hidden by something else. Under such ideal conditions, the class can be identified well from local features alone (for example, just the layout of the pickups and controls), so I suspect the model never develops the ability to "look at the whole and capture features in combination". (It overfits to ideal conditions and fails to generalize.)

So let's set things up so that the model cannot classify well unless it looks at the whole image and captures features in combination.

The method is simple: randomly hide parts of the training images. This makes it hard to classify from a few local features alone, so more global, composite features should inevitably be learned in preference.

This time, I added multiple rectangles to the training data with the following code.

import numpy as np

def add_dust_to_batch_images(x):
    """Paint random uniform-brightness rectangles ("dust") onto a batch, in place."""
    batch_size, height, width, channels = x.shape
    for i in range(batch_size):
        num_of_dust = np.random.randint(32)  # 0 to 31 rectangles per image
        dusts = zip(
            (np.clip(np.random.randn(num_of_dust) / 4. + 0.5, 0., 1.) * height).astype(int),  # center y
            (np.clip(np.random.randn(num_of_dust) / 4. + 0.5, 0., 1.) * width).astype(int),  # center x
            np.random.randint(1, 8, num_of_dust),  # half width (1 to 7)
            np.random.randint(1, 8, num_of_dust),  # half height (1 to 7)
            np.random.randint(0, 256, num_of_dust))  # brightness (0 to 255)
        for pos_y, pos_x, half_w, half_h, b in dusts:
            # Clip the rectangle to the image. Because slice ends are exclusive
            # and the bound is size - 1, the last row and column are never painted.
            top = np.clip(pos_y - half_h, 0, height - 1)
            bottom = np.clip(pos_y + half_h, 0, height - 1)
            left = np.clip(pos_x - half_w, 0, width - 1)
            right = np.clip(pos_x + half_w, 0, width - 1)
            x[i, top:bottom, left:right, :] = b  # same value on every channel
    return x

# ...

noised_train_flow = ((add_dust_to_batch_images(x), y) for x, y in train_flow)

The number, position, size, and brightness of the rectangles are random. Since the guitar usually appears near the center of the image, I made the rectangles concentrate around the center.
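To make the "concentrated near the center" part concrete, the following sketch samples relative positions with the same expression used for the rectangle centers in the code above and measures where they land. The helper name `sample_relative_positions`, the sample size, and the seed are assumptions for illustration only and are not part of the original experiment.

```python
import numpy as np

def sample_relative_positions(n, rng):
    """Sample n relative positions in [0, 1], concentrated around 0.5.

    Mirrors the expression used for the rectangle centers: a normal
    distribution with mean 0.5 and standard deviation 0.25, clipped to [0, 1].
    """
    return np.clip(rng.standard_normal(n) / 4.0 + 0.5, 0.0, 1.0)

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
positions = sample_relative_positions(100_000, rng)

# Fraction of centers that land in the central half of the axis.
central_fraction = np.mean((positions >= 0.25) & (positions <= 0.75))
# Fraction of centers that were clipped onto the image border.
edge_fraction = np.mean((positions == 0.0) | (positions == 1.0))

print(f"central half: {central_fraction:.3f}, clipped to edge: {edge_fraction:.3f}")
```

Because the standard deviation is a quarter of the image size, roughly two thirds of the centers fall in the central half of each axis, and a few percent are clipped onto the border.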

Here is an image after this processing.

noise_demo.jpg

You can see that parts of the body outline and the assembly are hidden by the added rectangles.

From the viewpoint of adding noise, I also considered inserting Dropout immediately after the input. However, the aim this time is to "hide local features", as described above, so I judged that Dropout, which adds noise uniformly over the whole image, was unsuitable.
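The difference between the two kinds of noise can be illustrated with a small NumPy sketch. It models only the masking part of input-level dropout (no rescaling of the surviving pixels) and compares it with a single rectangle that hides a similar number of pixels. The image size, drop probability, rectangle position, and the helper `max_hidden_fraction` are all hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
height, width = 64, 64          # hypothetical image size

# Dropout-style mask: every pixel is hidden independently with probability p.
p = 0.1
dropout_mask = rng.random((height, width)) < p

# Rectangle mask: one contiguous 20 x 20 block (about 10% of the pixels).
rect_mask = np.zeros((height, width), dtype=bool)
rect_mask[20:40, 20:40] = True

def max_hidden_fraction(mask, patch=8):
    """Largest hidden fraction among non-overlapping patch x patch tiles."""
    h, w = mask.shape
    tiles = mask.reshape(h // patch, patch, w // patch, patch)
    return tiles.mean(axis=(1, 3)).max()

dropout_worst = max_hidden_fraction(dropout_mask)
rect_worst = max_hidden_fraction(rect_mask)

print(f"hidden overall: dropout {dropout_mask.mean():.3f}, rectangle {rect_mask.mean():.3f}")
print(f"worst 8x8 tile: dropout {dropout_worst:.3f}, rectangle {rect_worst:.3f}")
```

Both masks hide a comparable share of the image overall, but only the rectangle wipes out an entire local region, whereas the dropout-style mask leaves every tile mostly visible. That is the property the rectangles are meant to provide.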

Training results

Let's train the model with noise added to the input. As in the previous post, this is transfer learning using a ResNet-50 pre-trained on ImageNet.

Accuracy changed over the course of training as follows.

noised_trans.png

Surprisingly, adding noise had almost no effect on learning speed.

The best score was at step 54, with a training accuracy of 99.95% and a validation accuracy of 100%. Let's run inference again using the snapshot from that point.

Inference on various images

For the Jazzmaster, Les Paul, and acoustic guitar images without graffiti, the results are as good as last time, so I omit them.

The one to watch is the picture of the Jazzmaster with graffiti, which was classified as "Flying V" for some reason last time. How about this time?

jm2.jpg

The result has improved as intended.

On the other hand, the scores changed for this Duo Sonic.

ds.JPG

Last time it was classified as "Mustang", but this time it is "Stratocaster". As a result of capturing more global features, the shapes of the pickguard and bridge may now be taken into account.

Impressions

I feel that I have more or less achieved my aim (loosely speaking).

I suspect that what I tried here is more or less common knowledge in academia, but applying it to a familiar subject deepened my understanding and was interesting.
