PyTorch practice. Previous articles in this series: Implementation of simple regression analysis in Keras, Wine classification by Keras (Machine Sommelier by Keras), and Dataset preparation for PyTorch.
--In the previous articles, we covered examples such as regression and classification using Keras, along with the outline and implementation of machine learning and deep learning.
--This time, I build a neural network that trains on and classifies images I collected myself (the backbone is selectable).
--The deep learning frameworks used are Keras and PyTorch, and the differences between the two are also compared.
--The program is here ↓ (the execution environment is described at the bottom of the page; the dataset is also included): GitHub - moriitkys/MyOwnNN
--For the dataset, 62 images of hook wrenches and 62 images of spanner wrenches were collected and augmented for use as training and validation images (Figure 1-a, b). The task is tool classification.
Figure 1-a. Hook Wrench | Figure 1-b. Spanner Wrench |
--The input to the self-made NN (MyNet) is 28x28x3 and the output has 2 classes, so this is a classification problem. The network structure is detailed below.
--Training runs for 40 epochs, the optimizer is SGD, and the loss function is categorical cross-entropy.
--As test images (unknown images), 2 hook wrenches and 2 spanner wrenches that are not used for training or validation are prepared.
--The UI is the same as the one used in the previous Dataset preparation for PyTorch.
--As a bonus, I also tried classifying the logos of the two machine tool makers carried over from the previous article.
The self-made NN is called MyNet in this article. It consists of an input layer (28*28*3 nodes), an intermediate layer (200 nodes), and an output layer (2 outputs). This time the 3 RGB channels are taken into account. The conceptual diagram of the structure is shown in Figure 2.
Figure 2. Conceptual diagram of MyNet |
In the intermediate layer, ReLU is applied as the activation function, and Dropout is also applied. In the output layer, the softmax function is applied as the activation function to obtain an output for each of the 2 classes.
Figure 3. Conceptual diagram of terms and learning in machine learning |
・ **Neurons, Nodes** The part that receives an input signal and outputs something. As shown in Figure 3, each rounded part is called a neuron (node); it applies some function to its input signal and emits an output signal.
・ **Activation function**: ReLU, softmax A function that transforms the value each neuron (node) receives from its input before passing it on as output. Something like the $f(\cdot)$ shown in the figure.
Figure 4-a. Softmax function | Figure 4-b. ReLU | Figure 4-c. Sigmoid function |
・ **Loss function**: categorical_crossentropy The loss value is the error between the value predicted by the neural network and the correct answer, and the loss function is the function used to compute that error. As shown in the figure, it calculates the error from the model's output and the correct label.
・ **Optimization function**: SGD The optimization function changes the weights so that the value of the loss function decreases. As shown in the figure, it computes gradients from the error and the weights and adjusts the weights accordingly. (A small NumPy sketch of these terms appears after this list.)
・ ** Keras ** A high-level neural network library written in Python that can be run on TensorFlow, CNTK, or Theano.
・ **PyTorch** A Python-based scientific computing package targeted at two sets of audiences: a replacement for NumPy that uses the power of GPUs, and a deep learning research platform that provides maximum flexibility and speed.
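As a rough illustration of the terms above, here is a minimal NumPy sketch; the functions and the example values (w, grad, lr) are my own simplifications, not code from the repository.

import numpy as np

def relu(x):
    # ReLU: pass positive values through, clip negatives to zero
    return np.maximum(0, x)

def softmax(x):
    # Softmax: turn raw scores into class probabilities that sum to 1
    e = np.exp(x - np.max(x))
    return e / e.sum()

def categorical_crossentropy(y_true, y_pred):
    # Loss: error between the one-hot correct label and the predicted probabilities
    return -np.sum(y_true * np.log(y_pred + 1e-7))

# One SGD step: move the weights against the gradient of the loss
w = np.array([0.5, -0.3])
grad = np.array([0.1, -0.2])  # gradient of the loss w.r.t. w (made-up values)
lr = 0.001
w = w - lr * grad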
See GitHub for the entire program. The following is an excerpt of the MyNet part in Keras.
# Build a model (Keras)
from keras.applications.mobilenet import MobileNet
from keras.applications.resnet50 import ResNet50
from keras.layers.pooling import GlobalAveragePooling2D
from keras.layers.core import Dense, Dropout, Flatten
from keras.models import Model, load_model, Sequential
from keras.optimizers import Adam, RMSprop, SGD

base_model = Sequential()
top_model = Sequential()
INPUT_SHAPE = (img_size[0], img_size[1], 3)
neuron_total = 500

# Excerpt: the preceding if/elif branches for ResNet50 and Mobilenet are omitted
elif type_backbone == "MyNet":
    INPUT_SHAPE = (img_size[0], img_size[1], 3)
    # Input layer -> intermediate layer (flattened image as input)
    base_model.add(Dense(neuron_total, activation='relu',
                         input_shape=(INPUT_SHAPE[0]*INPUT_SHAPE[1]*INPUT_SHAPE[2],)))
    base_model.add(Dropout(0.5))
    # Intermediate layer -> output layer (softmax over nb_classes)
    top_model.add(Dense(nb_classes, activation='softmax',
                        input_shape=base_model.output_shape[1:]))

# Concatenate base_model (backbone) with top_model
model = Model(input=base_model.input, output=top_model(base_model.output))
print("{} layers".format(len(model.layers)))

# Compile the model
model.compile(
    optimizer=SGD(lr=0.001),
    loss='categorical_crossentropy',
    metrics=["accuracy"]
)
model.summary()
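The training call itself is not part of this excerpt. A minimal sketch of how such a model could be trained with model.fit, assuming the images are already loaded into NumPy arrays (x_train, y_train, x_val, y_val are placeholder names, not necessarily those used in the repository):

from keras.utils import to_categorical

# MyNet starts with a Dense layer, so flatten each image into a vector first
x_train_flat = x_train.reshape(len(x_train), INPUT_SHAPE[0]*INPUT_SHAPE[1]*INPUT_SHAPE[2])
x_val_flat = x_val.reshape(len(x_val), INPUT_SHAPE[0]*INPUT_SHAPE[1]*INPUT_SHAPE[2])

history = model.fit(
    x_train_flat, to_categorical(y_train, nb_classes),
    validation_data=(x_val_flat, to_categorical(y_val, nb_classes)),
    epochs=40, batch_size=32, verbose=1
)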
See GitHub for the entire program. The following is an excerpt of the MyNet part in PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models
from torch.autograd import Variable
from torchsummary import summary

neuron_total = 200
INPUT_SHAPE = (img_size[0], img_size[1], 3)
print(INPUT_SHAPE)
print(nb_classes)

# Create my model
class MyNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Input layer to intermediate layer
        self.l1 = nn.Linear(INPUT_SHAPE[0]*INPUT_SHAPE[1]*INPUT_SHAPE[2], neuron_total)
        self.dropout1 = torch.nn.Dropout2d(p=0.5)
        # Intermediate layer to output layer
        self.l2 = nn.Linear(neuron_total, 2)

    def forward(self, x):  # Forward propagation
        # x.view reshapes the tensor; -1 lets the batch dimension be inferred automatically
        x = x.view(-1, INPUT_SHAPE[0]*INPUT_SHAPE[1]*INPUT_SHAPE[2])
        x = self.l1(x)
        x = F.relu(x)  # ReLU in the intermediate layer, as described above
        x = self.dropout1(x)
        x = self.l2(x)
        return x

if type_backbone == "ResNet50":
    model = Resnet()
elif type_backbone == "Mobilenet":
    model = Mobilenet()
elif type_backbone == "MyNet":
    model = MyNet()
model = model.to(device)

# Show the model
summary(model, (3, img_size[1], img_size[0]))  # channel, w, h
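The Keras excerpt above compiles the model with SGD and categorical cross-entropy; the corresponding setup is not shown in this PyTorch excerpt. A minimal sketch of what it could look like (the variable names criterion and optimizer are mine, not necessarily those used in the repository):

import torch.optim as optim

# CrossEntropyLoss applies log_softmax + NLL internally, so the model outputs raw scores (logits)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001)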
First, as mentioned in Dataset preparation for PyTorch, Keras takes its data as NumPy arrays, while PyTorch uses DataLoader and tensors. This is the first difference.
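For reference, a minimal sketch of wrapping NumPy arrays into a PyTorch DataLoader (x_train, y_train and the batch size are placeholders):

import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

# Keras consumes NumPy arrays directly; PyTorch wants tensors wrapped in a Dataset/DataLoader
x_train_t = torch.from_numpy(x_train.astype(np.float32))
y_train_t = torch.from_numpy(y_train.astype(np.int64))
train_loader = DataLoader(TensorDataset(x_train_t, y_train_t), batch_size=32, shuffle=True)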
Next, regarding how the model is built: Keras automatically matches shapes when connecting layers with Dense and so on, but in PyTorch the shapes must be specified explicitly. For example, if you add a few intermediate layers to the network in Figure 2, in Keras it looks like this:
base_model = Sequential()
top_model = Sequential()
INPUT_SHAPE = (img_size[0], img_size[1], 3)
base_model.add(Dense(neuron_total, activation='relu',
                     input_shape=(INPUT_SHAPE[0]*INPUT_SHAPE[1]*INPUT_SHAPE[2],)))
base_model.add(Dense(neuron_total, activation='relu'))
base_model.add(Dense(neuron_total, activation='relu'))
base_model.add(Dropout(0.5))
top_model.add(Dense(nb_classes, activation='softmax',
                    input_shape=base_model.output_shape[1:]))
# Concatenate base_model (backbone) with top_model
model = Model(input=base_model.input, output=top_model(base_model.output))
and in PyTorch it looks like this:
class MyNet2(nn.Module):
    def __init__(self):
        super().__init__()
        # Input layer to first intermediate layer
        self.fc1 = nn.Linear(INPUT_SHAPE[0]*INPUT_SHAPE[1]*INPUT_SHAPE[2], neuron_total)
        # First intermediate layer to second intermediate layer
        self.fc2 = nn.Linear(neuron_total, int(neuron_total/2))
        self.dropout1 = torch.nn.Dropout2d(p=0.5)
        # Second intermediate layer to output layer
        self.fc3 = nn.Linear(int(neuron_total/2), 2)

    def forward(self, x):  # Forward propagation
        # x.view reshapes the tensor; -1 lets the batch dimension be inferred automatically
        x = x.view(-1, INPUT_SHAPE[0]*INPUT_SHAPE[1]*INPUT_SHAPE[2])
        x = self.fc1(x)
        x = self.fc2(x)
        x = F.relu(x)
        x = self.dropout1(x)
        x = self.fc3(x)
        return x
In PyTorch, the number of input and output nodes must be specified explicitly for every layer.
I am not entirely sure about this, but in Keras you should not need to switch Dropout on and off between training and evaluation yourself. In PyTorch, Dropout is disabled by model.eval(), so when loading the test images you explicitly put the model into evaluation mode:
param = torch.load(weights_folder_path + "/" + best_weights_path)
model.load_state_dict(param, strict=False)
model.eval()
# ~ Inference
Comparing the two model summaries (Figure 5), the number of parameters matches exactly.
Figure 5. Comparison of model.summary() output: Keras (left) and PyTorch (right) |
It is a minor point, but in Keras you do not need to change the code to use the GPU, whereas in PyTorch the code has to be rewritten, for example:
#image, label = Variable(image), Variable(label)
image, label = Variable(image).cuda(), Variable(label).cuda()
so that the tensors are explicitly moved to the GPU.
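Note that Variable is deprecated in recent PyTorch versions; a sketch of the equivalent, currently recommended way to move data to the GPU:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Tensors (and the model) must be moved to the GPU explicitly
image, label = image.to(device), label.to(device)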
In Keras, a single call like model.fit runs the training/validation loop for the specified number of epochs. In PyTorch, you loop over the epochs yourself, as follows:
def train(epoch):
    # ~ omitted (see the sketch below)
    ...

def validation():
    # ~ omitted
    ...

for epoch in range(1, total_epochs + 1):
    train(epoch)
    validation()
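The bodies of train() and validation() are omitted above; a minimal sketch of what one training epoch typically looks like, assuming train_loader, criterion, optimizer and device are defined as in the earlier sketches:

def train(epoch):
    model.train()                        # enable Dropout during training
    for image, label in train_loader:
        image, label = image.to(device), label.to(device)
        optimizer.zero_grad()            # clear gradients from the previous step
        output = model(image)            # forward propagation
        loss = criterion(output, label)  # loss between logits and correct labels
        loss.backward()                  # back propagation
        optimizer.step()                 # SGD weight update
    print("epoch {}: last batch loss {:.4f}".format(epoch, loss.item()))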
Also, on the PyTorch side log_softmax is effectively used by default (the loss applies it internally and the model outputs raw scores), so the class outputs do not sum to 1; apply softmax or convert them yourself if you need probabilities.
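A hedged sketch of converting the raw outputs to class probabilities at inference time:

model.eval()                            # disable Dropout for inference
with torch.no_grad():
    output = model(image)               # raw scores (logits)
    probs = F.softmax(output, dim=1)    # per-class probabilities that sum to 1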
First, when I checked the PC's resource usage with Task Manager, there were the following differences.
Figure 6. Task Manager performance during training (per 10 epochs): Keras (left) and PyTorch (right) |
Memory usage was lower on the PyTorch side. Since this program holds the Keras datasets in lists and NumPy arrays, it inevitably consumes more memory. GPU usage was also lower on the PyTorch side.
Next, we compare the training speed of the Keras and PyTorch networks. The table below summarizes the time [s] required to train each network for 40 epochs.
 | Keras | PyTorch |
---|---|---|
ResNet | 3520 s | 3640 s |
Mobilenet | 1600 s | 1760 s |
MyNet | 40 s | 680 s |
For Keras I set verbose=1 in model.fit and simply read off the elapsed seconds that it prints. It would be more accurate to compute the time from the per-step timings, but that is tedious, so these are approximate values. From the table above, PyTorch is slightly slower (roughly 3 seconds more per epoch), and MyNet in particular is much slower. On the other hand, PyTorch seems more energy efficient (?). I expected PyTorch to be faster, so my code may be to blame. My impression is that PyTorch runs at roughly the same speed while using fewer resources.
The Loss and Accuracy curves and the inference results on the test images are summarized below. The learning curves are not great, but the results are reasonable.
Figure 7. Loss and Accuracy over epochs during training (Keras) |
Figure 8-a. Inference results by ResNet50 (Keras) |
Figure 8-b. Inference results by Mobilenet v1 (Keras) |
Figure 8-c. Inference results by MyNet (Keras) |
The Loss and Accuracy curves and the inference results on the test images are summarized below. They are similar to the Keras results, so they are shown in the collapsed section.
Figure 9. Loss and Accuracy over epochs during training (PyTorch) |
Figure 10-a. Inference results by ResNet50 (PyTorch) |
Figure 10-b. Inference results by Mobilenet v1 (PyTorch) |
Figure 10-c. Inference results by MyNet (PyTorch) |
Both show the same tendency (since they were trained under almost the same conditions).
With both Keras and PyTorch, ResNet and Mobilenet can classify the tools, but the MNIST-level MyNet cannot. However, looking at how the loss decreases, even ResNet and Mobilenet do not seem to be learning well. This time the test images are similar to the training data, which is probably why they were classified correctly. For classes as similar as a hook wrench and a spanner wrench, about 60 images each appears to be too little data. Moreover, I suspect that even with all the data one could gather, this setup could not classify them.
Incidentally, the result of training MyNet with 500 intermediate-layer nodes for 100 epochs is as follows.
Figure 11. Result of training MyNet with 500 intermediate-layer nodes for 100 epochs |
The validation loss does not decrease. This is probably a problem that a simple, shallow neural network cannot solve. Some ingenuity is needed, such as increasing the number of layers or using a CNN (Convolutional Neural Network).
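As one possible direction (not implemented in this article), here is a minimal PyTorch CNN sketch for the same 28x28x3 input and 2 classes; the layer sizes are arbitrary choices of mine:

class SmallCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)   # 3x28x28 -> 16x28x28
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)  # 16x14x14 -> 32x14x14
        self.pool = nn.MaxPool2d(2)                                # halves height and width
        self.fc = nn.Linear(32 * 7 * 7, 2)                         # 32x7x7 flattened -> 2 classes

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 32 * 7 * 7)
        return self.fc(x)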
As a bonus, we classify the logos of the two machine tool makers that appeared in the previous Machine Sommelier by Keras article. There is a clear difference in shape, but can a neural network classify them? I try this with MyNet. For training and validation I used Makino Milling Machine Co., Ltd. and Okuma logos collected online, and for the test I used logos I drew by hand myself.
Figure 12-a. Handwritten Makino Milling logo | Figure 12-b. Handwritten Okuma logo |
The transition of Loss and Accuracy is as follows.
Figure 13-a. Transition of Loss with respect to epoch | Figure 13-b. Transition of Accuracy with respect to epoch |
I think the network learns better here than with the hook wrench and spanner wrench. The inference results look like this:
Figure 14-a. Makino Milling logo inference result | Figure 14-b. Okuma logo inference result |
These results are classified very well. When there is a clear difference in shape, as with the logos, classification seems possible even with a neural network that is not deep.
--A hook wrench and a spanner wrench cannot be classified by a simple (shallow) neural network
--A corporate logo can be classified even by a network that is not deep
https://keras.io/ja/
https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html
https://qiita.com/sheep96/items/0c2c8216d566f58882aa
https://rightcode.co.jp/blog/information-technology/pytorch-mnist-learning
https://water2litter.net/rum/post/pytorch_tutorial_classifier/
https://qiita.com/jyori112/items/aad5703c1537c0139edb
https://pystyle.info/pytorch-cnn-based-classification-model-with-fashion-mnist/
https://pytorch.org/docs/stable/torchvision/models.html
https://qiita.com/perrying/items/857df46bb6cdc3047bd8
https://qiita.com/sakaia/items/5e8375d82db197222669
https://discuss.pytorch.org/t/low-accuracy-when-loading-the-model-and-testing/44991/5