Hello, this is Kawashima! I've been thinking about posting on Qiita for a long time, and now I can do it! (^^)
Today, I would like to explain a program that trains a neural network on the MNIST handwritten digit data with PyTorch.
The topic itself is not new. However, few of the existing articles explain the details carefully.
In this article, I would like to write it in as much detail as possible.
That said, most of the explanation is in the comments. Be patient and read the comments line by line!
Let's start while looking at the source code!
# -*- coding: utf-8 -*-
# -----------------------------------------------------------------------------
import torch
print(torch.__version__)
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torchvision.datasets import MNIST
from torch.autograd import Variable
import torch.nn as nn
import torch.optim as optimizer
For study purposes, try running the program while changing the values here.
# -----------------------------------------------------------------------------
# Batch size of each mini-batch
BATCH_SIZE = 4
# Maximum number of training epochs
MAX_EPOCH = 2
# Output progress every this many mini-batches
PROGRESS_SHOW_PER_BATCH_COUNT = 1000
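As a concrete example: the MNIST training set has 60,000 images, so with BATCH_SIZE = 4 one epoch consists of 15,000 mini-batches, and with PROGRESS_SHOW_PER_BATCH_COUNT = 1000 the progress line in the training loop below is printed 15 times per epoch.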
Here, we define a neural network with a three-layer structure. Note that this is not a convolutional neural network.
# -----------------------------------------------------------------------------
# Definition of the multi-layer perceptron class
class MLP(nn.Module):
    def __init__(self):
        '''
        For example, activation functions to be applied to each layer are defined in forward() below
        '''
        super().__init__()
        # Input layer
        self.layer1 = nn.Linear(28 * 28, 100)
        # Intermediate layer (hidden layer)
        self.layer2 = nn.Linear(100, 50)
        # Output layer
        self.layer3 = nn.Linear(50, 10)

    def forward(self, input_data):
        '''
        Definition of the network (forward propagation, connecting the layers)
        '''
        # Flatten input_data to shape (batch size, 28 * 28)
        # -1 lets PyTorch infer the batch dimension automatically
        input_data = input_data.view(-1, 28 * 28)
        # Pass input_data from the previous layer to layer1
        input_data = self.layer1(input_data)
        # Pass input_data from the previous layer to layer2
        input_data = self.layer2(input_data)
        # Pass input_data from the previous layer to layer3
        input_data = self.layer3(input_data)
        return input_data

# Create an instance of the training model
model = MLP()
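By the way, forward() above simply chains the three linear layers, and nn.CrossEntropyLoss used later works directly on those raw scores. If you want to experiment with adding nonlinear activation functions, here is a minimal sketch of a variant (my addition, not the code of this article):
# --- Sketch (my addition, not part of the program above): the same three layers
# --- with ReLU activations inserted between them, plus a quick shape check
import torch
import torch.nn as nn
import torch.nn.functional as F

class MLPWithReLU(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(28 * 28, 100)
        self.layer2 = nn.Linear(100, 50)
        self.layer3 = nn.Linear(50, 10)

    def forward(self, input_data):
        input_data = input_data.view(-1, 28 * 28)
        # ReLU after each hidden layer; the output layer stays linear
        input_data = F.relu(self.layer1(input_data))
        input_data = F.relu(self.layer2(input_data))
        return self.layer3(input_data)

# One dummy image of shape 1 x 28 x 28 -> 10 class scores
print(MLPWithReLU()(torch.zeros(1, 1, 28, 28)).shape)  # torch.Size([1, 10])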
# -----------------------------------------------------------------------------
#Prepare the training data
#
print('----------Preparation of learning data----------')
data_folder = '~/data'
transform = transforms.Compose([
    # Convert the data type to Tensor
    transforms.ToTensor()
])

# Training data
train_data_with_labels = MNIST(
    data_folder, train=True, download=True, transform=transform)

train_data_loader = DataLoader(
    train_data_with_labels, batch_size=BATCH_SIZE, shuffle=True)

# Validation data
test_data_with_labels = MNIST(
    data_folder, train=False, download=True, transform=transform)

test_data_loader = DataLoader(
    test_data_with_labels, batch_size=BATCH_SIZE, shuffle=True)
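If you want to see what the DataLoader actually hands to the training loop, a minimal check (my addition; it reuses train_data_loader defined above) looks like this:
# --- Sketch (my addition): inspect one mini-batch from train_data_loader
images, labels = next(iter(train_data_loader))
print(images.shape)  # torch.Size([4, 1, 28, 28]) with BATCH_SIZE = 4
print(labels.shape)  # torch.Size([4])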
Next, specify the loss function, learning rate, and so on for training. This part assumes that you already understand what a neural network is. If you would like to review the prerequisite knowledge, please see my note:
https://note.com/kawashimaken/n/nfeebd2502b87?magazine_key=me13f2d6e0ab8
# -----------------------------------------------------------------------------
# Get ready for training
# The loss function uses the cross entropy error function
lossResult = nn.CrossEntropyLoss()
# SGD (note: this rebinds the name 'optimizer', imported above as torch.optim, to the optimizer instance)
optimizer = optimizer.SGD(model.parameters(), lr=0.01)

print('----------Start learning----------')
# Start training
for epoch in range(MAX_EPOCH):
    # Initialize the accumulated error
    total_loss = 0.0
    # enumerate yields the batch index together with the data
    for i, data in enumerate(train_data_loader):
        # Extract one batch of training data and teacher label data from data
        train_data, teacher_labels = data
        # Convert the inputs to torch.autograd.Variable
        train_data, teacher_labels = Variable(train_data), Variable(
            teacher_labels)
        # Delete (reset, clear) the previously calculated gradient information
        optimizer.zero_grad()
        # Give the model the training data and let it make predictions
        outputs = model(train_data)
        # Calculate the loss (the error between the outputs and the teacher labels)
        loss = lossResult(outputs, teacher_labels)
        # Calculate the gradients (backpropagation)
        loss.backward()
        # Perform one optimization step (update the parameters; a common step for most optimizers)
        optimizer.step()
        # loss.item() converts the loss to a Python number; accumulate the error
        total_loss += loss.item()
        # Show progress every PROGRESS_SHOW_PER_BATCH_COUNT mini-batches
        if i % PROGRESS_SHOW_PER_BATCH_COUNT == PROGRESS_SHOW_PER_BATCH_COUNT - 1:
            print('i=', i)
            print(
                'Learning progress: [EPOCH:%d, %d batches x %d -> %d images trained] Learning error (loss): %.3f' %
                (epoch + 1, i + 1, BATCH_SIZE, (i + 1) * BATCH_SIZE,
                 total_loss / PROGRESS_SHOW_PER_BATCH_COUNT))
            # Reset the accumulated error
            total_loss = 0.0

print('End of learning')
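The article itself does not save the trained model, but if you want to keep the trained weights for later use, a minimal sketch (my addition; the file name is arbitrary) is:
# --- Sketch (my addition): save and reload the trained weights
torch.save(model.state_dict(), 'mnist_mlp.pth')
model_restored = MLP()
model_restored.load_state_dict(torch.load('mnist_mlp.pth'))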
Once training is finished and you have a trained model, the next step is to actually "use" it: run inference to see how accurate the trained model is.
This is an almost indispensable step.
# -----------------------------------------------------------------------------
# Verification: calculate the correct answer rate over all of the validation image data
print('----------Calculate the correct answer rate for all verification image data----------')
# Total number of data items (number of measurement targets)
total = 0
# Counter for correct answers
count_when_correct = 0

for data in test_data_loader:
    # Extract one batch from the validation data loader and unpack it
    test_data, teacher_labels = data
    # Convert the test data, pass it to the model, and let it judge
    results = model(Variable(test_data))
    # Get the prediction
    print(torch.max(results, 1))
    # Result:
    # torch.return_types.max(
    # values=tensor([1.2185, 5.8557, 2.8262, 4.7874], grad_fn=<MaxBackward0>),
    # indices=tensor([2, 8, 8, 8]))
    # torch.max(tensor, axis)
    #   values     indices
    #     ↓           ↓
    #     _       predicted
    _, predicted = torch.max(results.data, 1)
    # For each row, extract the maximum value (the most certain label) of the inference result.
    # Values we do not use are assigned to an underscore (a throwaway variable).
    # Since axis=1 here, the maximum is taken over each row.
    print('_', _)
    # Result: contains the maximum value of each row
    # tensor([1.6123, 5.6203, 3.0886, 3.8317], grad_fn=<MaxBackward0>)
    print('predicted', predicted)
    # Result: contains the index (location) of each maximum value, i.e. the predicted label
    # tensor([3, 9, 1, 0])
    #
    # print('teacher_labels', teacher_labels)
    # Result:
    # teacher_labels
    # tensor([3, 5, 3, 8])
    # teacher_labels
    # tensor([3, 5, 1, 7])
    # ...
    #
    # print('teacher_labels.size(0)', teacher_labels.size(0))
    # teacher_labels.size(0) 4
    total += teacher_labels.size(0)
    count_when_correct += (predicted == teacher_labels).sum()

print('count_when_correct:%d' % (count_when_correct))
print('total:%d' % (total))
print('Correct answer rate:%d / %d = %f' % (count_when_correct, total,
                                            int(count_when_correct) / int(total)))
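By the way, the verification loop above still builds the computation graph for every batch. A more economical way to write the same accuracy calculation (my sketch, not the original code) wraps inference in torch.no_grad():
# --- Sketch (my addition): the same accuracy computation without building gradients
correct = 0
seen = 0
with torch.no_grad():
    for test_data, teacher_labels in test_data_loader:
        _, predicted = torch.max(model(test_data), 1)
        seen += teacher_labels.size(0)
        correct += (predicted == teacher_labels).sum().item()
print('Correct answer rate: %d / %d = %f' % (correct, seen, correct / seen))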
How was it? Do you now have a rough picture of image recognition with a neural network in PyTorch? Most of this article ended up being "read the comments", but if it gets many likes, I will post more articles like this. I look forward to working with you.
https://github.com/kawashimaken/salon/blob/master/pytorch/mnist.py
In addition, the comments may be updated from time to time, so the latest code is maintained on GitHub; if you like it, please follow, star, or bookmark. m(_ _)m