PyTorch is an open source machine learning library for Python.
PyTorch defines a class called **Tensor** (`torch.Tensor`), which is used to store and operate on multidimensional arrays. It is similar to NumPy's `ndarray`, but it can also run on CUDA-enabled Nvidia GPUs.
[Source]-> PyTorch-Wikipedia
Machine learning libraries are roughly divided into two types: **Define by Run** and **Define and Run**.
Define by Run: the network is defined as it runs. Since the network can be changed dynamically, flexible designs are possible; for example, you can switch networks according to the size of the data or change the design at each iteration. Well-known libraries include PyTorch and Chainer.
Define and Run: the network is defined first and then executed. You can easily configure a network just by combining parts like Lego blocks, which is concise and easy to understand. Well-known libraries include Keras and TensorFlow.
PyTorch belongs to the Define by Run type of machine learning library, which builds the network while executing it. Because of this difference, it is best to use each library where it fits: a Define and Run library such as Keras is simple for everyday data analysis work, while a Define by Run library such as PyTorch is better suited to research and to difficult tasks that require detailed design.
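As a concrete illustration, the following is a minimal sketch of what Define by Run makes possible: ordinary Python control flow decides the network structure at execution time. (The module name `DynamicNet` and its layer sizes are made up for illustration; this is not from the article above.)
import torch
import torch.nn as nn

class DynamicNet(nn.Module):
    """Hypothetical example: the forward pass changes depending on the input."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 10)
        self.out = nn.Linear(10, 1)

    def forward(self, x):
        h = torch.relu(self.fc(x))
        # The graph is built as the code runs, so we can repeat a layer
        # a data-dependent number of times, which cannot be expressed as
        # a single fixed, pre-compiled graph.
        for _ in range(int(x.sum().abs()) % 3):
            h = torch.relu(self.fc(h))
        return self.out(h)

net = DynamicNet()
y = net(torch.rand(4, 10))  # the graph for this call is created on the fly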
[Reference source]-> "Introduction to PyTorch" How to use & what is the difference from Tensorflow, Keras, etc.? --Proclassist
The biggest difference between PyTorch and Chainer is that PyTorch is widely used in the international community, while Chainer is used mainly in Japan. This is because Chainer was developed by Preferred Networks (PFN), a Japanese company. However, in December 2019 PFN announced that it would end major updates of Chainer and move its research and development to PyTorch, which changed the relationship between the two libraries (see here for details). Therefore, if you want to use a Define by Run machine learning library going forward, PyTorch is the safe choice.
Let's check the PyTorch specifications while quoting from the official PyTorch tutorial.
What is PyTorch? -- PyTorch Tutorials 1.4.0 documentation
The `Tensor` used in PyTorch is similar to NumPy's `ndarray`, but a `Tensor` can also be computed on the GPU for faster calculation. The following is a summary of how to use PyTorch's `Tensor` in comparison with NumPy.
import torch
import numpy as np
An array with all zeros.
# Tensor
x_t = torch.zeros(2, 3)
# Numpy
x_n = np.zeros((2,3))
print('Tensor:\n',x_t,'\n')
print('Numpy:\n',x_n)
# ---Output---
#Tensor:
# tensor([[0., 0., 0.],
# [0., 0., 0.]])
#Numpy:
# [[0. 0. 0.]
# [0. 0. 0.]]
An array with all elements 1.
# Tensor
x_t = torch.ones(2,3)
# Numpy
x_n = np.ones((2,3))
print('Tensor:\n',x_t,'\n')
print('Numpy:\n',x_n)
# ---Output---
#Tensor:
# tensor([[1., 1., 1.],
# [1., 1., 1.]])
#Numpy:
# [[1. 1. 1.]
# [1. 1. 1.]]
An array that specifies the values of the elements.
# Tensor
x_t = torch.tensor([[5,3],[10,6]])
# Numpy
x_n = np.array([[5,3],[10,6]])
print('Tensor:\n',x_t,'\n')
print('Numpy:\n',x_n)
# ---Output---
#Tensor:
# tensor([[ 5, 3],
# [10, 6]])
#Numpy:
# [[ 5 3]
# [10 6]]
An array in which element values are specified by random numbers.
# Tensor
x_t = torch.rand(2,3)
# Numpy
x_n = np.random.rand(2,3)
print('Tensor:\n',x_t,'\n')
print('Numpy:\n',x_n)
# ---Output---
#Tensor:
# tensor([[0.5266, 0.1276, 0.6704],
# [0.0412, 0.5800, 0.0312]])
#Numpy:
# [[0.08877971 0.51718009 0.99738679]
# [0.35288525 0.68630145 0.73313903]]
Each element of the array can be accessed like `x[0,1]` (this gets the element in the 1st row and 2nd column of the array `x`).
# Tensor
x12_t = x_t[0,1]
# Numpy
x12_n = x_n[0,1]
print('Tensor:\n',x12_t,'\n')
print('Numpy:\n',x12_n)
# ---Output---
#Tensor:
# tensor(0.1276)
#Numpy:
# 0.5171800941956144
Note that when accessing an array element, NumPy returns a plain number, whereas PyTorch returns a `Tensor` rather than a number. Therefore, an element extracted this way cannot be used directly as a scalar quantity in PyTorch. If you want to retrieve a plain number as in NumPy, you need to call `Tensor.item()`.
x12_value = x12_t.item()
print(x12_t)
print(x12_value)
# ---Output---
# tensor(0.1276)
# 0.12760692834854126
With PyTorch, you can perform the four arithmetic operations just as you would with NumPy.
# Tensor
x_t = torch.Tensor([1,2,3])
y_t = torch.Tensor([2,2,2])
add_t = x_t + y_t
sub_t = x_t - y_t
mul_t = x_t * y_t
div_t = x_t / y_t
print('Tensor:\nAddition:\n',add_t,'\nSubtraction:\n',sub_t,
'\nMultiplication:\n',mul_t,'\nDivision:\n',div_t,'\n')
# Numpy
x_n = np.array([1,2,3])
y_n = np.array([2,2,2])
add_n = x_n + y_n
sub_n = x_n - y_n
mul_n = x_n * y_n
div_n = x_n / y_n
print('Numpy:\nAddition:\n',add_n,'\nSubtraction:\n',sub_n,
'\nMultiplication:\n',mul_n,'\nDivision:\n',div_n)
# ---Output---
#Tensor:
#Addition:
# tensor([3., 4., 5.])
#Subtraction:
# tensor([-1., 0., 1.])
#Multiplication:
# tensor([2., 4., 6.])
#Division:
# tensor([0.5000, 1.0000, 1.5000])
#
#Numpy:
#Addition:
# [3 4 5]
#Subtraction:
# [-1 0 1]
#Multiplication:
# [2 4 6]
#Division:
# [0.5 1. 1.5]
The shape of the array (number of rows and columns) can be obtained from the `shape` attribute. It behaves the same way as in NumPy.
# Tensor
x_t = torch.rand(4,3)
row_t = x_t.shape[0]
column_t = x_t.shape[1]
print('Tensor:\n','row: ',row_t,'column: ',column_t)
# Numpy
x_n = np.random.rand(4,3)
row_n = x_n.shape[0]
column_n = x_n.shape[1]
print('Numpy:\n','row: ',row_n,'column: ',column_n)
# ---Output---
#Tensor:
# row: 4 column: 3
#Numpy:
# row: 4 column: 3
To change the shape of an array, you typically use `.view()` in PyTorch and `.reshape()` in NumPy. However, `.reshape()` can also be used on a PyTorch `Tensor`, just as in NumPy.
# Tensor
x_t = torch.rand(4,3)
y_t = x_t.view(12)
z_t = x_t.view(2,-1)
print('Tensor:\n',x_t,'\n',y_t,'\n',z_t,'\n')
# Numpy
x_n = np.random.rand(4,3)
y_n = x_n.reshape(12)
z_n = x_n.reshape([2,-1])
print('Numpy:\n',x_n,'\n',y_n,'\n',z_n)
# ---Output---
#Tensor:
# tensor([[0.5357, 0.2716, 0.2651],
# [0.6570, 0.0844, 0.9729],
# [0.4436, 0.9271, 0.4013],
# [0.8725, 0.2952, 0.1330]])
# tensor([0.5357, 0.2716, 0.2651, 0.6570, 0.0844, 0.9729, 0.4436, 0.9271, 0.4013,
# 0.8725, 0.2952, 0.1330])
# tensor([[0.5357, 0.2716, 0.2651, 0.6570, 0.0844, 0.9729],
# [0.4436, 0.9271, 0.4013, 0.8725, 0.2952, 0.1330]])
#
#Numpy:
# [[0.02711389 0.24172801 0.01202486]
# [0.59552453 0.49906154 0.81377212]
# [0.24744639 0.58570244 0.26464142]
# [0.14519645 0.03607043 0.46616757]]
# [0.02711389 0.24172801 0.01202486 0.59552453 0.49906154 0.81377212
# 0.24744639 0.58570244 0.26464142 0.14519645 0.03607043 0.46616757]
# [[0.02711389 0.24172801 0.01202486 0.59552453 0.49906154 0.81377212]
# [0.24744639 0.58570244 0.26464142 0.14519645 0.03607043 0.46616757]]
The following is the case of using `.reshape()`.
# Tensor
x_t = torch.rand(4,3)
y_t = x_t.reshape(2,-1)
#y_t = torch.reshape(x_t,[2,-1]) <-- Also works
print('Tensor:\n',y_t,'\n')
# Numpy
x_n = np.random.rand(4,3)
y_n = x_n.reshape(2,-1)
#y_n = np.reshape(x_n,[2,-1]) <-- Also works
print('Numpy:\n',y_n)
# ---Output---
#Tensor:
#tensor([[0.0617, 0.4898, 0.4745, 0.8218, 0.3760, 0.1556],
# [0.3192, 0.5886, 0.8385, 0.5321, 0.9758, 0.8254]])
#
#Numpy:
#[[0.60080911 0.55132561 0.75930606 0.03275005 0.83148483 0.48780054]
# [0.10971541 0.02317271 0.22571149 0.95286975 0.93045979 0.82358474]]
Array transposition is done with `.transpose()` or `.t()` in PyTorch, and with `.transpose()` or `.T` in NumPy.
# Tensor
x_t = torch.rand(3,2)
xt_t = x_t.transpose(0,1)
#xt_t = torch.transpose(x_t,0,1)
#xt_t = x_t.t()
print('Tensor:\n',x_t,'\n',xt_t)
# Numpy
x_n = np.random.rand(3,2)
xt_n = x_n.transpose()
#xt_n = np.transpose(x_n)
#xt_n = x_n.T
print('Numpy:\n',x_n,'\n',xt_n)
# ---Output---
#Tensor:
# tensor([[0.8743, 0.8418],
# [0.6551, 0.2240],
# [0.9447, 0.2824]])
# tensor([[0.8743, 0.6551, 0.9447],
# [0.8418, 0.2240, 0.2824]])
#Numpy:
# [[0.80380702 0.81511741]
# [0.29398279 0.78025418]
# [0.19421487 0.43054298]]
# [[0.80380702 0.29398279 0.19421487]
# [0.81511741 0.78025418 0.43054298]]
Tensor --> ndarray
To convert from a `Tensor` to an `ndarray`, use `Tensor.numpy()`. An operation such as `a = a + 1` creates a new `Tensor`, so the previously converted `ndarray` is not affected by it. If you want the change to be reflected in the `ndarray` (which shares memory with the source `Tensor`), you need to use an in-place operation (append `_` to the end of the function name, e.g. `add_()`).
a = torch.ones(5)
b = a.numpy()
a = a + 1
print('a = ',a)
print('b = ',b)
# ---Output---
# a = tensor([2., 2., 2., 2., 2.])
# b = [1. 1. 1. 1. 1.]
a = torch.ones(5)
b = a.numpy()
a.add_(1)
#torch.add(a,1,out=a) <-- Same operation
print('a = ',a)
print('b = ',b)
# ---Output---
# a = tensor([2., 2., 2., 2., 2.])
# b = [2. 2. 2. 2. 2.]
Tensor <-- ndarray
To convert from an `ndarray` to a `Tensor`, use `torch.from_numpy(ndarray)`. The resulting `Tensor` shares memory with the `ndarray`, so in-place changes to the `ndarray` are reflected in the `Tensor`.
a = np.ones(5)
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print('a = ',a)
print('b = ',b)
# ---Output---
# a = [2. 2. 2. 2. 2.]
# b = tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
CUDA Tensors
A `Tensor` can be moved to another device using the `.to()` method. This allows you to move a `Tensor` from CPU memory to GPU memory and perform the calculations there.
x = torch.rand(2,3)
if torch.cuda.is_available():
    device = torch.device("cuda")          # a CUDA device object
    y = torch.ones_like(x, device=device)  # directly create a tensor on GPU
    x = x.to(device)                       # or just use strings ``.to("cuda")``
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))       # ``.to`` can also change dtype together!
# ---Output---
#tensor([[1.1181, 1.1125, 1.3122],
# [1.1282, 1.5595, 1.4443]], device='cuda:0')
#tensor([[1.1181, 1.1125, 1.3122],
# [1.1282, 1.5595, 1.4443]], dtype=torch.float64)
Autograd: Automatic Differentiation -- PyTorch Tutorials 1.4.0 documentation
By setting the `requires_grad` attribute of a `torch.Tensor` to `True`, all of the computation history on it is tracked. When the computation is finished, calling the `backward()` method computes all of the gradients automatically. The gradients are stored in the `grad` attribute.
If you want to stop tracking the computation history, you can call the `detach()` method to detach the `Tensor` from history tracking.
Each `Tensor` also has a `grad_fn` attribute. This attribute references the `Function` that created the `Tensor`. (Strictly speaking, a `Tensor` created directly by the user does not have a `grad_fn`; it is the `Tensor` produced as the result of a computation that is given a `grad_fn`.)
x = torch.ones(2, 2, requires_grad=True)
print(x)
# ---Output---
#tensor([[1., 1.],
# [1., 1.]], requires_grad=True)
y = x + 2
print(y)
# ---Output---
#tensor([[3., 3.],
# [3., 3.]], grad_fn=<AddBackward0>)
print(x.grad_fn)
print(y.grad_fn)
# ---Output---
# None
# <AddBackward0 object at 0x7f2285d93940>
z = y * y * 3
out = z.mean()
print(z)
print(out)
# ---Output---
#tensor([[27., 27.],
# [27., 27.]], grad_fn=<MulBackward0>)
#tensor(27., grad_fn=<MeanBackward0>)
print(z.grad_fn)
# ---Output---
#<MulBackward0 object at 0x7f2285d93ac8>
out.backward()
print(x.grad)
# ---Output---
#tensor([[4.5000, 4.5000],
# [4.5000, 4.5000]])
Let us check the final result by hand. Since
out = \frac{1}{4} \sum_{i} z_i \\
z_i = y_i \cdot y_i \cdot 3 = 3 \cdot (x_i+2)^2
we have
\frac{\partial out}{\partial x_i} = \frac{1}{4} \cdot 3 \cdot 2 \cdot (x_i+2) = \frac{3}{2}(x_i+2) = 4.5 \qquad (x_i = 1)
which confirms the value obtained by `backward()`.
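As a supplement, here is a minimal sketch of the `detach()` method mentioned above (not part of the quoted tutorial code): it returns a `Tensor` that is cut off from the computation history, so no gradients flow through it.
x = torch.ones(2, 2, requires_grad=True)
y = x + 2                 # tracked: y has a grad_fn (<AddBackward0 ...>)
y_d = y.detach()          # detached: same values, but cut off from the history
print(y_d.grad_fn)        # None
print(y_d.requires_grad)  # False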
The main points of this article are summarized below.
--PyTorch is a Define by Run machine learning library.
--It uses an array type called `torch.Tensor` that enables fast computation and automatic differentiation. It can be used (defined and operated on) in almost the same way as NumPy's `numpy.ndarray`, and the two can easily be converted to each other.
--By setting the `requires_grad` attribute of a `torch.Tensor` to `True`, the computation history can be tracked, and calling the `backward()` method at the end of the computation performs automatic differentiation. This is very useful for updating parameters with the error backpropagation method in neural networks (a small sketch follows below).
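As a simple illustration of that last point, here is a minimal sketch of a single gradient-descent update of one parameter using `requires_grad` and `backward()`. (The data, the initial value of `w`, and the learning rate are made up for illustration.)
x = torch.tensor([1.0, 2.0, 3.0])
t = torch.tensor([2.0, 4.0, 6.0])          # targets of y = 2x
w = torch.tensor(0.5, requires_grad=True)  # parameter to be learned

loss = ((w * x - t) ** 2).mean()  # mean squared error
loss.backward()                   # d(loss)/dw is computed automatically

with torch.no_grad():             # the update itself must not be tracked
    w -= 0.1 * w.grad             # one gradient-descent step
w.grad.zero_()                    # clear the gradient for the next iteration
print(w)                          # tensor(1.9000, requires_grad=True): closer to the optimum 2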