In writing this article, I referred to *Python Machine Learning Programming* and its accompanying GitHub repository. I am very grateful to its authors.
A neuron (nerve cell) is a cell that becomes excited when stimulated and transmits that stimulus to other cells.
A neuron looks something like the image below.
[Nikkei Cross Tech](https://xtech.nikkei.com/dm/atcl/feature/15/032300023/00003/)
It is said that there are about 200 billion neurons in the brain, and they are connected to one another as shown in the image below.
[Earth Seminar 36-3](http://blog.livedoor.jp/nara_suimeishi/archives/51595095.html)
These junctions between neurons are called synapses.
A simple diagram of a neuron is shown below.
There can be many input values, $ x_1 ... x_m $, but the output value is always either "fire" or "do not fire". Here, the output is 1 when the neuron fires and -1 when it does not (the choice of output values is arbitrary). By the way, $ w $ stands for weight, which determines the importance of each input value, and $ y $ represents the output value.
Expressed mathematically, this becomes:

z = x_1w_1 + ... + x_mw_m = \sum_{i=1}^{m} x_iw_i

The new quantity $ z $ introduced here is called the total input, or net input in English.
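For example, with made-up numbers (these values are only for illustration and do not appear in the original article), the total input is just a dot product:

```python
import numpy as np

# illustrative input values and weights
x = np.array([0.5, 1.0, -0.3])
w = np.array([0.2, 0.4, 0.1])

# total input z = x_1*w_1 + ... + x_m*w_m
z = np.dot(x, w)
print(z)  # 0.5*0.2 + 1.0*0.4 + (-0.3)*0.1 = 0.47
```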
The $ \theta $ shown in the image above is called the threshold: when $ z \geq \theta $, the neuron fires and 1 is output, while when $ z < \theta $, it does not fire and -1 is output. If we move $ \theta $ to the left-hand side (treating $ -\theta $ as a bias term that is always added to the total input), the firing condition becomes $ z \geq 0 $, and the decision rule can be written as:
f(z) = \left\{
\begin{array}{ll}
1 & (z \geq 0) \\
-1 & (z \lt 0)
\end{array}
\right.
Here $ f(z) $ is the **decision function**: it outputs 1 when $ z $ is 0 or more and -1 when $ z $ is less than 0.
As the graph of this function shows, the moment $ z $ reaches 0 the neuron fires and 1 is output.
A function that jumps abruptly at a certain threshold like this is called a step function; this version (outputting -1 instead of 0 below the threshold) is a variant of the **Heaviside step function**.
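As a minimal sketch (my own addition), this decision function can be written in one line with NumPy, using the 1 / -1 convention from above:

```python
import numpy as np

def decision_function(z):
    # 1 ("fire") when z >= 0, otherwise -1 ("do not fire")
    return np.where(z >= 0.0, 1, -1)

print(decision_function(0.47))   # 1
print(decision_function(-0.2))   # -1
```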
So far, we have talked about a simple model of the neuron called the formal neuron. Next, I will explain an algorithm called the perceptron.
Introduced in 1958, the simple perceptron runs on a very simple algorithm:

1. Initialize the weights to 0 or small random numbers.
2. For each input sample, calculate the output value and update the weights.

That's it.
The formula for updating the weight is as follows.
W_j := W_j + \Delta W_j
Each weight $ W_j $ on the left is updated by adding $ \Delta W_j $, where $ \Delta W_j $ can be written as

\Delta W_j = \eta\:(\:y^{(i)}-\hat{y}^{(i)}\:)\: x_{j}^{(i)}

Here, $ y^{(i)} $ is the true class label (the correct classification), $ \hat{y}^{(i)} $ is the output value (the classification result), $ \eta $ is the learning rate, and $ x_{j}^{(i)} $ is the input value.
Now that the update formula is in place, let's try it with concrete numbers.
For example, consider a sample that has been misclassified: the true label is 1, but the prediction is -1.
\Delta W_j = \eta\:(\:1-(-1)\:)\: x_{j}^{(i)} = \eta\:(2)\:x_{j}^{(i)}
If $ \eta $ is 1 and $ x_{j}^{(i)} $ is 0.5, the weight increases by 1.
This makes that sample's contribution to the total input ($ x_{j}^{(i)} W_j $) a larger positive number, so the sample is less likely to be misclassified again.
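The arithmetic of this example can be spelled out in a few lines (a sketch with the same illustrative numbers as above):

```python
# one weight update for a misclassified sample
eta = 1.0      # learning rate
x_j = 0.5      # input value
y_true = 1     # true class label
y_hat = -1     # predicted label (misclassified)

delta_w = eta * (y_true - y_hat) * x_j
print(delta_w)  # 1 * (1 - (-1)) * 0.5 = 1.0, so the weight increases by 1
```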
The **Simple Perceptron** simply keeps repeating this update until no samples are misclassified.
Simple perceptrons cannot separate two classes that are not linearly separable.
Also, because it cannot perform linear separation, a simple perceptron cannot learn the XOR operation. (What is the XOR operation? Consider two input values $ x_1 $ and $ x_2 $: the output is 1 when exactly one of them is 1, and 0 otherwise.)
| $ x_1 $ | $ x_2 $ | output |
|:---:|:---:|:---:|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 1 | 0 |
| 1 | 0 | 1 |
Plotting these four points gives the picture below.
(Image source: Machine learning that even high school graduates can understand)
As you can see, this is not linearly separable.
In order to solve the XOR operation, it is necessary to implement a multi-layer perceptron.
Development environment: Google Colab (Chrome 83, macOS High Sierra)
```python
import numpy as np


class Perceptron(object):
    def __init__(self, eta=0.01, n_iter=50, random_state=1):
        self.eta = eta                    # learning rate
        self.n_iter = n_iter              # number of passes over the training data
        self.random_state = random_state  # seed for the weight initialization

    def fit(self, X, y):
        # initialize the weights with small random numbers (w_[0] is the bias)
        rgen = np.random.RandomState(self.random_state)
        self.w_ = rgen.normal(loc=0.0, scale=0.01, size=1 + X.shape[1])
        self.errors_ = []

        for _ in range(self.n_iter):
            errors = 0
            for xi, target in zip(X, y):
                # Delta w = eta * (y - y_hat) * x
                update = self.eta * (target - self.predict(xi))
                self.w_[1:] += update * xi
                self.w_[0] += update
                errors += int(update != 0.0)
            # record the number of misclassifications in each epoch
            self.errors_.append(errors)
        return self

    def net_input(self, X):
        # total input z = X . w + bias
        return np.dot(X, self.w_[1:]) + self.w_[0]

    def predict(self, X):
        # decision function: 1 if z >= 0, otherwise -1
        return np.where(self.net_input(X) >= 0.0, 1, -1)
```
Original code: GitHub
```python
# Load the data, check it, and define X and y
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data',
                 header=None)
df.head()

# use the first 100 samples (setosa and versicolor) and two features
y = df.iloc[0:100, 4].values
y = np.where(y == 'Iris-setosa', -1, 1)   # setosa -> -1, versicolor -> 1
X = df.iloc[0:100, [0, 2]].values         # sepal length and petal length
```
Original code: GitHub
```python
from matplotlib.colors import ListedColormap

def plot_decision_regions(X, y, classifier, resolution=0.02):
    # setup marker generator and color map
    markers = ('s', 'x', 'o', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    cmap = ListedColormap(colors[:len(np.unique(y))])

    # plot the decision surface
    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
                           np.arange(x2_min, x2_max, resolution))
    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    Z = Z.reshape(xx1.shape)
    plt.contourf(xx1, xx2, Z, alpha=0.3, cmap=cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())

    # plot class samples
    for idx, cl in enumerate(np.unique(y)):
        plt.scatter(x=X[y == cl, 0],
                    y=X[y == cl, 1],
                    alpha=0.8,
                    c=colors[idx],
                    marker=markers[idx],
                    label=cl,
                    edgecolor='black')
```
Original code: GitHub
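The plotting cell below uses a trained model `ppn`, but the training step itself does not appear above. A minimal version of that step (the hyperparameters here are illustrative) would be:

```python
# train the perceptron defined earlier on the Iris data
ppn = Perceptron(eta=0.1, n_iter=10)
ppn.fit(X, y)

# the number of misclassifications per epoch should drop to 0
print(ppn.errors_)
```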
```python
plot_decision_regions(X, y, classifier=ppn)
plt.xlabel('sepal length [cm]')
plt.ylabel('petal length [cm]')
plt.legend(loc='upper left')
# plt.savefig('images/02_08.png', dpi=300)
plt.show()
```
Original code: GitHub
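Finally, returning to the XOR discussion above, here is a small sketch (my own addition, not part of the original code) that trains the same Perceptron class on the XOR truth table. Because XOR is not linearly separable, the per-epoch error count never reaches 0:

```python
# XOR truth table, with outputs mapped to the -1 / 1 convention used in this article
X_xor = np.array([[0, 0], [0, 1], [1, 1], [1, 0]])
y_xor = np.array([-1, 1, -1, 1])

ppn_xor = Perceptron(eta=0.1, n_iter=20)
ppn_xor.fit(X_xor, y_xor)
print(ppn_xor.errors_)  # never reaches 0
```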
Next, I would like to try implementing a multilayer perceptron.