Machine learning algorithm (simple perceptron)

Introduction

Step by step, I will study the theory, a Python implementation, and an analysis using scikit-learn of the algorithms covered earlier in "Classification of Machine Learning". I'm writing this for my own learning, so please forgive any mistakes.

Starting this time, I will work on classification problems, beginning with the basic perceptron.

I referred to the following sites this time. Thank you very much.

What is two-class classification?

Two-class classification means outputting "1" or "0" (or "1" or "-1") for an input. Instead of saying "it may break down with 60% probability", it gives a black-and-white answer: it breaks down or it doesn't. There are various kinds of two-class classifiers, and the **perceptron** is the most basic one.

Overview of Perceptron

The perceptron is a model inspired by nerve cells: it applies weights to many inputs and outputs 1 when their weighted sum exceeds a certain threshold. It's the picture you often see in illustrations.

perceptron.png

Mathematical expression

Take $n$ inputs $\boldsymbol{x} = (x_0, x_1, \cdots, x_{n})$ and weights $\boldsymbol{w} = (w_0, w_1, \cdots, w_{n})$. Adding all the weighted inputs together gives

w_0x_0+w_1x_1+\cdots+w_{n}x_{n} = \sum_{i=0}^{n}w_ix_i = \boldsymbol{w}^T\boldsymbol{x}

where $T$ denotes the transpose. If this value is positive the perceptron outputs 1, and if it is negative it outputs -1. A function that outputs -1 or 1 this way is called a step function.
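As a minimal sketch, such a step function could be written like this (the name step is mine, just for illustration):

import numpy as np

def step(z):
    # outputs 1 when z >= 0, otherwise -1
    return np.where(z >= 0, 1, -1)

print(step(2.5), step(-0.3))  # 1 -1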

The constant term that does not depend on the input is called the **bias term**; if we let the bias term be $w_0$ and fix $x_0 = 1$, the formula above can be used as is.

Let's write what we have so far in Python

Since Python can compute the product of matrices with "@", writing the perceptron's input as input and its output as output gives:

import numpy as np

w = np.array([1.,-2.,3.,-4.])  # weights (w[0] plays the role of the bias term)
x = np.array([1.,2.,3.,4.])    # inputs (x[0] = 1 is the bias input)

input = w.T @ x                   # weighted sum w^T x
output = 1 if input >= 0 else -1  # step function

It's easy.

Training the perceptron

The perceptron is trained by so-called "supervised learning": for given inputs $\boldsymbol{x}$ with correct labels $\boldsymbol{t} = (t_0, t_1, \cdots, t_n)$, we need to find a $\boldsymbol{w}$ such that $\boldsymbol{w}^T\boldsymbol{x}$ returns the correct label.

As with regression, this requires learning from teacher data. For the perceptron, the same approach is effective: define a loss function and update the parameters $\boldsymbol{w}$ to minimize it.

Perceptron loss function

So what kind of loss function should we set? The idea is that a correctly classified sample incurs no loss, while a misclassified one incurs a loss proportional to its distance from the boundary that separates the two classes.

The hinge function is often used to meet this demand; scikit-learn's perceptron also appears to use the hinge function.

A hinge function is zero up to a certain value $a$ and increases linearly beyond it, so it can be written as $h(x) = \max(0, x-a)$. The hinge function is also used in SVM (Support Vector Machine). SVM is important, so I will come back to it soon.
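As a quick numeric sketch of the hinge function (the name hinge and the argument a are my own, for illustration):

import numpy as np

def hinge(x, a=0.0):
    # zero up to a, then increases linearly with x
    return np.maximum(0.0, x - a)

print(hinge(np.array([-2., -0.5, 0., 1., 3.])))  # [0. 0. 0. 1. 3.]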

For the loss function: for each sample, if the correct label $t_n$ and the predicted value $\mathrm{step}(\boldsymbol{w}^T\boldsymbol{x}_n)$ agree, then $t_n\boldsymbol{w}^T\boldsymbol{x}_n$ is positive, and if they differ it is negative. The smaller the loss, the better, so the loss function $L$ can be written as

L = \sum_{n}\max(0, -t_n\boldsymbol{w}^T\boldsymbol{x}_n)

and we look for the $\boldsymbol{w}$ that minimizes it using gradient descent.
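As a sketch, this loss can be computed over a toy dataset like so (all the values here are made up for illustration):

import numpy as np

w = np.array([0.5, -1.0])                       # current weights
X = np.array([[1., 2.], [1., -1.], [1., 0.5]])  # samples, with the bias input x_0 = 1
t = np.array([1, -1, 1])                        # correct labels

margins = t * (X @ w)                  # t_n * w^T x_n for each sample
L = np.sum(np.maximum(0.0, -margins))  # only misclassified samples contribute
print(L)  # 3.0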

Partially differentiating $L$ with respect to $\boldsymbol{w}$ (for a misclassified sample) gives

\frac{\partial L}{\partial \boldsymbol{w}} = -t_n\boldsymbol{x}_n

so the update rule for the weights can be written as

\boldsymbol{w}_{i+1} = \boldsymbol{w}_{i} + \eta t_n\boldsymbol{x}_n

where $\eta$ is the learning rate.
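One update step for a single misclassified sample then looks like this (toy values again):

import numpy as np

eta = 0.1                 # learning rate
w = np.array([0.5, -1.0])
x_n = np.array([1., 2.])  # a sample with bias input x_0 = 1
t_n = 1                   # its correct label

if t_n * (w @ x_n) <= 0:  # update only while the sample is misclassified
    w = w + eta * t_n * x_n
print(w)  # [ 0.6 -0.8]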

Perceptron implementation in Python

Now let's actually implement it in Python. The data is the familiar iris classification dataset from scikit-learn. See below for a detailed description of the dataset.

First, since this is two-class classification, we specialize for that. Any data would do, but somewhat arbitrarily I picked the labels "versicolor" and "virginica", and selected "sepal length (cm)" and "petal width (cm)" as the features.

First, visualize the data.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

%matplotlib inline

from sklearn.datasets import load_iris

iris = load_iris()

df_iris = pd.DataFrame(iris.data, columns=iris.feature_names)
df_iris['target'] = iris.target_names[iris.target]

fig, ax = plt.subplots()

x1 = df_iris[df_iris['target']=='versicolor'].iloc[:,3].values  # petal width (cm)
y1 = df_iris[df_iris['target']=='versicolor'].iloc[:,0].values  # sepal length (cm)

x2 = df_iris[df_iris['target']=='virginica'].iloc[:,3].values
y2 = df_iris[df_iris['target']=='virginica'].iloc[:,0].values

ax.scatter(x1, y1, color='red', marker='o', label='versicolor')
ax.scatter(x2, y2, color='blue', marker='s', label='virginica')

ax.set_xlabel("petal width (cm)")
ax.set_ylabel("sepal length (cm)")
ax.legend()

plt.show()
perceptron_1.png

It looks like the classes can be separated somehow (then again, the data was chosen so that they would be).

Implementation of perceptron class

Implement the Perceptron class. The bias term is deliberately included inside the class.

class Perceptron:
  def __init__(self, eta=0.1, n_iter=1000):
    self.eta = eta        # learning rate
    self.n_iter = n_iter  # number of passes over the training data
    self.w = np.array([])

  def fit(self, x, y):
    # weights initialized to ones; one extra element for the bias term
    self.w = np.ones(len(x[0])+1)

    # prepend the bias input x_0 = 1 to every sample
    x = np.hstack([np.ones((len(x),1)), x])

    for _ in range(self.n_iter):
      for i in range(len(x)):
        # per-sample hinge loss; positive only when the sample is misclassified
        loss = np.max([0, -y[i] * self.w.T @ x[i]])
        if loss != 0:
          self.w += self.eta * y[i] * x[i]

  def predict(self, x):
    x = np.hstack([1., x])  # add the bias input
    return 1 if self.w.T @ x >= 0 else -1

  @property
  def w_(self):
    return self.w

The hinge loss is computed for each sample, and whenever a sample is misclassified the weights are updated by gradient descent. The loop stops after the specified number of iterations, but it could instead stop once the loss falls below a certain value.
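For example, fit could stop early once the total loss over one pass through the data falls below a tolerance. A sketch of such a variant (the tol parameter is my own addition, not part of the class above):

# a possible variant of fit with early stopping; tol is a hypothetical parameter
def fit(self, x, y, tol=1e-6):
  self.w = np.ones(len(x[0])+1)
  x = np.hstack([np.ones((len(x),1)), x])
  for _ in range(self.n_iter):
    total_loss = 0.0
    for i in range(len(x)):
      loss = np.max([0, -y[i] * self.w.T @ x[i]])
      if loss != 0:
        total_loss += loss
        self.w += self.eta * y[i] * x[i]
    if total_loss < tol:  # an (almost) loss-free pass means the data is separated
      break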

Actually classify

Let's feed the data into the class above, train it, and draw the decision boundary.

df = df_iris[df_iris['target']!='setosa']  # keep only versicolor and virginica
df = df.drop(df.columns[[1,2]], axis=1)    # keep sepal length and petal width
df['target'] = df['target'].map({'versicolor':1, 'virginica':-1})  # labels to +1/-1

x = df.iloc[:,0:2].values
y = df['target'].values

model = Perceptron()
model.fit(x, y)

# Draw the graph
fig, ax = plt.subplots()

x1 = df_iris[df_iris['target']=='versicolor'].iloc[:,3].values
y1 = df_iris[df_iris['target']=='versicolor'].iloc[:,0].values

x2 = df_iris[df_iris['target']=='virginica'].iloc[:,3].values
y2 = df_iris[df_iris['target']=='virginica'].iloc[:,0].values

ax.scatter(x1, y1, color='red', marker='o', label='versicolor')
ax.scatter(x2, y2, color='blue', marker='s', label='virginica')

ax.set_xlabel("petal width (cm)")
ax.set_ylabel("sepal length (cm)")

# Draw the classification boundary: w0 + w1*y + w2*x = 0  =>  y = -(w2/w1)x - w0/w1
w = model.w_
x_fig = np.linspace(1.,2.5,100)
y_fig = [-w[2]/w[1]*xi-w[0]/w[1] for xi in x_fig]
ax.plot(x_fig, y_fig)
ax.set_ylim(4.8,8.2)

ax.legend()

plt.show()
perceptron_2.png

It seems that virginica is classified correctly, but some versicolor points are not. Maybe that's just how it goes?
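To put a number on it, here's a quick sketch that measures training accuracy with the class above (predict handles one sample at a time):

preds = np.array([model.predict(xi) for xi in x])
print((preds == y).mean())  # fraction of correctly classified samples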

Try it with scikit-learn

df = df_iris[df_iris['target']!='setosa']
df = df.drop(df.columns[[1,2]], axis=1)
df['target'] = df['target'].map({'versicolor':1, 'virginica':-1})

x = df.iloc[:,0:2].values
y = df['target'].values

from sklearn.linear_model import Perceptron
model = Perceptron(max_iter=40, eta0=0.1)
model.fit(x,y)

# The graph drawing part is omitted
perceptron_3.png

Hmm, this time versicolor ends up misclassified in the opposite direction. The loss function may be slightly different, but I haven't verified it.
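To compare with the hand-made version, the learned weights and the training accuracy of scikit-learn's model can be checked via its coef_, intercept_, and score members (a minimal sketch):

print(model.coef_, model.intercept_)  # learned weights and bias term
print(model.score(x, y))              # mean accuracy on the training data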

Summary

I looked at the perceptron, the basis of classifiers. Since deep learning is a model built by combining a large number of perceptrons, understanding the perceptron will become all the more important.
