Machine learning algorithm (implementation of multi-class classification)

Introduction

Continuing from the algorithm taken up previously in "Machine Learning Classification", I will study it step by step: the theory, an implementation in Python, and analysis using scikit-learn. I'm writing this for my own learning, so please forgive any mistakes.

Last time, we extended two-class classification to multi-class classification. This time I will actually implement it in Python.

I referred to several other sites while writing this. Thank you very much.

Implementation policy

I would like to extend the logistic regression implemented earlier to multiple classes. I will try the following two methods:

One-vs-Rest
Multiclass softmax

Data used for classification

The Iris dataset is used for classification. It has four features (sepal_length, sepal_width, petal_length, petal_width), and the task is to classify samples into three classes (setosa, versicolor, virginica).

Below, we implement the classification using only sepal_length and sepal_width, so that the results are easy to visualize.

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

%matplotlib inline

from sklearn.datasets import load_iris

sns.set()
iris = sns.load_dataset("iris")
ax = sns.scatterplot(x=iris.sepal_length, y=iris.sepal_width,
                     hue=iris.species, style=iris.species)
(Figure: qiita_classifier_multi_1.png — scatter plot of the iris data by species)

One-vs-Rest

One-vs-Rest builds one two-class classifier per label class and, at prediction time, adopts the most plausible result. Since logistic regression outputs a probability, the class whose classifier returns the highest probability is adopted.
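
In code, the decision rule boils down to an argmax over the per-class probabilities; a minimal sketch, where classifiers and labels are placeholder names for what the class below will hold:

# Sketch of the One-vs-Rest decision rule (classifiers/labels are placeholders).
probas = [clf.predict_proba(x) for clf in classifiers]  # one probability per class
label = labels[np.argmax(probas)]                       # adopt the most probable class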

The LogisticRegression class below is a slightly modified version of the logistic regression code from last time. I added a predict_proba method, because One-vs-Rest decides which class to adopt based on the probability.

from scipy import optimize

class LogisticRegression:
  def __init__(self):
    self.w = None

  def sigmoid(self, a):
    # Logistic function; maps any real value into (0, 1)
    return 1.0 / (1 + np.exp(-a))

  def predict_proba(self, x):
    # Prepend the bias term and return P(class = 1 | x)
    x = np.hstack([1, x])
    return self.sigmoid(self.w.T @ x)

  def predict(self, x):
    # Binary decision; One-vs-Rest below uses predict_proba directly instead
    return 1 if self.predict_proba(x) >= 0.5 else -1

  def cross_entropy_loss(self, w, *args):
    def safe_log(x, minval=0.0000000001):
      # Clip to avoid log(0)
      return np.log(x.clip(min=minval))
    t, x = args
    loss = 0
    for i in range(len(t)):
      ti = 1 if t[i] > 0 else 0       # map labels to {0, 1}
      h = self.sigmoid(w.T @ x[i])
      loss += -ti*safe_log(h) - (1-ti)*safe_log(1-h)

    return loss/len(t)

  def grad_cross_entropy_loss(self, w, *args):
    t, x = args
    grad = np.zeros_like(w)
    for i in range(len(t)):
      ti = 1 if t[i] > 0 else 0
      h = self.sigmoid(w.T @ x[i])
      grad += (h - ti) * x[i]         # gradient of the cross-entropy loss

    return grad/len(t)

  def fit(self, x, y):
    # Add a bias column and minimize the loss with conjugate gradient
    w0 = np.ones(len(x[0])+1)
    x = np.hstack([np.ones((len(x),1)), x])

    self.w = optimize.fmin_cg(self.cross_entropy_loss, w0,
                              fprime=self.grad_cross_entropy_loss, args=(y, x))

  @property
  def w_(self):
    return self.w
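
As a quick sanity check (my addition, not in the original article), this binary classifier can be trained on a single "setosa vs. the rest" split; clf, x_bin and y_bin are just illustrative names:

# Hedged sanity check: train the binary classifier on "setosa vs. rest".
x_bin = iris[['sepal_length', 'sepal_width']].values
y_bin = np.where(iris.species == 'setosa', 1, 0)

clf = LogisticRegression()
clf.fit(x_bin, y_bin)
print(clf.predict_proba(x_bin[0]))  # estimated probability that the first sample is setosa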

Next, implement the One-vs-Rest class. I also implemented an accuracy_score method to measure how often the predictions are correct, since I will use it later to compare algorithms.

from sklearn.metrics import accuracy_score

class OneVsRest:
  def __init__(self, classifier, labels):
    # Build one binary classifier per label
    self.classifier = classifier
    self.labels = labels
    self.classifiers = [classifier() for _ in range(len(self.labels))]

  def fit(self, x, y):
    y = np.array(y)
    for i in range(len(self.labels)):
      # Train classifier i on "label i vs. everything else"
      y_ = np.where(y==self.labels[i], 1, 0)
      self.classifiers[i].fit(x, y_)

  def predict(self, x):
    # Return the index of the class with the highest probability
    probas = [self.classifiers[i].predict_proba(x) for i in range(len(self.labels))]
    return np.argmax(probas)

  def accuracy_score(self, x, y):
    # Map predicted indices back to label names and compare with the truth
    pred = [self.labels[self.predict(i)] for i in x]
    acc = accuracy_score(y, pred)
    return acc

Now let's actually classify the data from earlier.

model = OneVsRest(LogisticRegression, np.unique(iris.species))
x = iris[['sepal_length', 'sepal_width']].values
y = iris.species
model.fit(x, y)
print("accuracy_score: {}".format(model.accuracy_score(x,y)))

accuracy_score: 0.8066666666666666

An accuracy of 81% is not very good. Let's visualize how the data was classified, using matplotlib's contourf method: each grid point is colored according to the class it is assigned.

from matplotlib.colors import ListedColormap

x_min = iris.sepal_length.min()
x_max = iris.sepal_length.max()
y_min = iris.sepal_width.min()
y_max = iris.sepal_width.max()

# 100x100 grid over the feature ranges
x = np.linspace(x_min, x_max, 100)
y = np.linspace(y_min, y_max, 100)

# Predicted class index at each grid point
data = []
for i in range(len(y)):
  data.append([model.predict([x[j], y[i]]) for j in range(len(x))])

xx, yy = np.meshgrid(x, y)

cmap = ListedColormap(('blue', 'orange', 'green'))
plt.contourf(xx, yy, data, alpha=0.25, cmap=cmap)
ax = sns.scatterplot(x=iris.sepal_length, y=iris.sepal_width,
                     hue=iris.species, style=iris.species)
plt.show()
(Figure: qiita_classifier_multi_2.png — One-vs-Rest decision regions)

As you can see, setosa is separated cleanly, but the remaining two classes are mixed together, which explains the somewhat low accuracy. For now, this will have to do.

Multiclass softmax

Next, implement the LogisticRegressionMulti class, which performs softmax classification with logistic regression.

Cross-entropy is used as the loss function, and the parameters are found with the steepest-descent (plain gradient descent) method. The implementation is fairly rough, sorry.
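
For reference, a minimal statement of the standard formulas the class implements; the notation (X the design matrix with a bias column, W the weights, T the one-hot targets, P the softmax outputs, N the number of samples) is mine, not from the original text:

p_{nk} = \frac{\exp(a_{nk})}{\sum_j \exp(a_{nj})}, \qquad A = XW

E(W) = -\frac{1}{N} \sum_{n=1}^{N} \sum_{k} t_{nk} \log p_{nk}, \qquad \nabla_W E = \frac{1}{N} X^\top (P - T)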

from sklearn.metrics import accuracy_score

class LogisticRegressionMulti:
  def __init__(self, labels, n_iter=1000, eta=0.01):
    self.w = None
    self.labels = labels
    self.n_iter = n_iter   # number of gradient-descent iterations
    self.eta = eta         # learning rate
    self.loss = np.array([])

  def softmax(self, a):
    # Subtract the max before exponentiating for numerical stability
    if a.ndim==1:
      a = a - np.max(a)
      return np.exp(a)/np.sum(np.exp(a))
    else:
      a = a - np.max(a, axis=1)[:, np.newaxis]
      return np.exp(a)/np.sum(np.exp(a), axis=1)[:, np.newaxis]

  def cross_entropy_loss(self, w, *args):
    x, y = args
    def safe_log(x, minval=0.0000000001):
      # Clip to avoid log(0)
      return np.log(x.clip(min=minval))

    p = self.softmax(x @ w)
    loss = -np.sum(y*safe_log(p))

    return loss/len(x)

  def grad_cross_entropy_loss(self, w, *args):
    x, y = args

    p = self.softmax(x @ w)
    grad = -(x.T @ (y-p))   # gradient of the multiclass cross-entropy

    return grad/len(x)

  def fit(self, x, y):
    # One weight column per class, plus a bias row
    self.w = np.ones((len(x[0])+1, len(self.labels)))
    x = np.hstack([np.ones((len(x),1)), x])

    # Plain gradient descent, recording the loss history at each step
    for i in range(self.n_iter):
      self.loss = np.append(self.loss, self.cross_entropy_loss(self.w, x, y))
      grad = self.grad_cross_entropy_loss(self.w, x, y)
      self.w -= self.eta * grad

  def predict(self, x):
    # Prepend the bias term and return the index of the most probable class
    x = np.hstack([1, x])
    return np.argmax(self.softmax(x @ self.w))

  def accuracy_score(self, x, y):
    # y is one-hot encoded, so recover the class index with argmax
    pred = [self.predict(i) for i in x]
    y_ = np.argmax(y, axis=1)

    acc = accuracy_score(y_, pred)
    return acc

  @property
  def loss_(self):
    return self.loss

The input labels to LogisticRegressionMulti are one-hot encoded, which is easy with pandas' get_dummies. (Only after writing it did I think I should have called get_dummies inside the class.)
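
For example, get_dummies expands the species column into three indicator columns (recent pandas versions may show these as booleans rather than 0/1):

print(pd.get_dummies(iris['species']).head(3))
#    setosa  versicolor  virginica
# 0       1           0          0
# 1       1           0          0
# 2       1           0          0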

model = LogisticRegressionMulti(np.unique(iris.species), n_iter=10000, eta=0.1)
x = iris[['sepal_length', 'sepal_width']].values
y = pd.get_dummies(iris['species']).values
model.fit(x, y)
print("accuracy_score: {}".format(model.accuracy_score(x, y)))

accuracy_score: 0.8266666666666667

The accuracy is about 83%. Looking at the loss history, it appears to have converged, so this seems to be about the limit.
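
The loss-history plot below can be produced with a minimal sketch like this, using the loss_ property:

# Minimal sketch: plot the loss recorded during fit.
plt.plot(model.loss_)
plt.xlabel('iteration')
plt.ylabel('cross-entropy loss')
plt.show()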

(Figure: qiita_classifier_multi_3.png — loss history over iterations)

Let's also color the classification regions in the same way as before.
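
The plotting code is the same as before, except that predict now comes from LogisticRegressionMulti; a minimal sketch (gx and gy are my names, while x_min, x_max, y_min, y_max and cmap are reused from the earlier plot):

# Minimal sketch: decision regions of the softmax model on the same grid.
gx = np.linspace(x_min, x_max, 100)
gy = np.linspace(y_min, y_max, 100)
data = [[model.predict([gx[j], gy[i]]) for j in range(len(gx))] for i in range(len(gy))]

xx, yy = np.meshgrid(gx, gy)
plt.contourf(xx, yy, data, alpha=0.25, cmap=cmap)
sns.scatterplot(x=iris.sepal_length, y=iris.sepal_width,
                hue=iris.species, style=iris.species)
plt.show()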

(Figure: qiita_classifier_multi_4.png — softmax decision regions)

Compare with scikit-learn logistic regression

Finally, using all four features, we compare the classifiers created here with scikit-learn's LogisticRegression class.
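
The original comparison code is not shown; here is a sketch of how it might look (SkLogisticRegression is just an import alias to avoid the name clash, and the hyperparameters are my assumptions):

# Hedged sketch of the comparison using all four features (original code not shown).
from sklearn.linear_model import LogisticRegression as SkLogisticRegression

x_all = iris[['sepal_length', 'sepal_width', 'petal_length', 'petal_width']].values
y_labels = iris.species
y_onehot = pd.get_dummies(iris.species).values

ovr = OneVsRest(LogisticRegression, np.unique(iris.species))
ovr.fit(x_all, y_labels)
print("OneVsRest:", ovr.accuracy_score(x_all, y_labels))

multi = LogisticRegressionMulti(np.unique(iris.species), n_iter=10000, eta=0.1)
multi.fit(x_all, y_onehot)
print("LogisticRegressionMulti:", multi.accuracy_score(x_all, y_onehot))

sk = SkLogisticRegression()
sk.fit(x_all, y_labels)
print("sklearn LogisticRegression:", sk.score(x_all, y_labels))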

Method | accuracy_score
--- | ---
OneVsRest | 0.98
LogisticRegressionMulti | 0.98
sklearn LogisticRegression | 0.973

Even with this hand-rolled implementation, it seems you can get a decent score on a problem as simple as classifying irises.

Summary

I implemented multi-class classification using logistic regression. Other classifiers can probably be extended to multiple classes in the same way. Multiclass softmax in particular is the standard approach in neural networks, so I think understanding the theory behind it will be useful later on.
