Table of contents
Chapter 1: Linear Regression Model
[Chapter 2: Nonlinear Regression Model](https://qiita.com/matsukura04583/items/baa3f2269537036abc57)
[Chapter 3: Logistic Regression Model](https://qiita.com/matsukura04583/items/0fb73183e4a7a6f06aa5)
[Chapter 4: Principal Component Analysis](https://qiita.com/matsukura04583/items/b3b5d2d22189afc9c81c)
[Chapter 5: Algorithm 1 (k-nearest neighbor method (kNN))](https://qiita.com/matsukura04583/items/543719b44159322221ed)
[Chapter 6: Algorithm 2 (k-means)](https://qiita.com/matsukura04583/items/050c98c7bb1c9e91be71)
[Chapter 7: Support Vector Machine](https://qiita.com/matsukura04583/items/6b718642bcbf97ae2ca8)
y=w^Tx+b=\sum_{j=1}^{m} w_jx_j+b
The distance between the linear discriminant function and the closest data point is called the margin
The goal is to find the linear discriminant function that maximizes this margin (the optimization problem is written out below)
The margin depends on the parameters, and the SVM decision function (with class labels t_i and coefficients a_i obtained from the optimization) is
y(x) = w^T x + b = \sum_{i=1}^{n} a_i t_i x_i^T x + b
Support vector
Only the support vectors determine the separating hyperplane; the rest of the training data does not contribute to the decision function (their coefficients a_i are zero)
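Written out explicitly, "maximizing the margin" is equivalent to the following constrained optimization problem (hard-margin case, with class labels t_i taking the values -1 and +1); this is the standard formulation, stated here for completeness:

\min_{w,b}\ \frac{1}{2}\|w\|^2 \quad \text{subject to} \quad t_i(w^T x_i + b) \geq 1,\quad i = 1,\dots,n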
Soft margin SVM
Used when the samples cannot be separated by a straight line (hyperplane)
Errors (margin violations) are tolerated but penalized, so cases that are not linearly separable can still be handled
The decision boundary changes depending on the magnitude of the penalty parameter C (a code sketch comparing several C values appears after the versicolor vs. virginica example below)
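For the soft-margin case, slack variables (one per sample) measure how much each sample violates the margin, and C controls how strongly those violations are penalized. The standard formulation, added here for reference, is:

\min_{w,b,\xi}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i \quad \text{subject to} \quad t_i(w^T x_i + b) \geq 1 - \xi_i,\quad \xi_i \geq 0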
Kernel trick
Kernel function
Expresses the inner product of high-dimensional feature vectors as a scalar function of the original inputs
The computational cost can be kept down even when the feature space is high-dimensional
Separation using non-linear kernel
Non-linear separation is possible
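A typical example is the Gaussian (RBF) kernel; replacing the inner product in the decision function with a kernel k gives a non-linear classifier (standard form, shown here for reference):

k(x, x') = \exp\left(-\gamma \|x - x'\|^2\right), \qquad y(x) = \sum_{i=1}^{n} a_i t_i\, k(x_i, x) + b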
Lagrange's method of undetermined multipliers
Definition
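As a standard statement of the method applied to the hard-margin problem (my own summary, not a quote from the course material): introducing a multiplier a_i ≥ 0 for each constraint gives the Lagrangian below, and setting its derivatives with respect to w and b to zero recovers exactly the form of w used in the decision function above.

L(w, b, a) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{n} a_i \left\{ t_i (w^T x_i + b) - 1 \right\}

\frac{\partial L}{\partial w} = 0 \;\Rightarrow\; w = \sum_{i=1}^{n} a_i t_i x_i, \qquad \frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{n} a_i t_i = 0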
For support vector machines, the explanation I studied in "Machine learning starting with Raspberry Pi" was easy to understand, so I will write it up here, partly as a memo for myself.
#Import various modules you need first
from sklearn import datasets, svm
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
#Load iris data and store in variable iris
iris = datasets.load_iris()
#Store the set of features in the variable X and the target in the variable y
X = iris.data
y = iris.target
# Display the contents of X and y
print(X)
print(y)
result
[[5.1 3.5 1.4 0.2]
[4.9 3. 1.4 0.2]
[4.7 3.2 1.3 0.2]
[4.6 3.1 1.5 0.2]
[5. 3.6 1.4 0.2]
[5.4 3.9 1.7 0.4]
[4.6 3.4 1.4 0.3]
[5. 3.4 1.5 0.2]
[4.4 2.9 1.4 0.2]
[4.9 3.1 1.5 0.1]
[5.4 3.7 1.5 0.2]
[4.8 3.4 1.6 0.2]
[4.8 3. 1.4 0.1]
[4.3 3. 1.1 0.1]
[5.8 4. 1.2 0.2]
[5.7 4.4 1.5 0.4]
[5.4 3.9 1.3 0.4]
[5.1 3.5 1.4 0.3]
[5.7 3.8 1.7 0.3]
[5.1 3.8 1.5 0.3]
[5.4 3.4 1.7 0.2]
[5.1 3.7 1.5 0.4]
[4.6 3.6 1. 0.2]
[5.1 3.3 1.7 0.5]
[4.8 3.4 1.9 0.2]
[5. 3. 1.6 0.2]
[5. 3.4 1.6 0.4]
[5.2 3.5 1.5 0.2]
[5.2 3.4 1.4 0.2]
[4.7 3.2 1.6 0.2]
[4.8 3.1 1.6 0.2]
[5.4 3.4 1.5 0.4]
[5.2 4.1 1.5 0.1]
[5.5 4.2 1.4 0.2]
[4.9 3.1 1.5 0.2]
[5. 3.2 1.2 0.2]
[5.5 3.5 1.3 0.2]
[4.9 3.6 1.4 0.1]
[4.4 3. 1.3 0.2]
[5.1 3.4 1.5 0.2]
[5. 3.5 1.3 0.3]
[4.5 2.3 1.3 0.3]
[4.4 3.2 1.3 0.2]
[5. 3.5 1.6 0.6]
[5.1 3.8 1.9 0.4]
[4.8 3. 1.4 0.3]
[5.1 3.8 1.6 0.2]
[4.6 3.2 1.4 0.2]
[5.3 3.7 1.5 0.2]
[5. 3.3 1.4 0.2]
[7. 3.2 4.7 1.4]
[6.4 3.2 4.5 1.5]
[6.9 3.1 4.9 1.5]
[5.5 2.3 4. 1.3]
[6.5 2.8 4.6 1.5]
[5.7 2.8 4.5 1.3]
[6.3 3.3 4.7 1.6]
[4.9 2.4 3.3 1. ]
[6.6 2.9 4.6 1.3]
[5.2 2.7 3.9 1.4]
[5. 2. 3.5 1. ]
[5.9 3. 4.2 1.5]
[6. 2.2 4. 1. ]
[6.1 2.9 4.7 1.4]
[5.6 2.9 3.6 1.3]
[6.7 3.1 4.4 1.4]
[5.6 3. 4.5 1.5]
[5.8 2.7 4.1 1. ]
[6.2 2.2 4.5 1.5]
[5.6 2.5 3.9 1.1]
[5.9 3.2 4.8 1.8]
[6.1 2.8 4. 1.3]
[6.3 2.5 4.9 1.5]
[6.1 2.8 4.7 1.2]
[6.4 2.9 4.3 1.3]
[6.6 3. 4.4 1.4]
[6.8 2.8 4.8 1.4]
[6.7 3. 5. 1.7]
[6. 2.9 4.5 1.5]
[5.7 2.6 3.5 1. ]
[5.5 2.4 3.8 1.1]
[5.5 2.4 3.7 1. ]
[5.8 2.7 3.9 1.2]
[6. 2.7 5.1 1.6]
[5.4 3. 4.5 1.5]
[6. 3.4 4.5 1.6]
[6.7 3.1 4.7 1.5]
[6.3 2.3 4.4 1.3]
[5.6 3. 4.1 1.3]
[5.5 2.5 4. 1.3]
[5.5 2.6 4.4 1.2]
[6.1 3. 4.6 1.4]
[5.8 2.6 4. 1.2]
[5. 2.3 3.3 1. ]
[5.6 2.7 4.2 1.3]
[5.7 3. 4.2 1.2]
[5.7 2.9 4.2 1.3]
[6.2 2.9 4.3 1.3]
[5.1 2.5 3. 1.1]
[5.7 2.8 4.1 1.3]
[6.3 3.3 6. 2.5]
[5.8 2.7 5.1 1.9]
[7.1 3. 5.9 2.1]
[6.3 2.9 5.6 1.8]
[6.5 3. 5.8 2.2]
[7.6 3. 6.6 2.1]
[4.9 2.5 4.5 1.7]
[7.3 2.9 6.3 1.8]
[6.7 2.5 5.8 1.8]
[7.2 3.6 6.1 2.5]
[6.5 3.2 5.1 2. ]
[6.4 2.7 5.3 1.9]
[6.8 3. 5.5 2.1]
[5.7 2.5 5. 2. ]
[5.8 2.8 5.1 2.4]
[6.4 3.2 5.3 2.3]
[6.5 3. 5.5 1.8]
[7.7 3.8 6.7 2.2]
[7.7 2.6 6.9 2.3]
[6. 2.2 5. 1.5]
[6.9 3.2 5.7 2.3]
[5.6 2.8 4.9 2. ]
[7.7 2.8 6.7 2. ]
[6.3 2.7 4.9 1.8]
[6.7 3.3 5.7 2.1]
[7.2 3.2 6. 1.8]
[6.2 2.8 4.8 1.8]
[6.1 3. 4.9 1.8]
[6.4 2.8 5.6 2.1]
[7.2 3. 5.8 1.6]
[7.4 2.8 6.1 1.9]
[7.9 3.8 6.4 2. ]
[6.4 2.8 5.6 2.2]
[6.3 2.8 5.1 1.5]
[6.1 2.6 5.6 1.4]
[7.7 3. 6.1 2.3]
[6.3 3.4 5.6 2.4]
[6.4 3.1 5.5 1.8]
[6. 3. 4.8 1.8]
[6.9 3.1 5.4 2.1]
[6.7 3.1 5.6 2.4]
[6.9 3.1 5.1 2.3]
[5.8 2.7 5.1 1.9]
[6.8 3.2 5.9 2.3]
[6.7 3.3 5.7 2.5]
[6.7 3. 5.2 2.3]
[6.3 2.5 5. 1.9]
[6.5 3. 5.2 2. ]
[6.2 3.4 5.4 2.3]
[5.9 3. 5.1 1.8]]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2]
We can confirm that there are 150 data samples, each with 4 features. y holds the classification target, which also has 150 entries.
print(X.shape)
print(y.shape)
result
(150, 4)
(150,)
The target is restricted to just two classes: iris virginica (2) is excluded, leaving only iris setosa (0) and iris versicolor (1).
# Use only the first two features, sepal length and sepal width
# (so that we can think in two dimensions)
X = X[:,:2]
# Keep only the samples whose target is not 2 (iris virginica),
# i.e. the two classes iris setosa (0) and iris versicolor (1)
# (so that the area is divided into two)
X = X[y!=2]
y = y[y!=2]
# Prepare a support vector machine for classification (Support Vector Classifier)
clf = svm.SVC(C=1.0, kernel='linear')
# Fit (optimize) the classifier to the data
clf.fit(X, y)
##### Display the classification result by color-coding the background
# Use the range from the minimum to the maximum of sepal length and
# sepal width, expanded by 1 on each side, as the graph display area
# (this is just a display trick)
x_min = min(X[:,0]) - 1
x_max = max(X[:,0]) + 1
y_min = min(X[:,1]) - 1
y_max = max(X[:,1]) + 1
# Divide the graph display area into a 500 x 500 grid
# (to color the background according to the predicted class)
XX, YY = np.mgrid[x_min:x_max:500j, y_min:y_max:500j]
# Rearrange the grid points into an input array for scikit-learn
Xg = np.c_[XX.ravel(), YY.ravel()]
# Predict the class (0 or 1) of each grid point and store it in Z
Z = clf.predict(Xg)
# Reshape Z back onto the grid
Z = Z.reshape(XX.shape)
# Class 0 (iris setosa) is light orange (1, 0.93, 0.5, 1)
# Class 1 (iris versicolor) is light blue (0.5, 1, 1, 1)
cmap01 = ListedColormap([(0.5, 1, 1, 1), (1, 0.93, 0.5, 1)])
#Show background color
plt.pcolormesh(XX, YY, Z==0, cmap=cmap01)
#Set axis label
plt.xlabel('sepal length')
plt.ylabel('sepal width')
##### Plot the data points, colored according to the target
# Extract only the iris setosa (y=0) data
Xc0 = X[y==0]
# Extract only the iris versicolor (y=1) data
Xc1 = X[y==1]
#Plot iris setosa data Xc0
plt.scatter(Xc0[:,0], Xc0[:,1], c='#E69F00', linewidths=0.5, edgecolors='black')
#Plot iris versicolor data Xc1
plt.scatter(Xc1[:,0], Xc1[:,1], c='#56B4E9', linewidths=0.5, edgecolors='black')
# Get the support vectors
SV = clf.support_vectors_
# Visualize the support vectors
# plt.scatter(SV[:, 0], SV[:, 1],
#             s=100, facecolors='none', edgecolors='k')
# Draw a red border around the support vector points
#plt.scatter(SV[:,0], SV[:,1], c=(0,0,0,0), linewidths=1.0, edgecolors='red')
plt.scatter(SV[:,0], SV[:,1], c='black', linewidths=1.0, edgecolors='red')
#Display the drawn graph
plt.show()
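As a quick check of the fitted model (this is my own addition, not part of the book's walkthrough; clf, X and y are the objects defined above), scikit-learn lets us inspect the number of support vectors, compute the training accuracy, and predict a new point:
# Number of support vectors found for each class
print(clf.n_support_)
# Mean accuracy on the training data (the 2-feature, 2-class subset used above)
print(clf.score(X, y))
# Predict the class of a hypothetical flower: sepal length 5.0 cm, sepal width 3.0 cm
print(clf.predict([[5.0, 3.0]]))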
Next, the target is restricted to the other pair: iris setosa (0) is excluded, leaving only iris versicolor (1) and iris virginica (2).
# Reload the data and set things up for the next case
# Load the iris data and store it in the variable iris
iris = datasets.load_iris()
#Store the set of features in the variable X and the target in the variable y
X = iris.data
y = iris.target
# Use only the first two features, sepal length and sepal width
# (so that we can think in two dimensions)
X = X[:,:2]
# Keep only the samples whose target is not 0 (iris setosa),
# i.e. the two classes iris versicolor (1) and iris virginica (2)
# (so that the area is divided into two)
X = X[y!=0]
y = y[y!=0]
#Prepare a support vector machine for classification
clf = svm.SVC(C=1.0, kernel='linear')
# Fit (optimize) the classifier to the data
clf.fit(X, y)
##### Display the classification result by color-coding the background
# Use the range from the minimum to the maximum of sepal length and
# sepal width, expanded by 1 on each side, as the graph display area
x_min = min(X[:,0]) - 1
x_max = max(X[:,0]) + 1
y_min = min(X[:,1]) - 1
y_max = max(X[:,1]) + 1
# Divide the graph display area into a 500 x 500 grid
# (to color the background according to the predicted class)
XX, YY = np.mgrid[x_min:x_max:500j, y_min:y_max:500j]
# Rearrange the grid points into an input array for scikit-learn
Xg = np.c_[XX.ravel(), YY.ravel()]
# Predict the class (1 or 2) of each grid point and store it in Z
Z = clf.predict(Xg)
# Reshape Z back onto the grid
Z = Z.reshape(XX.shape)
# The background color is chosen according to the class
# Class 1 (iris versicolor) is light blue (0.5, 1, 1, 1)
# Class 2 (iris virginica) is light green (0.5, 0.75, 0.5, 1)
cmap12 = ListedColormap([(0.5, 0.75, 0.5, 1), (0.5, 1, 1, 1)])
# Paint the background as a two-color canvas of light blue and light green
plt.pcolormesh(XX, YY, Z==1, cmap=cmap12)
# Set the axis labels (unfortunately, Japanese cannot be used)
plt.xlabel('sepal length')
plt.ylabel('sepal width')
##### Plot the data points, colored according to the target
# Extract only the iris versicolor (y=1) data
Xc1 = X[y==1]
# Extract only the iris virginica (y=2) data
Xc2 = X[y==2]
#Plot iris versicolor data Xc1
plt.scatter(Xc1[:,0], Xc1[:,1], c='#56B4E9',linewidth=0.5, edgecolors='black')
#Plot iris virginica data Xc2
plt.scatter(Xc2[:,0], Xc2[:,1], c='#008000',linewidth=0.5, edgecolors='black')
# Get the support vectors
SV = clf.support_vectors_
# Draw a red border around the support vector points
plt.scatter(SV[:,0], SV[:,1], c='black', linewidths=1.0, edgecolors='red')
#Display the drawn graph
plt.show()
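The earlier note that the decision boundary changes with the parameter C can be checked on this versicolor/virginica data, which is hard to separate with a straight line. The following sketch is my own addition (it reuses the X and y defined just above, and the C values are arbitrary choices): a small C tolerates more margin violations, a large C penalizes them more strongly, which shifts the boundary and changes the set of support vectors.
# Compare soft-margin SVMs with different penalty parameters C
for C in [0.01, 1.0, 100.0]:
    clf_c = svm.SVC(C=C, kernel='linear')
    clf_c.fit(X, y)
    print('C =', C,
          '  support vectors:', len(clf_c.support_vectors_),
          '  training accuracy:', clf_c.score(X, y))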
Now let's keep the same two features but increase the number of classes from 2 to 3, classifying iris setosa (0), iris versicolor (1), and iris virginica (2). By default, scikit-learn handles this with the one-vs-the-rest (ovr) method.
#Load iris data and store in variable iris
iris = datasets.load_iris()
#Store the set of features in the variable X and the target in the variable y
X = iris.data
y = iris.target
# Use only the first two features, sepal length and sepal width
# (so that we can think in two dimensions)
X = X[:,:2]
#Prepare a support vector machine for classification
clf = svm.SVC(C=1.0, kernel='linear', decision_function_shape='ovr')
# If gamma='auto' is specified, gamma is set to 1/(number of features); here 1/2 = 0.5
#clf = svm.SVC(C=1.0, kernel='rbf', gamma='auto', decision_function_shape='ovr')
# Increasing gamma makes the decision boundary bend more sharply (larger curvature)
#clf = svm.SVC(C=1.0, kernel='rbf', gamma=1.0, decision_function_shape='ovr')
# Fit (optimize) the classifier to the data
clf.fit(X, y)
##### Display the classification result by color-coding the background
# Use the range from the minimum to the maximum of sepal length and
# sepal width, expanded by 1 on each side, as the graph display area
x_min = min(X[:,0]) - 1
x_max = max(X[:,0]) + 1
y_min = min(X[:,1]) - 1
y_max = max(X[:,1]) + 1
# Divide the graph display area into a 500 x 500 grid
# (to color the background according to the predicted class)
XX, YY = np.mgrid[x_min:x_max:500j, y_min:y_max:500j]
# Rearrange the grid points into an input array for scikit-learn
Xg = np.c_[XX.ravel(), YY.ravel()]
# Predict the class (0 to 2) of each grid point and store it in Z
Z = clf.predict(Xg)
# Reshape Z back onto the grid
Z = Z.reshape(XX.shape)
# Class 0 (iris setosa) is light orange (1, 0.93, 0.5, 1)
# Class 1 (iris versicolor) is light blue (0.5, 1, 1, 1)
# Class 2 (iris virginica) is light green (0.5, 0.75, 0.5, 1)
cmap0 = ListedColormap([(0, 0, 0, 0), (1, 0.93, 0.5, 1)])
cmap1 = ListedColormap([(0, 0, 0, 0), (0.5, 1, 1, 1)])
cmap2 = ListedColormap([(0, 0, 0, 0), (0.5, 0.75, 0.5, 1)])
#Show background color
plt.pcolormesh(XX, YY, Z==0, cmap=cmap0)
plt.pcolormesh(XX, YY, Z==1, cmap=cmap1)
plt.pcolormesh(XX, YY, Z==2, cmap=cmap2)
#Set axis label
plt.xlabel('sepal length')
plt.ylabel('sepal width')
##### Plot the data points, colored according to the target
# Extract only the iris setosa (y=0) data
Xc0 = X[y==0]
# Extract only the iris versicolor (y=1) data
Xc1 = X[y==1]
# Extract only the iris virginica (y=2) data
Xc2 = X[y==2]
#Plot iris setosa data Xc0
plt.scatter(Xc0[:,0], Xc0[:,1], c='#E69F00', linewidths=0.5, edgecolors='black')
#Plot iris versicolor data Xc1
plt.scatter(Xc1[:,0], Xc1[:,1], c='#56B4E9', linewidths=0.5, edgecolors='black')
#Plot iris virginica data Xc2
plt.scatter(Xc2[:,0], Xc2[:,1], c='#008000', linewidths=0.5, edgecolors='black')
# Get the support vectors
SV = clf.support_vectors_
# Draw a red border around the support vector points
plt.scatter(SV[:,0], SV[:,1], c=(0,0,0,0), linewidths=1.0, edgecolors='red')
#plt.scatter(SV[:,0], SV[:,1], c='black', linewidths=1.0, edgecolors='red')
#Display the drawn graph
plt.show()
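To see the effect of the rbf kernel and gamma mentioned in the comments above, here is a small sketch of my own (it reuses the 3-class X and y from this section; the gamma values are just illustrative choices): it fits a linear model and two rbf models and compares their training accuracy. A higher training accuracy at large gamma does not necessarily mean better generalization, so a held-out test set would be needed for a fair comparison.
# Compare the linear kernel with rbf kernels at different gamma values
for params in [{'kernel': 'linear'},
               {'kernel': 'rbf', 'gamma': 0.5},
               {'kernel': 'rbf', 'gamma': 10.0}]:
    clf_k = svm.SVC(C=1.0, decision_function_shape='ovr', **params)
    clf_k.fit(X, y)
    print(params, ' training accuracy:', clf_k.score(X, y))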
(Reference) How to use colors in plt.scatter scatter plots. I get a small warning about making the red circles transparent, so I will look into it at a later date (a possible fix is sketched below).
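A possible way to avoid that warning (untested here, and based on the commented-out alternative earlier in this post rather than on the book): pass facecolors='none' instead of a single RGBA tuple for c, which draws hollow markers.
# Hollow red circles around the support vectors, avoiding the single-RGBA-tuple warning
plt.scatter(SV[:,0], SV[:,1], s=100, facecolors='none', edgecolors='red', linewidths=1.0)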
Related Sites
Chapter 1: Linear Regression Model
[Chapter 2: Nonlinear Regression Model](https://qiita.com/matsukura04583/items/baa3f2269537036abc57)
[Chapter 3: Logistic Regression Model](https://qiita.com/matsukura04583/items/0fb73183e4a7a6f06aa5)
[Chapter 4: Principal Component Analysis](https://qiita.com/matsukura04583/items/b3b5d2d22189afc9c81c)
[Chapter 5: Algorithm 1 (k-nearest neighbor method (kNN))](https://qiita.com/matsukura04583/items/543719b44159322221ed)
[Chapter 6: Algorithm 2 (k-means)](https://qiita.com/matsukura04583/items/050c98c7bb1c9e91be71)
[Chapter 7: Support Vector Machine](https://qiita.com/matsukura04583/items/6b718642bcbf97ae2ca8)