○ The main points of this article
Notes on what I learned about the support vector machine (SVM).
Support vector machine:
・ A supervised learning algorithm that can be used for both classification and regression problems.
・ It may give better results than logistic regression.
・ It looks for a better decision boundary by maximizing the margin (the distance between the decision boundary and the training data closest to it).
・ There are hard margin and soft margin variants; which one is used is set with a hyperparameter.
Hard margin: does not allow any data to fall inside the margin. Because the boundary is fitted very tightly to the training data, it can lead to overfitting.
Soft margin: allows some data to fall inside the margin, which makes the boundary more flexible.
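To make the hard margin / soft margin point concrete, here is a minimal sketch, assuming scikit-learn's LinearSVC, whose hyperparameter C sets how strongly margin violations are penalized: a large C behaves more like a hard margin, a small C like a softer one. For a linear SVM the distance from the decision boundary to the margin edge is 1/||w||, where w is the learned weight vector. The helper name margin_width, the C values, and the fixed random_state are my own choices for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import LinearSVC

# Toy data of the same kind used in this article.
# random_state is fixed only so that the sketch is reproducible (an assumption).
centers = [(-1, -0.125), (0.5, 0.5)]
X, y = make_blobs(n_samples=50, n_features=2, centers=centers,
                  cluster_std=0.3, random_state=0)

def margin_width(C):
    """Hypothetical helper: fit LinearSVC with the given C and return 1 / ||w||."""
    model = LinearSVC(C=C, max_iter=100000, random_state=0)
    model.fit(X, y)
    return 1.0 / np.linalg.norm(model.coef_)

# Small C = softer margin, large C = closer to a hard margin
margins = {C: margin_width(C) for C in (0.01, 1.0, 100.0)}
for C, m in margins.items():
    print(f"C={C}: margin = {m:.3f}")
```

With a small C the margin should come out wider, and with a large C narrower, because the boundary is pushed to separate the training data as strictly as possible.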
Support vector machine
from sklearn.svm import LinearSVC
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
%matplotlib inline
#Data generation
centers = [(-1, -0.125), (0.5, 0.5)]
X, y = make_blobs(n_samples=50, n_features=2, centers=centers, cluster_std=0.3)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
#Model creation, training and evaluation
model = LinearSVC()
model.fit(X_train, y_train) #Training
y_pred = model.predict(X_test) #Prediction on the test data
print(y_test) #True labels of the test data
print(y_pred) #Labels predicted by the model
print(accuracy_score(y_test, y_pred)) #Accuracy (arguments are y_true, y_pred)
#Scatter plot of test data (both features, colored by true label)
fig, ax = plt.subplots()
ax.scatter(X_test[:, 0], X_test[:, 1], c=y_test, label='test data')
ax.legend()
Result
[1 0 0 1 0 0 0 1 1 0 0 1 1 1 0]
[1 0 0 1 0 0 0 1 1 0 0 1 1 1 0]
1.0
・ The true labels of the test data and the predicted labels match exactly, so the accuracy is 100%.
・ However, this does not show that the model is good: the data set is very small (50 samples, 15 of them used for testing). This is just test code.
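One way to get a less fragile estimate than a single 15-sample test split is to generate more data and use cross-validation. The following is only a sketch under my own assumptions (1000 samples, 5 folds, a fixed random_state, default hyperparameters); it also fits LogisticRegression on the same data, since the summary mentions it as a point of comparison. It makes no claim about which model wins in general.

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

# Larger version of the toy data set (sample size and seed are assumptions)
centers = [(-1, -0.125), (0.5, 0.5)]
X, y = make_blobs(n_samples=1000, n_features=2, centers=centers,
                  cluster_std=0.3, random_state=0)

models = {
    "LinearSVC": LinearSVC(max_iter=10000),
    "LogisticRegression": LogisticRegression(),
}

# 5-fold cross-validation; for classifiers the default score is accuracy
cv_scores = {}
for name, model in models.items():
    cv_scores[name] = cross_val_score(model, X, y, cv=5)
    print(name, cv_scores[name].mean(), cv_scores[name].std())
```

The mean shows the typical accuracy and the standard deviation shows how much it varies between folds, which a single small test split cannot tell you.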