Multi-class SVM with scikit-learn

Purpose

scikit-learn's SVM (`SVC`) uses the one-versus-one strategy for multi-class classification. However, one-versus-the-rest is often reported to discriminate better, so this is a note on how to do **multi-class SVM classification in one-versus-the-rest** using `OneVsRestClassifier` from `sklearn.multiclass`. (Note: `LinearSVC` uses one-versus-the-rest by default.)

One-versus-the-rest and One-versus-one

Consider a $K$-class classification problem.

**One-versus-the-rest**: trains $K$ classifiers; each solves the two-class problem of deciding whether a sample belongs to one specific class or to any of the other $K-1$ classes.

**One-versus-one**: trains $K(K-1)/2$ classifiers; each solves the two-class problem of deciding between one specific class and another specific class.
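For the 10-class digits problem used below, these counts work out as follows (a quick sketch, not part of the original post):

```python
# Number of binary classifiers each strategy trains for a K-class problem
K = 10  # the digits dataset below has 10 classes

one_vs_rest = K                # one "class k vs. everything else" classifier per class
one_vs_one = K * (K - 1) // 2  # one classifier per unordered pair of classes

print(one_vs_rest)  # 10
print(one_vs_one)   # 45
```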

Multi-class SVM

Using the digits dataset, we perform 10-class classification of handwritten digits with an RBF-kernel SVM.

Package import

```python
from sklearn.datasets import load_digits
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in modern scikit-learn
from sklearn.metrics import accuracy_score
```

Data reading

```python
digits = load_digits()
train_x, test_x, train_y, test_y = train_test_split(digits.data, digits.target)
```

Hyperparameter settings

```python
C = 1.
kernel = 'rbf'
gamma = 0.01
```

Identification by One-versus-the-rest

```python
estimator = SVC(C=C, kernel=kernel, gamma=gamma)
classifier = OneVsRestClassifier(estimator)
classifier.fit(train_x, train_y)
pred_y = classifier.predict(test_x)
```
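As a sanity check (a sketch, not in the original post), the fitted `OneVsRestClassifier` exposes its per-class binary SVCs via `estimators_`, and its `decision_function` returns one score per class:

```python
# Inspect a fitted OneVsRestClassifier on the same digits setup as above
# (random_state=0 is an assumption added here for reproducibility)
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

digits = load_digits()
train_x, test_x, train_y, test_y = train_test_split(
    digits.data, digits.target, random_state=0)

clf = OneVsRestClassifier(SVC(C=1., kernel='rbf', gamma=0.01))
clf.fit(train_x, train_y)

print(len(clf.estimators_))                 # 10: one binary SVC per class
print(clf.decision_function(test_x).shape)  # (450, 10): one score per class
```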

Identification by one-versus-one (default)

```python
classifier2 = SVC(C=C, kernel=kernel, gamma=gamma)
classifier2.fit(train_x, train_y)
pred_y2 = classifier2.predict(test_x)
```
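Incidentally, a plain `SVC` also accepts `decision_function_shape='ovr'`, but that only reshapes the reported decision scores; the training itself is still one-versus-one. A short sketch (assumptions: same digits setup, `random_state=0`):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()
train_x, test_x, train_y, test_y = train_test_split(
    digits.data, digits.target, random_state=0)

ovo = SVC(C=1., kernel='rbf', gamma=0.01,
          decision_function_shape='ovo').fit(train_x, train_y)
ovr = SVC(C=1., kernel='rbf', gamma=0.01,
          decision_function_shape='ovr').fit(train_x, train_y)

print(ovo.decision_function(test_x).shape)  # (450, 45): one score per class pair
print(ovr.decision_function(test_x).shape)  # (450, 10): scores aggregated per class
```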

Identification result

```python
print('One-versus-the-rest: {:.5f}'.format(accuracy_score(test_y, pred_y)))
print('One-versus-one: {:.5f}'.format(accuracy_score(test_y, pred_y2)))
```

```
One-versus-the-rest: 0.95333
One-versus-one: 0.79111
```

With these hyperparameters, one-versus-the-rest shows clearly higher discrimination performance.
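The size of this gap depends strongly on `C` and `gamma`; a small grid search (a sketch added here, not part of the original experiment) shows how to tune them for the one-versus-the-rest setup, using the `estimator__` prefix to reach the nested SVC's parameters:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

digits = load_digits()
train_x, test_x, train_y, test_y = train_test_split(
    digits.data, digits.target, random_state=0)

# Parameters of the wrapped SVC are addressed as estimator__<name>
param_grid = {'estimator__C': [0.1, 1., 10.],
              'estimator__gamma': [0.001, 0.01]}
search = GridSearchCV(OneVsRestClassifier(SVC(kernel='rbf')),
                      param_grid, cv=3)
search.fit(train_x, train_y)

print(search.best_params_)
print(search.score(test_x, test_y))
```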

Related Links

pylearn2.models.svm (sklearn wrapper)
sklearn.multiclass.OneVsRestClassifier
Ex. sklearn.multiclass.OneVsRestClassifier
sklearn.svm
