Find out about SVM

This memo, still unorganized, records my investigation of how scikit-learn's SVM relates to other SVM implementations. I am drawn to the convenience of scikit-learn, but since I also want to use SVMs from C++, I am surveying other implementations as well.

Deep learning has recently become the hot topic in machine learning, but Support Vector Machines (SVMs) remain capable and easy to use; I don't think they have lost their shine.

** The parts that do not depend on which implementation you use **

Master SVM! 8 checkpoints http://qiita.com/pika_shi/items/5e59bcf69e85fdd9edb2

Support vector machines and other machine learning techniques http://qiita.com/ynakayama/items/afa2212cf561f2067606

Sufficient knowledge of the pre-processing needed to make good use of SVMs and of cross-validation, neither of which depends on the implementation, is important for success with machine learning.

If you are comfortable reading English, the following document is recommended: A Practical Guide to Support Vector Classification

Principal component analysis (PCA) is sometimes applied before SVM. Its characteristics are:

  1. Since the eigenvectors obtained from principal component analysis are orthogonal, the resulting coefficients form a combination of uncorrelated variables. Building a model from correlated variables tends to cause unintended behavior, whereas orthogonal, uncorrelated parameters are easier to handle (see the note below).

  2. It shows quantitatively what kinds of variation the actual data contains and how much of that variation is explained by the eigenvectors of each order. As a result, the components that can in practice be ignored become clear, and the phenomenon can be captured by focusing only on the components that matter.

Principal component analysis is therefore used for dimensionality reduction. What makes dimensionality reduction interesting is that training on reduced data can give better machine learning results than training on the raw data.

In Machine Learning with Scikit Learn (Part I), using the original image data as-is yields only about 40% accuracy, but reducing the images to eigenfaces before classification raises the accuracy to about 85%.

You can see what the dimensionality reduction preserved by reprojecting the reduced data back into the original space. Examples appear as images in the following articles about PCA: Using PCA of sklearn.decomposition, How to do PCA with Python
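As a minimal sketch of that reprojection with scikit-learn (the array `X` of shape (n_samples, n_features) is an assumption):

```python
from sklearn.decomposition import PCA

# Compress X down to 2 dimensions, then map the result back
# into the original feature space to see what the compression kept.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)             # shape: (n_samples, 2)
X_approx = pca.inverse_transform(X_reduced)  # shape: (n_samples, n_features)
```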

In scikit-learn's PCA, the number of dimensions after reduction is specified with n_components, as in PCA(n_components=2).

An article implementing PCA with numpy: Code for principal component analysis with Python

In OpenCV's Python bindings, PCA is provided by the cv2.PCACompute function: [cv2.PCACompute](http://docs.opencv.org/3.1.0/d3/d8d/classcv_1_1PCA.html#gsc.tab=0) cv2.PCACompute(data[, mean[, eigenvectors[, maxComponents]]]) -> mean, eigenvectors. The number of dimensions after reduction is specified with maxComponents.
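A minimal sketch of using it, with random data standing in for real features (note that the exact signature differs between OpenCV 2.x and 3.x; the call below follows the 3.x style):

```python
import numpy as np
import cv2

# Rows are samples, columns are features; OpenCV expects float32.
data = np.random.rand(100, 10).astype(np.float32)

# Keep the first 3 principal components.
mean, eigenvectors = cv2.PCACompute(data, mean=None, maxComponents=3)

projected = cv2.PCAProject(data, mean, eigenvectors)               # dimension-reduced data
reconstructed = cv2.PCABackProject(projected, mean, eigenvectors)  # back to the original space
```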

Articles using PCA from OpenCV/C++: Principal component analysis of images, [I tried principal component analysis with OpenCV](http://www.yasutomo57jp.com/2010/10/26/opencv%E3%81%A7%E4%B8%BB%E6%88%90%E5%88%86%E5%88%86%E6%9E%90%E3%82%92%E3%81%97%E3%81%A6%E3%81%BF%E3%81%9F/)

Implementation of Principal Component Analysis (PCA) by OpenCV

[libpca C++ library: A C++ library for principal component analysis](https://sourceforge.net/projects/libpca/)

The coefficients along the eigenvectors obtained this way become smaller in absolute value as the order increases. Left untouched, features with small absolute values are not fully utilized by the SVM. For SVMs, therefore, the input data is standardized so that all features have a uniform scale.
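A minimal sketch of that standardization with scikit-learn (the arrays `X_train` and `X_test` are assumptions):

```python
from sklearn.preprocessing import StandardScaler

# Standardize each feature to zero mean and unit variance.
scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)
# Reuse the statistics estimated on the training data for the test data.
X_test_std = scaler.transform(X_test)
```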

In machine learning it is common to fit principal component analysis once and then compute the projection of other data with the fitted result, using that projection as the input to the learner. Make sure your framework can save and load the fitted PCA result so that later data can be projected with it.
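One way to do this with scikit-learn and pickle, as a sketch (the file name `pca.pkl` and the arrays `X` and `X_new` are assumptions):

```python
import pickle
from sklearn.decomposition import PCA

# Fit PCA once on the reference data.
pca = PCA(n_components=3)
pca.fit(X)

# Save the fitted result ...
with open('pca.pkl', 'wb') as f:
    pickle.dump(pca, f)

# ... then later load it and project new data onto the same axes.
with open('pca.pkl', 'rb') as f:
    pca2 = pickle.load(f)
X_new_projected = pca2.transform(X_new)
```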

To understand what linear algebra principal component analysis actually performs, the explanation of PCA in "Practical Computer Vision" is easy to follow. If you only call scikit-learn or OpenCV code, the underlying linear algebra stays hidden. For PCA where the number of samples is small compared to the feature dimension, it becomes easy to see that the memory used for the matrix computation is determined by the number of samples. Accumulating this kind of understanding helps when solving real problems.

It is also recommended to inspect the eigenvectors (eigenimages, in the case of images) when performing principal component analysis. Understanding what kinds of variation they capture can give hints on how to set up the machine learning, and it helps you understand what the distribution of the training data looks like.

** PCA Library **

| | scikit-learn | OpenCV | Other |
|---|---|---|---|
| URL | `sklearn.decomposition.PCA()`<br>http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html | `cv2.PCACompute()`<br>http://docs.opencv.org/2.4.12/modules/core/doc/operations_on_arrays.html?highlight=pca#cv2.PCACompute | Implementation example with numpy<br>http://www.yasutomo57jp.com/2012/02/24/python%E3%81%A7pca/ |
| Projection | `pca.transform(X)` | `pcacompute.project(X)` | |
| PCA on images | Eigenfaces<br>http://scikit-learn.org/stable/auto_examples/decomposition/plot_faces_decomposition.html | Principal component analysis of images<br>http://suzuichibolgpg.blog.fc2.com/blog-entry-62.html | Font PCA example<br>http://www.oreilly.co.jp/pub/9784873116075/ |
| Iris data | PCA example with Iris Data-set<br>http://scikit-learn.org/stable/auto_examples/decomposition/plot_pca_iris.html | | |
| Dimensionality reduction example | `from sklearn import decomposition`<br>`pca = decomposition.PCA(n_components=3)`<br>`pca.fit(X)`<br>`pca.transform(X)` | `cv2.PCACompute(data[, mean[, eigenvectors[, maxComponents]]]) -> mean, eigenvectors` | |

** SVM library **

Below are SVM libraries that can be used in a Python environment. Choose among them according to the work at hand.

・scikit-learn: an environment with a rich set of machine learning libraries. The Python interface stays largely the same even when the learning algorithm is changed, which makes trial and error convenient.
・libSVM: the libSVM developers publish a Python binding. It supports sparse training data.
・OpenCV: convenient if you plan to move to the OpenCV/C++ environment later. However, the OpenCV-Python binding interface changes often, and the documentation may not keep up.
・dlib: dlib also contains an SVM implementation, usable from both C++ and Python. http://dlib.net/python/#dlib.svm_c_trainer_histogram_intersection

LIBLINEAR: A Library for Large Linear Classification

The following table is under construction

| | scikit-learn | libSVM | OpenCV |
|---|---|---|---|
| URL | http://scikit-learn.org/stable/ | https://www.csie.ntu.edu.tw/~cjlin/libsvm/ | http://opencv.org/ |
| Module loading | `from sklearn import svm` | `from svm import *` | `import cv2` |
| Classifier creation | `clf = svm.SVC(C=2**2, gamma=2**-11)` | `m = svm_train(prob, param)` | `svm = cv2.SVM()` |
| Training | `clf.fit(x, y)` | `m = svm_train(prob, param)` | `svm.train(x, y)` |
| Prediction | `clf.predict(x)` | `result = svm_predict(test_label, test_data, t)` | `result = svm.predict_all(testData)` |
| Multi-class | one-against-one | one-against-one | one-against-one |
| Save model | `s = pickle.dumps(clf)` | `svmutil.svm_save_model('libsvm.model', m)` | `svm.save("/path/to/model.xml")` |
| Load model | `clf2 = pickle.loads(s)` | `m = svmutil.svm_load_model('libsvm.model')` | `svm2 = cv2.SVM(); svm2.load("/path/to/model.xml")` |
| Probability estimates | Yes: `classifier = svm.SVC(gamma=0.001, probability=True); classifier.fit(X, Y); predicted_prob = classifier.predict_proba(Xnew)` | Yes | |
| Iris data example | http://scikit-learn.org/stable/auto_examples/svm/plot_iris.html | Data location: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#iris | |
| Usable from C/C++ | No | Yes | Yes |
| License | BSD license (commercially usable) | modified BSD license | 3-clause BSD License |

** Articles using the scikit-learn implementation **

Since scikit-learn is a rich machine learning library, it is easy to get started with its SVM.

scikit-learn [Recognizing hand-written digits](http://scikit-learn.org/stable/auto_examples/classification/plot_digits_classification.html#example-classification-plot-digits-classification-py)
scikit-learn [1.4. Support Vector Machines](http://scikit-learn.org/stable/modules/svm.html)
scikit-learn [Plot different SVM classifiers in the iris dataset](http://scikit-learn.org/stable/auto_examples/svm/plot_iris.html)

The sample code on the blog may be written from a different perspective than the sample code on the scikit-learn site.

Python: Try to classify scikit-learn handwritten digit dataset by SVM http://blog.amedama.jp/entry/2016/01/03/143258

scikit-learn Recognizing hand-written digits

Multi-class SVM with scikit-learn http://qiita.com/sotetsuk/items/3a5718bb1f945a383ceb

Try multi-class classification using scikit-learn SVM http://minus9d.hatenablog.com/entry/2015/04/19/190732

[Python: Try to classify Iris datasets with support vector machines](http://momijiame.tumblr.com/post/114751531866/python-iris-%E3%83%87%E3%83%BC%E3%82%BF%E3%82%BB%E3%83%83%E3%83%88%E3%82%92%E3%82%B5%E3%83%9D%E3%83%BC%E3%83%88%E3%83%99%E3%82%AF%E3%82%BF%E3%83%BC%E3%83%9E%E3%82%B7%E3%83%B3%E3%81%A7%E5%88%86%E9%A1%9E%E3%81%97%E3%81%A6%E3%81%BF%E3%82%8B)

scikit-learn Plot different SVM classifiers in the iris dataset

The documentation states that n-class classification is implemented one-against-one: scikit-learn [SVC and NuSVC implement the "one-against-one" approach (Knerr et al., 1990)](http://scikit-learn.org/stable/modules/svm.html)

A one-against-one implementation of n-class classification seems to make accuracy easier to achieve. A one-against-rest implementation tends to narrow the boundary regions that separate the classes in the high-dimensional space.

How to save and load a model is described in [scikit-learn Model persistence](http://scikit-learn.org/stable/modules/model_persistence.html).

The method shown there, pickling an instance of the classifier, requires that the same version of scikit-learn be used for saving and loading.

```python
import pickle
s = pickle.dumps(clf)
clf2 = pickle.loads(s)
```

The following article introduces other interesting learning techniques besides SVM. Just looking at the images is quite fun. Machine Learning with Scikit Learn (Part I)

Principal component analysis and SVM appear to work well together, and I found the following example: scikit-learn [Faces recognition example using eigenfaces and SVMs](http://scikit-learn.org/stable/auto_examples/applications/face_recognition.html)
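The shape of that example, reduced to a sketch (the arrays `X_train`, `X_test`, `y_train` are assumptions, and the `n_components` and SVC parameters below are illustrative; the scikit-learn example chooses them by grid search):

```python
from sklearn.decomposition import PCA
from sklearn.svm import SVC

# X_train/X_test hold flattened face images, y_train the person labels.
pca = PCA(n_components=150, whiten=True)   # reduce to 150 eigenfaces
X_train_pca = pca.fit_transform(X_train)
X_test_pca = pca.transform(X_test)

clf = SVC(kernel='rbf', C=1000.0, gamma=0.0001)
clf.fit(X_train_pca, y_train)
predicted = clf.predict(X_test_pca)
```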

** Articles using libSVM **

If the algorithm you need has been narrowed down to an SVM and you plan to reimplement in C/C++, you might consider using libSVM, a proven library, directly. libSVM also supports multi-class learning, and it can train even when the features are sparse.

libSVM comes as command-line executables, a Python module, and a C-language library, so you can verify that something works in one layer and then use it from another.
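A minimal sketch of the Python layer, following the usage shown in the libSVM README (the toy data here is an assumption; sparse features are given as index-to-value dicts):

```python
from svmutil import svm_problem, svm_parameter, svm_train, svm_predict, svm_save_model

# Two toy samples with sparse features: only indices 1 and 3 are set.
y = [1, -1]
x = [{1: 1.0, 3: 1.0}, {1: -1.0, 3: -1.0}]

prob = svm_problem(y, x)
param = svm_parameter('-t 2 -c 4')   # RBF kernel, C=4
m = svm_train(prob, param)

p_label, p_acc, p_val = svm_predict(y, x, m)
svm_save_model('libsvm.model', m)
```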

LIBSVM Tools is still being updated, so it is worth revisiting even for those who used to consult it often.

LIBSVM FAQ

GPU-accelerated LIBSVM

Use LIBSVM with Python http://hy-adversaria.blogspot.jp/2011/04/pythonlibsvm.html

[Python] Notes on using SVM from python [SVM] http://gasser.blog114.fc2.com/blog-entry-498.html

I wrote FizzBuzz in python using a support vector machine (the LIBSVM library). http://qiita.com/cof/items/e02ada0adb1106635ac9

I tried to predict data using LIBSVM https://airtoxin.wordpress.com/2013/02/03/libsvm%E3%82%92%E4%BD%BF%E3%81%A3%E3%81%A6%E3%83%87%E3%83%BC%E3%82%BF%E4%BA%88%E6%B8%AC%E3%81%97%E3%81%A6%E3%81%BF%E3%81%9F/

Introduction to Python for toto prediction [Part 3] Using SVM with Python (Part 2) http://blogs.yahoo.co.jp/gdg00431/2211049.html

Install libsvm with python http://pumpkinkaneko.com/python%E3%81%A7libsvm%E3%82%92%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%BC%E3%83%AB%E3%81%99%E3%82%8B

How to use LIBSVM (python) http://liberte599.jimdo.com/2011/02/25/libsvm-python-%E3%81%AE%E4%BD%BF%E3%81%84%E6%96%B9/

The libSVM FAQ also states that multi-class classification is implemented one-against-one.

Saving and loading trained models: `svmutil.svm_save_model('libsvm.model', m)` and `m = svmutil.svm_load_model('libsvm.model')`

libSVM can return not only the predicted label but also the probability at prediction time. Blog article: Details of MNIST handwritten digit classification results by SVM
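A sketch of probability estimation with the Python binding (the file name `train.txt` is an assumption; `-b 1` must be passed at both training and prediction time):

```python
from svmutil import svm_read_problem, svm_train, svm_predict

# Data in libSVM's sparse text format (assumed to exist).
y, x = svm_read_problem('train.txt')

m = svm_train(y, x, '-c 4 -b 1')                      # -b 1 enables probability models
p_label, p_acc, p_val = svm_predict(y, x, m, '-b 1')  # p_val holds per-class probabilities
```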

** Articles using OpenCV SVM **

[OpenCV-Python Tutorials >> Machine Learning >> Understanding SVM](http://opencv-python-tutroals.readthedocs.org/en/latest/py_tutorials/py_ml/py_svm/py_svm_basics/py_svm_basics.html#svm-understanding)

OpenCV 2.4.11 Documentation [Support Vector Machines](http://docs.opencv.org/2.4/modules/ml/doc/support_vector_machines.html#cvsvm-cvsvm)

As far as I can tell from the docs, OpenCV's SVM also supports multi-class classification (the docs speak of "n-class classification"), and I confirmed it by running the example below. OpenCV's SVM is available both from Python and from C++. From Python you can easily graph the results with matplotlib, which makes them easy to check. Once a method is confirmed to work, the graphing can be dropped and the work focused solely on training and classification; at that point a C++ implementation will be lean and fast.

Take a look at the samples included in the standard OpenCV distribution. They define an SVM class that differs from cv2.SVM (a common interface is provided to make comparison across multiple methods easier): (opencv directory)\sources\samples\python2\digits.py and (opencv directory)\sources\samples\python2\letter_recog.py

OCR of Hand-written Digits http://docs.opencv.org/3.1.0/dd/d3b/tutorial_py_svm_opencv.html#gsc.tab=0

```python
svm = cv2.SVM()
svm.train(trainData, responses, params=svm_params)
result = svm.predict_all(testData)
```

Check the location of the data needed to make this script work.

[SVM in python](http://ffuyyo.blogspot.jp/2012/08/pythonsvm.html)

Usage example from C++: OpenCV diary (9) Non-linear SVM using the kernel method

If OpenCV 2 gives the error "train data must be floating-point matrix", convert the numpy arrays passed to train() to float32. The blog article about OpenCV's SVM, "svm with python", notes the same thing.
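The conversion itself is one line per array, as a sketch (the variable names are assumptions):

```python
import numpy as np

# OpenCV 2.x SVM.train() expects float32 matrices.
trainData = np.asarray(trainData, dtype=np.float32)
responses = np.asarray(responses, dtype=np.float32)
```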

If you try to use it and get an error, searching for the error message in the original C++ module may be the fastest way to resolve it. The return values of predict() for the SVM class in OpenCV-Python:

cv2.SVM.predict(sample[, returnDFVal]) → retval
cv2.SVM.predict_all(samples[, results]) → results

The shape of the numpy.array returned by predict_all() of the OpenCV-Python SVM class differs from that of predict() of scikit-learn's SVM. As for saving and loading training results: the model is saved as an XML file as shown below, so version dependencies on OpenCV appear to be few. As-is, however, the format is unsuited to concealing the learned model, so some ingenuity is required. The save() output file in digits.py above was YAML.

```python
model = cv2.SVM(points, labels)
# Store it by using OpenCV functions:
model.save("/path/to/model.xml")
# Now create a new SVM & load the model:
model2 = cv2.SVM()
model2.load("/path/to/model.xml")
```

[answers.opencv.org: Save and load SVM model of OpenCV](http://answers.opencv.org/question/5713/save-svm-in-python/) You can also pickle an instance of the OpenCV Python SVM binding.

```python
import pickle

pickle.dump(model, open("save.pkl", 'w'))
```

Not a use of SVM on its own, but a HOG + SVM detector is available in OpenCV, in both CPU and GPU versions. For documentation, refer to the GPU version's pages even when using the CPU version.

Object Detection gpu::HOGDescriptor
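A minimal sketch of the CPU-side HOG + SVM people detector (the input file name and the `winStride` value are assumptions):

```python
import cv2

hog = cv2.HOGDescriptor()
# Load the SVM coefficients of the bundled people detector.
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

img = cv2.imread("people.jpg")
rects, weights = hog.detectMultiScale(img, winStride=(8, 8))
for (x, y, w, h) in rects:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
```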

[Person detection with HOG + SVM [OpenCV & Python]](http://yusuke1581.hatenablog.com/entry/2014/11/26/153901)

OpenCV 3.1 has been released; is the transition from OpenCV 2.x progressing?

An OpenCV 3.1-series article: [Support Vector Machines (SVM)](http://wn55.web.fc2.com/cv2_SVM.html)

** Learning result evaluation tools (sklearn.metrics) **

・Evaluate the learning results using data that was not used for training. Machine learning without evaluation is meaningless: a model often gives good results only on the data it was trained on and brutal results on everything else. Read "Practical Machine Learning System" and you will see how important the data preparation and evaluation parts are. Whether you are using cv2's SVM, your own numpy-based Python code, or something else, sklearn.metrics is useful.

sklearn.metrics Classification metrics

It is worth knowing how sklearn.metrics.classification_report(expected, predicted) and sklearn.metrics.confusion_matrix(expected, predicted) are used in Recognizing hand-written digits.
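In outline, the usage looks like this (assuming `expected` holds the true labels and `predicted` the classifier output):

```python
from sklearn import metrics

# Per-class precision, recall, and F1 score.
print(metrics.classification_report(expected, predicted))
# Rows: true classes, columns: predicted classes.
print(metrics.confusion_matrix(expected, predicted))
```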

** Data is important **

In machine learning, data remains paramount. For multi-class classification, looking at both the recall and the precision values in sklearn.metrics.classification_report tells you which classes lack training data. Supplementing the training data for the under-represented classes should improve both recall and precision. Repeating this process brings the model to a level usable on real problems.

Reference information:

** Libraries equivalent to scikit-learn in C++ **

In response to the question of whether a C++ equivalent of scikit-learn exists, the following libraries were suggested. If you need such a library, investigate them.

The Shogun Machine Learning Toolbox (Japanese commentary articles: Introduction to Machine Learning Library SHOGUN, How to use shogun)
Shark – Machine Learning 3.0

** Machine learning software Weka written in Java **

Weka

** Note: Keep parameters uncorrelated **

Making parameters uncorrelated is also standard practice in multivariate analysis. The pair (height, BMI) has lower (even negligible) correlation than the pair (height, weight), so it is better suited as explanatory variables for multivariate analysis.

Do not use variables with high correlation between explanatory variables

If the correlation between explanatory variables is high, a meaningful result may not be obtained.

If the correlation between the explanatory variables is very high, the regression model becomes very unstable. It means that a linear relationship already holds among the explanatory variables themselves; this phenomenon is called "multicollinearity". Empirically, a correlation of 0.7 or more between explanatory variables is said to be dangerous. To guard against multicollinearity, look first at the correlation matrix of the explanatory variables when performing regression analysis; if a very strong correlation is present, one of the variables must be excluded from the explanatory variables.
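A sketch of that check with numpy, assuming `X` is an (n_samples, n_features) matrix of explanatory variables and using the empirical 0.7 threshold mentioned above:

```python
import numpy as np

# Feature-by-feature correlation matrix (rowvar=False: columns are variables).
corr = np.corrcoef(X, rowvar=False)

# List variable pairs whose absolute correlation exceeds 0.7.
n = corr.shape[0]
risky = [(i, j, corr[i, j])
         for i in range(n) for j in range(i + 1, n)
         if abs(corr[i, j]) > 0.7]
print(risky)
```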

** Note: Degrees of freedom when describing the model **

As a guideline for how much freedom to allow when describing a model, the quantity AIC (Akaike's Information Criterion) serves as an evaluation criterion. [Akaike Information Criterion](https://ja.wikipedia.org/wiki/%E8%B5%A4%E6%B1%A0%E6%83%85%E5%A0%B1%E9%87%8F%E8%A6%8F%E6%BA%96)
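For reference, AIC is defined as AIC = 2k - 2 ln(L̂), where k is the number of model parameters and L̂ is the maximized likelihood; among candidate models, the one with the smaller AIC is preferred.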

Increasing the number and order of the parameters improves the fit to the measurement data. On the other hand, the model is then forced to match accidental fluctuations such as noise (unrelated to the structure being measured), so it no longer fits other data of the same kind.

** For those who are interested in mathematics **

What is remarkable about SVM is that it projects the data into a higher-dimensional space to make it separable, and that the computation in that space is realized, without increasing the computational cost, by a device called the kernel trick.
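For example, the commonly used RBF kernel K(x, x') = exp(-γ‖x - x'‖²) corresponds to an inner product in an infinite-dimensional feature space, yet it is evaluated directly from the original vectors, so that space never has to be constructed explicitly.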

Cuda GPU-accelerated LIBSVM http://mklab.iti.gr/project/GPU-LIBSVM

Postscript (2019.06.03)

"Practical machine learning with scikit-learn and TensorFlow" In the table of contents Chapter 5 Support Vector Machine (SVM) Chapter 8 Dimensionality Reduction There is. Principal component analysis (PCA) explains the standard of dimensionality reduction and support vector machine (SVM) classification.

Postscript: Why SVM is useful even in the age of deep learning

When judging whether learning has succeeded, if all we require is that the training and evaluation data be separated, then any separation counts as success no matter how close the boundary is to the data; as long as no sample is misclassified, nothing pushes the boundary further away. That is the situation in most deep learning. This is why it is known that classification can be fooled by adding minute noise that humans do not notice. Attack method against Convolutional Neural Network: Induction of misclassification

SVM, in contrast, has a mechanism that keeps a large boundary region between classes (margin maximization). So if a problem can be formulated with SVM, I think it is better to use SVM.

Reference article: SVM implementation with python

"Then, how does it compare to other methods?"

SlideShare [Introduction to Machine Learning with Python: From Basics to Deep Learning](http://www.slideshare.net/yasutomo57jp/pythondeep-learning-60544586?next_slideshow=1)
