It's been about three months since I started working with machine learning, so I'm still a beginner.
I'd like to use TensorFlow, but the barrier to entry still feels high, so for now I'm using scikit-learn, which provides everything needed for basic machine learning. (I plan to move to TensorFlow eventually.)
This is a personal note on how to use a trained model from another language.
I would be grateful if you could point out any mistakes.
The following is used as the sample task.
Initially, I chose the following algorithm:
Looking at the prepared data, it didn't seem linearly separable, so I trained a non-linear SVM with a Gaussian (RBF) kernel.
The confusion matrix on the training data showed an accuracy of about 99%.
However, the confusion matrix on the test data showed that the model was predicting all 0s or all 1s: classic overfitting. (This makes sense given the ratio of samples to dimensions; tuning the hyperparameters might mitigate it to some extent.)
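For reference, the SVM setup described above looks roughly like this (the data here is a synthetic stand-in; the author's actual data and parameters are not shown):

```python
# Non-linear SVM with a Gaussian (RBF) kernel; data is a synthetic placeholder.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical data: few samples relative to the number of dimensions
X, y = make_classification(n_samples=100, n_features=50, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel='rbf')  # 'rbf' is the Gaussian kernel
clf.fit(X_train, y_train)

# Comparing these two scores is one quick way to spot overfitting
print(clf.score(X_train, y_train))
print(clf.score(X_test, y_test))
```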
A neural network class was recently added to scikit-learn, so I tried it instead.
Class used: MLPClassifier
Official page: http://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html
It trained properly, and its generalization performance was better than the non-linear SVM's.
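As a rough sketch of this step (the data, layer sizes, and parameters below are placeholders, not the author's actual setup):

```python
# Minimal MLPClassifier training sketch; data and layer sizes are illustrative.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Hypothetical binary-classification data: 100 samples, 50 dimensions
X, y = make_classification(n_samples=100, n_features=50, random_state=0)

# Two hidden layers of 200 and 300 units
clf = MLPClassifier(hidden_layer_sizes=(200, 300), max_iter=500, random_state=0)
clf.fit(X, y)

# One weight matrix and one bias vector per layer
print([w.shape for w in clf.coefs_])       # [(50, 200), (200, 300), (300, 1)]
print([b.shape for b in clf.intercepts_])  # [(200,), (300,), (1,)]
```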
To cut to the conclusion: once you have the weights and biases, you are almost done.
You can get them from the following MLPClassifier attributes.

weight: MLPClassifier#coefs_
bias: MLPClassifier#intercepts_
As the plural names suggest, these return the weights and biases for every layer of the network.
Suppose the network is: input layer (100 dimensions) → hidden layer 1 (200 dimensions) → hidden layer 2 (300 dimensions) → output layer.
Then the first matrix obtained from MLPClassifier#coefs_, covering input layer (100 dimensions) → hidden layer 1 (200 dimensions), should have size 100x200.
In other words, as long as you can read these matrices from another language, you're set.
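One possible way to write the matrices out for another language to read (a sketch only: the author's exact CSV layout, which includes a header row and column, is not shown, so plain header-less CSVs and the file names below are assumptions, and the tiny model exists just to keep the example self-contained):

```python
# Export each layer's weights and biases to CSV; file names are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Small stand-in model so the sketch is runnable on its own
X, y = make_classification(n_samples=100, n_features=10, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(5,), max_iter=300, random_state=0).fit(X, y)

for i, (w, b) in enumerate(zip(clf.coefs_, clf.intercepts_)):
    np.savetxt('weights_%d.csv' % i, w, delimiter=',')  # shape (n_in, n_out)
    np.savetxt('biases_%d.csv' % i, b, delimiter=',')   # length n_out
```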
In my case, I want to use the model from Rails, so I do the matrix calculations with Ruby's Matrix and Vector classes.
require 'csv'
require 'matrix'

# Repeat a read like this once per layer
file_path = File.expand_path('app/models/hogehoge.csv', ENV['RAILS_ROOT'])
csv_data  = CSV.read(file_path, headers: false)
matrix    = Matrix.rows(csv_data, true)

# The first row and first column are headers, so drop them
# and convert the remaining entries to floats
matrix = matrix.first_minor(0, 0).map(&:to_f)
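Once the matrices are loaded, the prediction itself is just the forward pass. Below is a minimal sketch, assuming MLPClassifier's defaults (ReLU in the hidden layers, the logistic function on a single binary output) and that each coefs_ matrix has shape inputs x outputs, which is why the transpose appears; all names here are illustrative, not part of any library:

```ruby
require 'matrix'

# ReLU applied element-wise to a Vector
def relu(v)
  v.map { |x| x.negative? ? 0.0 : x }
end

# Logistic (sigmoid) function for the binary output unit
def logistic(x)
  1.0 / (1.0 + Math.exp(-x))
end

# weights: array of Matrix (one per layer, shape n_in x n_out)
# biases:  array of Vector (one per layer, length n_out)
# input:   Vector of input features
def predict_proba(weights, biases, input)
  activation = input
  weights.each_with_index do |w, i|
    z = w.transpose * activation + biases[i]
    # ReLU on hidden layers; the last layer's output goes through logistic below
    activation = i < weights.size - 1 ? relu(z) : z
  end
  logistic(activation[0]) # probability of class 1
end
```

With all-zero inputs and biases this returns 0.5, since logistic(0) = 0.5; a result above 0.5 would be read as class 1.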
The classifier I currently need gets by with a modest number of dimensions and shallow layers, so this approach works. Whether it would still hold up if the dimensionality grew past 10,000 and the network became very deep remains an open question.