Started studying: Saturday, December 7th. Book used: Miyuki Oshige, "Details! Python3 Introductory Note" (Sotec, 2017)
Resumed from [NumPy arrays (Ch.15 / p.380)] (day 9); finished through [Classification of handwritten characters (Ch.16 / p.396)] (day 10)
Starting today, I am moving on to machine learning.
(1) Divide the data into training data and test data.
(2) Feed the training data and its teacher data (the correct labels) into the **learner**. → Trained model (classifier)
(3) Feed the test data into the classifier and compare its predictions with the teacher data to evaluate the performance.
・scikit-learn is used as the learner.
・Classification of handwritten characters (using the datasets module of the sklearn package). A package is a collection of **multiple modules**.
・Practice with scikit-learn's digit image data. This time, the image data (digits.data) and the teacher data (digits.target) are used separately. 2/3 of the image data is training data and 1/3 is test data; the teacher data is split the same way so that it corresponds. The algorithm is SVC, the SVM (Support Vector Machine) classifier.
・When I put the test data into the classifier, the following error occurred:
Classification metrics can't handle a mix of multiclass-multioutput and multiclass targets
→ It was resolved when I redid everything from the start. Cause unknown.... While reproducing it, I hit errors several times because of full-width (double-byte) spaces, so that may have been the cause.
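As a rough sketch of the practice described above (not the book's exact code), the split, training, and evaluation might look like the following. The variable names and the slicing-based 2/3 split are my assumptions. The last part shows one way the same kind of error message can be produced: passing the image data where the teacher data should go in `accuracy_score`. Whether that is what actually happened here is only a guess.

```python
from sklearn import datasets, metrics, svm

# Load the handwritten digit data bundled with scikit-learn.
digits = datasets.load_digits()

# Assumption: a simple slice, first 2/3 for training and the rest for testing.
n_train = len(digits.data) * 2 // 3
train_data, test_data = digits.data[:n_train], digits.data[n_train:]
train_label, test_label = digits.target[:n_train], digits.target[n_train:]

# (2) Feed the training data and teacher data into the learner.
clf = svm.SVC(gamma=0.001)
clf.fit(train_data, train_label)

# (3) Predict on the test data and compare with the teacher data.
predicted = clf.predict(test_data)
accuracy = metrics.accuracy_score(test_label, predicted)
print(f"accuracy: {accuracy:.3f}")

# Possible cause of the error (a guess): the 2-D image data is passed
# as the first argument instead of the 1-D teacher data.
mix_error = None
try:
    metrics.accuracy_score(test_data, predicted)
except ValueError as err:
    mix_error = str(err)
    print(mix_error)
```

If the first argument is the 1-D `test_label`, both arguments have the same kind of target and the score is computed normally.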
・Not only gamma: the learner's parameters can be adjusted through the arguments of svm.SVC(). In the book, gamma = 0.001 gave an accuracy of 96.3%, but with gamma = 1 the accuracy dropped to 9.8%. Conversely, with gamma = 0.001 the accuracy was 93.2%. So it does not seem to be a matter of simply making the value smaller. Is this kind of adjustment the "tuning" that one often hears about?
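To see the effect of gamma directly, a small loop like the one below could be used. This is only a sketch: the list of gamma values (other than 0.001 and 1) and the variable names are my own choices, and the split is the same assumed 2/3 slice. The exact numbers will depend on how the data is split.

```python
from sklearn import datasets, svm

digits = datasets.load_digits()

# Assumption: first 2/3 for training, remaining 1/3 for testing.
n_train = len(digits.data) * 2 // 3
train_data, test_data = digits.data[:n_train], digits.data[n_train:]
train_label, test_label = digits.target[:n_train], digits.target[n_train:]

# Train one classifier per gamma value and record the test accuracy.
scores = {}
for gamma in (0.0001, 0.001, 0.01, 1):
    clf = svm.SVC(gamma=gamma)
    clf.fit(train_data, train_label)
    scores[gamma] = clf.score(test_data, test_label)
    print(f"gamma={gamma}: accuracy={scores[gamma]:.3f}")
```

Trying several values and keeping the one with the best score is, as far as I understand, what is usually called hyperparameter tuning.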