Supervised Learning 3 Hyperparameters and Tuning (2)

Aidemy 2020/9/27

Introduction

Hello, I'm Yope! I come from a liberal arts background, but I was interested in the possibilities of AI, so I enrolled at the AI-specialized school "Aidemy" to study. I'd like to share the knowledge I gained there, so I'm summarizing it on Qiita. I'm very happy that so many people have read my previous summary articles. Thank you! This is the third post on supervised learning. Nice to meet you.

What to learn this time
・Hyperparameters of decision trees, random forests, and k-NN
・Automation of tuning (parameter adjustment)

Decision tree hyperparameters

Parameter max_depth

・max_depth specifies the depth of the tree that the model learns. If this value is not set, or if it is too large, the model fits the training data too closely and does not generalize, so limit max_depth to improve generalization. This is called "pruning" the decision tree.

Parameter random_state

・Since a decision tree has a hierarchical structure, the data split on first accounts for a larger proportion of the result. Therefore random_state, which determines how the data is drawn, has a larger effect on the result than it does in other models.
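As a quick sketch of these two parameters (assuming scikit-learn and an already prepared train/test split; the values shown are just examples):

from sklearn.tree import DecisionTreeClassifier

# Limit the depth to 3 ("pruning") and fix the random seed for reproducibility
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(train_X, train_y)
print(model.score(test_X, test_y))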

Random forest hyperparameters

Parameter n_estimators

・A random forest is a model that "creates multiple decision trees on random subsets of the data and outputs the class chosen by the majority of the trees as the result". n_estimators specifies "the number of decision trees" created at this time.

Parameter max_depth

・The depth of each decision tree can also be set in a random forest, but since there are multiple trees, it is better not to make each one deep. Therefore, the value should be smaller than for a single decision tree.

Parameter random_state

・A random forest also uses random numbers to extract the data for each tree ("create multiple decision trees on random data"), so if the value of random_state changes, the analysis results can differ greatly.
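A minimal sketch combining the three parameters above (the values are illustrative, and train_X, train_y, test_X, test_y are assumed to be prepared):

from sklearn.ensemble import RandomForestClassifier

# Many shallow trees: n_estimators controls the number of trees,
# and max_depth is kept smaller than for a single decision tree
model = RandomForestClassifier(n_estimators=100, max_depth=3, random_state=0)
model.fit(train_X, train_y)
print(model.score(test_X, test_y))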

k-NN hyperparameters

Parameter n_neighbors

・k-NN is a model that "extracts the k training samples most similar to the data to be predicted and outputs the most common class among them as the prediction". n_neighbors specifies this k, that is, "the number of training samples compared against each data point during prediction".
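For example (a sketch assuming the same prepared data; k=5 is just an illustrative value):

from sklearn.neighbors import KNeighborsClassifier

# Each prediction is the majority class among the 5 nearest training samples
model = KNeighborsClassifier(n_neighbors=5)
model.fit(train_X, train_y)
print(model.score(test_X, test_y))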

Automation of tuning (parameter adjustment)

・It is very tedious to tune every parameter by hand, changing the values one at a time. Instead, you can save time by specifying a range for each parameter and having the computer find the parameter set that gives the best results.
・There are two methods for this automation: "grid search" and "random search".

Grid search

・Grid search is a method that specifies multiple candidate values for each parameter in advance, evaluates every combination of them, and adopts the set with the best results.
・It is often used when the values are strings, integers, True/False, and so on, because such candidates are easy to enumerate explicitly.
・Below, a grid search is actually performed with a nonlinear SVM. The method is to pass the parameters, when creating the search model, as a dictionary whose keys are the parameter names and whose values are lists of candidates.

from sklearn.svm import SVC
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV

# train_X, train_y, test_X, test_y are assumed to be prepared in advance

# Parameter candidates (kernel is everything except "precomputed",
# C is 10 to the power of i for i from -5 to 4)
set_grid = {SVC(): {"kernel": ["linear", "poly", "rbf", "sigmoid"],
                    "C": [10**i for i in range(-5, 5)],
                    "decision_function_shape": ["ovr", "ovo"],
                    "random_state": [0]}}

# Variables that will hold the best score and the parameters that produced it
max_score = 0
best_param = None

# Perform a grid search with the key of set_grid ("SVC()") as the model
# and its value (the parameter candidates) as param
for model, param in set_grid.items():
    search_model = GridSearchCV(model, param)
    search_model.fit(train_X, train_y)
    # Compute the score and keep the best parameter set and its score
    pred_y = search_model.predict(test_X)
    score = f1_score(test_y, pred_y, average="micro")
    if max_score < score:
        max_score = score
        best_model = model.__class__.__name__
        best_param = search_model.best_params_
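As a quick check of the result, you can print the stored values (a minimal sketch using the variables defined above):

print("model: {}".format(best_model))
print("score: {:.3f}, params: {}".format(max_score, best_param))

Note that GridSearchCV also exposes the best combination it found directly, via the best_params_ and best_score_ attributes.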

Random search

・Random search is a method that specifies a range for each parameter value, repeatedly samples values at random from within that range, evaluates the model, and adopts the parameter set with the best results.
・Probability distributions are used to specify the ranges. These are imported from the scipy.stats module.
・Below, a random search is actually performed with a nonlinear SVM. The flow is the same as for grid search.

import scipy.stats as sc
from sklearn.svm import SVC
from sklearn.metrics import f1_score
from sklearn.model_selection import RandomizedSearchCV

# train_X, train_y, test_X, test_y are assumed to be prepared in advance

# Parameter candidates (C is drawn from a uniform distribution over
# [0.00001, 1000.00001], random_state is a random integer from 0 to 99)
set_random = {SVC(): {"kernel": ["linear", "poly", "rbf", "sigmoid"],
                      "C": sc.uniform(0.00001, 1000),
                      "decision_function_shape": ["ovr", "ovo"],
                      "random_state": sc.randint(0, 100)}}

# Variables that will hold the best score and the parameters that produced it
# (the rest is almost the same as the grid search)
max_score = 0
best_param = None

# Perform a random search with the key of set_random ("SVC()") as the model
# and its value (the parameter candidates) as param
for model, param in set_random.items():
    search_model = RandomizedSearchCV(model, param)
    search_model.fit(train_X, train_y)
    # Compute the score and keep the best parameter set and its score
    pred_y = search_model.predict(test_X)
    score = f1_score(test_y, pred_y, average="micro")
    if max_score < score:
        max_score = score
        best_model = model.__class__.__name__
        best_param = search_model.best_params_
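One point specific to RandomizedSearchCV: by default it samples only 10 parameter sets (n_iter=10). Passing a larger value, for example RandomizedSearchCV(model, param, n_iter=100), explores more candidates at the cost of longer training time.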

Difficulty of hyperparameter adjustment

・Automatic parameter adjustment is basically based on the gradient method, which moves the values in the direction that reduces a loss function. The gradient method has pseudo-solutions called saddle points, and if the search gets stuck at one, it cannot reach the true solution.
・However, since the loss function differs from case to case, the adjustment has to be worked out by trial and error.
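As a small illustration of a saddle point (a toy function chosen for this note, not from the original article): for f(x, y) = x**2 - y**2, the gradient is zero at the origin even though the origin is not a minimum, so plain gradient descent started on the x-axis stalls there.

import numpy as np

# Toy loss f(x, y) = x**2 - y**2; the origin is a saddle point:
# the gradient vanishes there, but it is not a minimum.
def grad(p):
    x, y = p
    return np.array([2 * x, -2 * y])

p = np.array([1.0, 0.0])  # start exactly on the x-axis
for _ in range(100):
    p -= 0.1 * grad(p)    # plain gradient descent

print(p)  # approximately [0, 0]: the descent stops at the saddle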

Summary

・Hyperparameters of decision trees include "random_state" in addition to "max_depth", which limits the depth of the tree.
・Hyperparameters of random forests include "max_depth" and "random_state" in addition to "n_estimators", which specifies the number of decision trees.
・Hyperparameters of k-NN include "n_neighbors", which specifies the number of training samples compared during prediction.
・Tuning can be automated with "grid search" and "random search".

That's all for this time. Thank you for reading this far.
