This is a work memo from building a mechanism that retrains a scikit-learn model, evaluates its accuracy, and updates the model in operation, using the Python library mlflow.
Once a model is in production, the mechanism periodically retrains it on new data and replaces the operational model. In short, the image below shows the overall idea.
Assumed use case example:
This time, Anaconda is installed so that models can be developed easily in JupyterLab. If you prefer another way of setting up a Python environment, feel free to use that instead.
#update yum
sudo yum update
#Introducing git(For the introduction of pyenv)
sudo yum install -y git
#Introduction of pyenv
git clone https://github.com/yyuu/pyenv.git ~/.pyenv
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bash_profile
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bash_profile
echo 'eval "$(pyenv init -)"' >> ~/.bash_profile
source ~/.bash_profile
#Check the version of Anaconda you want to install
pyenv install -l | grep anaconda
#Introduction of Anaconda
pyenv install anaconda3-2020.07
#Confirmation of installed Anaconda version
pyenv versions
#Switch to Anaconda environment with python environment
pyenv global anaconda3-2020.07
mlflow is installed with pip. Its prerequisites, such as gunicorn and flask, are installed at the same time.
pip install mlflow
This time sqlite3, which is the easiest backend store to set up for the mlflow tracking server, is used. It is included in the anaconda3-2020.07 environment installed above, so no additional steps are required.
If you set up your Python environment another way, note that sqlite3 is part of the Python standard library, so it is normally available without any extra installation.
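To confirm it is available, a quick check from Python is enough (a minimal sketch):
# Confirm that the sqlite3 module and the bundled SQLite engine can be loaded
import sqlite3
print(sqlite3.sqlite_version)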
mlflow uses two storage areas: the Backend Store and the Artifacts Store.
See the mlflow documentation for details on both: https://mlflow.org/docs/latest/tracking.html#backend-stores and https://mlflow.org/docs/latest/tracking.html#artifact-stores
This time, sqlite3 is used as the Backend Store because it is easy to prepare, and a local directory is used as the Artifacts Store.
#Create a directory for the backend store and artifacts (any location is fine)
sudo mkdir /mnt/share
sudo chmod 777 /mnt/share
#Start mlflow tracking server
mlflow server --backend-store-uri sqlite:////mnt/share/mlflow.db --default-artifact-root /mnt/share/mlflow_artifacts --host 0.0.0.0 --port 5000
If you want to access the mlflow tracking server URL (http://<IP address>:5000 in the case above) from outside the host, the firewall must allow it. The following simply stops the firewall altogether, which is often done for testing purposes only.
#Stop firewalld
systemctl stop firewalld
#Stop firewalld autostart
systemctl disable firewalld
Access the started tracking server at http://<IP address>:5000 in a web browser and confirm that the mlflow screen is displayed. Both Experiments and Models are still empty.
From here, mlflow is operated from Python. JupyterLab, which comes with Anaconda, is used as the Python development environment; JupyterLab itself is not explained here.
#Library import
import pandas as pd
import numpy as np
import copy
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report
import mlflow
train_result_file_name is used to output the model training result as a temporary text file and register it as Artifacts.
#Temporary output destination file name of detailed information of each model training
train_result_file_name = '/tmp/train_result.txt'
# mlflow tracking server URL
mlflow_tracking_server_url = 'http://localhost:5000'
mlflow.set_tracking_uri(mlflow_tracking_server_url)
The following is required for automatic model updates. Note that it must either be run after set_tracking_uri above, or be given tracking_uri='http://localhost:5000' as an argument. Otherwise you will hit a cryptic KeyError and lose time on it.
#Client instance of mlflow tracking server (for Production version operation)
client = mlflow.tracking.MlflowClient()
#client = mlflow.tracking.MlflowClient(tracking_uri=mlflow_tracking_server_url)
#Model type column name in the training data table ★★★ Case sensitivity needs to be considered ★★★
model_types_column_name = 'mtype'
#Model type
model_types = [
{'name': 'type1', 'detail': 'Detailed explanation of model pattern 1'},
{'name': 'type2', 'detail': 'Detailed explanation of model pattern 2'},
{'name': 'type3', 'detail': 'Detailed explanation of model pattern 3'}
]
#Objective variable column name
target_val_colmun_name = 'target'
#Explanatory variable column name
feature_val_column_names = [
'sepal length (cm)',
'sepal width (cm)',
'petal length (cm)',
'petal width (cm)'
]
#Use iris as sample data. 150 cases in total
from sklearn.datasets import load_iris
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target_names[iris.target]
#Model type column is generated by random numbers
import random
random.seed(1)
mtype_list = []
for i in range(len(df)):
    mtype_list.append(model_types[random.randint(0, len(model_types)-1)]['name'])
df[model_types_column_name] = mtype_list
Checking with df.head() in JupyterLab shows data like the following.
Three models are created by looping over the model types. Since regular retraining is assumed, all data is used for training without a train/test split. After training, the result is recorded as an mlflow Experiment and the model is registered.
#Get current date(Used for the name of run in Experiments)
from pytz import timezone
import datetime
now_datetime = datetime.datetime.now(timezone('Asia/Tokyo')).strftime('%Y%m%d_%H%M%S')
now_date = now_datetime[0:8]
for model_type in model_types:
    #Cut out the data for each model type
    df_mtype = df[df[model_types_column_name] == model_type['name']]
    #Split explanatory variables and the objective variable
    X = df_mtype[feature_val_column_names]
    y = df_mtype[target_val_colmun_name]
    #Model training
    model = RandomForestClassifier(max_depth=2, random_state=5, n_estimators=10)
    model.fit(X, y)
    #Accuracy against the training data (reference value)
    predicted = model.predict(X)
    ac_score = accuracy_score(y, predicted)
    con_matrix = confusion_matrix(y, predicted)
    clf_report = classification_report(y, predicted)
    #Record the training results in mlflow
    mlflow.set_experiment('Model learning history')
    mlflow_run_name = now_date + 'Learning results_' + model_type['name']
    with mlflow.start_run(run_name=mlflow_run_name) as run:
        #Record to the run as parameters
        mlflow.log_param('Model type name', model_type['name'])
        mlflow.log_param('Number of training data', len(X))
        mlflow.log_param('Correct answer rate_Training data', ac_score)  #recorded with log_param instead of log_metric because Japanese cannot be used in metric names
        #Write the details of the training result to a file and register it as a run artifact
        with open(train_result_file_name, 'w') as f:
            print('Model type name:', model_type['name'], file=f)
            print('Objective variable:', target_val_colmun_name, file=f)
            print('Explanatory variables:', feature_val_column_names, file=f)
            print('Number of training data:', len(X), '\n', file=f)
            print('Model accuracy (training data)', '\n', '------------------------------------------------', file=f)
            print('Confusion matrix (confusion_matrix):', '\n', con_matrix, file=f)
            print('Objective variable labels:', np.sort(y.unique()), '\n', file=f)
            print('Correct answer rate (accuracy):', '\n', ac_score, '\n', file=f)
            print('Accuracy report (classification_report):', '\n', clf_report, '\n', file=f)
        mlflow.log_artifact(train_result_file_name)
        #Register the model to the run
        mlflow.sklearn.log_model(sk_model=model, artifact_path='model')
        #Register the model in Models (model registry)
        model_uri = 'runs:/{}/model'.format(run.info.run_id)
        reg_model = mlflow.register_model(model_uri, model_type['name'])
    #Production model update mode (auto: automatic update, the retrained model is reflected immediately / manual: manual update, the person in charge operates on the screen)
    #---------------------------------------------------------------#
    #In production this value would be read from a DB etc. It is hard-coded here for now.
    #---------------------------------------------------------------#
    model_update_mode = 'manual'
    #If the model version is 1, always register it as Production
    if reg_model.version == '1':
        client.transition_model_version_stage(
            name=model_type['name'],
            version=reg_model.version,
            stage="Production",
            archive_existing_versions=True
        )
    else:
        #Model update mode: automatic
        if model_update_mode == 'auto':
            client.transition_model_version_stage(
                name=model_type['name'],
                version=reg_model.version,
                stage="Production",
                archive_existing_versions=True
            )
This is the state after executing the learning code once. The learning results are recorded in Experiments on the mlflow screen.
(Commentary) The experiment name specified with mlflow.set_experiment and the run_name option of mlflow.start_run() determine what is shown on this screen, and the values recorded with mlflow.log_param and mlflow.log_metric are displayed for each run. I couldn't specify Japanese in mlflow.log_metric, so those values are deliberately recorded with log_param.
This is the screen shown after clicking the Start Time link in the model learning history.
The trained model is displayed in the Artifacts section. You can see that this tracking server's Artifacts Store is a local directory and that the storage location is /mnt/share/mlflow_artifacts/1/950d96375b044a2383cde334ff86534b/artifacts/model. The Make Predictions section also shows sample code for loading the model and running predictions.
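For reference, loading the model from this run and predicting looks roughly like the following sketch (the run ID is the one visible in the artifact path above; mlflow.pyfunc.load_model is the generic loader, not the exact code shown in the UI):
# A minimal sketch, assuming X is the feature DataFrame defined earlier
import mlflow.pyfunc
loaded_model = mlflow.pyfunc.load_model('runs:/950d96375b044a2383cde334ff86534b/model')
print(loaded_model.predict(X.head(1)))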
Arbitrary files can be registered in Artifacts with mlflow.log_artifact. This time, the evaluation result of the trained model is registered as a text file.
This is the Models screen. You can see that the trained models for model types type1-3 were registered as Version 1 by mlflow.register_model. Furthermore, client.transition_model_version_stage registers a model as Production whenever it is Version 1, so each new model type gets an initial Production model. This covers the case where the number of model types increases during operation.
By the way, mlflow also has a Staging label in addition to Production, but it is not used this time. It would be nice if such labels could be defined freely, but mlflow does not seem to have that capability at the moment (January 2021).
This is the state of Experiments after executing the learning code again. Three new runs are registered in the model learning history.
On the Models screen, Version 2 is registered as the Latest Version. Production remains Version 1 as it is intended for manual updates without automatic model updates.
To compare the prediction accuracy of each model against the latest evaluation data, the models are executed for evaluation and the results are recorded in Experiments. This is assumed to run as a regular or manually triggered batch, on a shorter cycle than model retraining.
The following part is the same as in the training code. The evaluation data should really be the latest data, but this time the same data as for training is used.
#Library import
import pandas as pd
import numpy as np
import copy
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report
import mlflow
#Temporary output destination file name of detailed information of each model evaluation
eval_result_file_name = '/tmp/eval_result.txt'
# mlflow tracking server URL
mlflow_tracking_server_url = 'http://localhost:5000'
mlflow.set_tracking_uri(mlflow_tracking_server_url)
#Client instance of mlflow tracking server (for Production version operation)
#The following must be run after set_tracking_uri above, or tracking_uri='http://localhost:5000' must be passed as an argument. Otherwise an obscure KeyError will occur.
client = mlflow.tracking.MlflowClient()
#client = mlflow.tracking.MlflowClient(tracking_uri=mlflow_tracking_server_url)
#Model type column name in the training data table ★★★ Case sensitivity needs to be considered ★★★
model_types_column_name = 'mtype'
#Model type
model_types = [
{'name': 'type1', 'detail': 'Detailed explanation of model pattern 1'},
{'name': 'type2', 'detail': 'Detailed explanation of model pattern 2'},
{'name': 'type3', 'detail': 'Detailed explanation of model pattern 3'}
]
#Objective variable column name
target_val_colmun_name = 'target'
#Explanatory variable column name
feature_val_column_names = [
'sepal length (cm)',
'sepal width (cm)',
'petal length (cm)',
'petal width (cm)'
]
#Use iris as sample data. 150 cases in total
from sklearn.datasets import load_iris
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target_names[iris.target]
#Model type column is generated by random numbers
import random
random.seed(1)
mtype_list = []
for i in range(len(df)):
    mtype_list.append(model_types[random.randint(0, len(model_types)-1)]['name'])
df[model_types_column_name] = mtype_list
#Get current date(Used for the name of run in Experiments)
from pytz import timezone
import datetime
now_datetime = datetime.datetime.now(timezone('Asia/Tokyo')).strftime('%Y%m%d_%H%M%S')
now_date = now_datetime[0:8]
From here, the models are executed against the evaluation data and their accuracy is measured. First, the models to be evaluated are loaded and stored in a dictionary variable. For simplicity, all versions of all model types are evaluated this time.
#Get all versions of all model types
loaded_models = {}
for model_type in model_types:
    loaded_models_ver = {}
    for i in range(1, 100):
        try:
            loaded_models_ver['Version' + str(i)] = mlflow.sklearn.load_model('models:/' + model_type['name'] + '/' + str(i))
        except Exception as e:
            loaded_models_ver['Version' + str(i)] = 'NONE'
            loaded_models_ver['latest_version'] = i - 1
            break
    loaded_models[model_type['name']] = loaded_models_ver
~~In addition, one part of the above code is not great: the inner for loop hard-codes the version numbers as 1 to 100. I searched various APIs but could not find a way to get the latest version number of a model, so instead the versions are counted up one by one and the version just before the first failed model load is treated as the latest. **If anyone knows a smart way to get the latest version number of a model registered in Models, please let me know.**~~
(Added on 2020/1/5) I was told that you can search with search_model_versions("name='<model name>'") on MlflowClient(), which gives a clean list of version numbers. I will update the article once I have verified it.
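For reference, a minimal sketch of that suggestion (untested here, as noted above):
# List the registered versions of one model type via the model registry
versions = client.search_model_versions("name='{}'".format(model_types[0]['name']))
print([v.version for v in versions])
latest_version = max(int(v.version) for v in versions)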
If you display loaded_models at this point, it will be in the following state.
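Roughly, after the two training runs above, the dictionary has the following shape (a sketch, not actual output):
print(loaded_models)
# {'type1': {'Version1': RandomForestClassifier(...), 'Version2': RandomForestClassifier(...),
#            'Version3': 'NONE', 'latest_version': 2},
#  'type2': {...},
#  'type3': {...}}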
From here, each model is called, predictions are executed, and the results are recorded with mlflow. This is very similar to the training step, except that the model used is swapped in turn with the versions loaded above.
# evaluation
for model_type in model_types:
    #Cut out the data for each model type
    df_mtype = df[df[model_types_column_name] == model_type['name']]
    #Split explanatory variables and the objective variable
    X = df_mtype[feature_val_column_names]
    y = df_mtype[target_val_colmun_name]
    #Record the evaluation results in mlflow
    mlflow_experiment_name = now_date + 'Evaluation_' + model_type['name']
    mlflow.set_experiment(mlflow_experiment_name)
    for i in range(1, 100):
        model = loaded_models[model_type['name']]['Version' + str(i)]
        if model == 'NONE':
            break
        #Accuracy against the evaluation data
        predicted = model.predict(X)
        ac_score = accuracy_score(y, predicted)
        con_matrix = confusion_matrix(y, predicted)
        clf_report = classification_report(y, predicted)
        mlflow_run_name = 'Version' + str(i)
        with mlflow.start_run(run_name=mlflow_run_name) as run:
            #Record to the run as parameters
            mlflow.log_param('Version', i)
            mlflow.log_param('Model type name', model_type['name'])
            mlflow.log_param('Number of evaluation data', len(X))
            mlflow.log_metric('accuracy', ac_score)  #Japanese cannot be used in log_metric, but a metric is used here so the values can be compared
            #Write the details of the evaluation result to a file and register it as a run artifact
            with open(eval_result_file_name, 'w') as f:
                print('Model type name:', model_type['name'], file=f)
                print('Model version:', 'Version' + str(i), file=f)
                print('Objective variable:', target_val_colmun_name, file=f)
                print('Explanatory variables:', feature_val_column_names, file=f)
                print('Number of evaluation data:', len(X), '\n', file=f)
                print('Model accuracy (evaluation data)', '\n', '------------------------------------------------', file=f)
                print('Confusion matrix (confusion_matrix):', '\n', con_matrix, file=f)
                print('Objective variable labels:', np.sort(y.unique()), '\n', file=f)
                print('Correct answer rate (accuracy):', '\n', ac_score, '\n', file=f)
                print('Accuracy report (classification_report):', '\n', clf_report, '\n', file=f)
            mlflow.log_artifact(eval_result_file_name)
This is the Experiments screen of mlflow after executing the evaluation code. The Experiments name is the evaluation date + model type. Run is recorded for each model version.
Here, if you check the Run you want to compare and click the "Compare" button,
the model version comparison screen is displayed. The idea is that the person in charge of model operation looks at this screen to confirm the accuracy of multiple versions. Clicking the accuracy link displays the comparison as a graph, as shown below. Since the models created this time use the same training data for Version 1 and Version 2, the accuracy is exactly the same, but in a production setting you can visually compare the accuracy of each version here.
If you want the automatic update method, you can execute client.transition_model_version_stage at the end of the training code to promote the newly generated model to Production.
The following is for the manual update method. After the model operation staff confirms the model evaluation result on the mlflow screen, the model (Production) in operation is updated by manual operation on the mlflow screen.
On the Models screen, open the version of the model you want to produce, and select "Transition to Production" from the "Stage" pull-down menu on the upper right.
A pop-up will appear, so just click OK.
By the way, Archived is just a label meaning the model is no longer needed; the model itself is not deleted immediately. To actually delete a model, use client.delete_model_version.
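For example, a minimal sketch with the client created earlier (this permanently removes the version, so use it with care):
# Permanently delete Version 1 of the registered model 'type1' (irreversible)
client.delete_model_version(name='type1', version='1')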
If you look at the Models list screen, you can see that Version 2 has changed to Production.
For a Production model registered in Models, a so-called online model execution environment can be started as a service with the mlflow models serve command.
As a caveat, you must set the tracking server in the MLFLOW_TRACKING_URI environment variable before running it.
https://mlflow.org/docs/latest/cli.html#mlflow-models
#Specify tracking server with environment variable
export MLFLOW_TRACKING_URI=http://localhost:5000
#Run the Production version of model name type1
mlflow models serve -m models:/type1/Production -h 0.0.0.0 -p 7001
This exposes the type1 Production model at http://<IP address>:<port>/invocations. If you POST prediction input data as JSON to this endpoint from a REST client, the prediction result is returned as the response.
Model prediction execution example from REST Client (Insomnia):
The input JSON data can be created in the orient='split' format of the pandas DataFrame's to_json.
Example: JSONize one record of training data
X.head(1).to_json(orient='split',index=False)
# {"columns":["sepal length (cm)","sepal width (cm)","petal length (cm)","petal width (cm)"],"data":[[4.9,3.0,1.4,0.2]]}
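For example, a minimal sketch of the same call from Python (assuming the server started above is reachable at localhost:7001; newer mlflow versions expect a different request layout, so check the docs for your version):
import requests
# POST one record as pandas split-orient JSON to the served model (mlflow 1.x style request)
payload = X.head(1).to_json(orient='split', index=False)
resp = requests.post('http://localhost:7001/invocations',
                     data=payload,
                     headers={'Content-Type': 'application/json'})
print(resp.json())  # list of predicted labels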
That's all.