Run an Azure Machine Learning experiment from Jupyter Notebook
・ Creating the Azure workspace is omitted.
・ The operating environment is macOS.
1-1. Create a virtual environment
conda create -n <virtual environment name>
1-2. Activate the virtual environment
conda activate <virtual environment name>
1-3. Start Jupyter Notebook
jupyter notebook
2-1. Install the Azure ML packages
pip install azureml-sdk azureml-train-automl
pip install scikit-learn
2-2. Load the Azure workspace
Before running this step, download the config file (config.json) from the Azure workspace and store it in the same folder as the .ipynb file.
from azureml.core.workspace import Workspace
ws = Workspace.from_config()
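Workspace.from_config() cannot connect if the config file is missing or incomplete, so a quick check before calling it can save some time. The following is only a minimal sketch: the helper name check_workspace_config is made up for this article, and it assumes the downloaded file is named config.json and contains the keys subscription_id, resource_group, and workspace_name.

```python
import json
from pathlib import Path

# Keys assumed to be present in the config.json downloaded from the Azure portal.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def check_workspace_config(folder="."):
    """Return a list of problems found with config.json in `folder` (empty if none).

    Hypothetical helper for illustration; it is not part of the azureml SDK.
    """
    path = Path(folder) / "config.json"
    if not path.is_file():
        return [f"{path} not found"]
    try:
        config = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"{path} is not valid JSON: {exc}"]
    if not isinstance(config, dict):
        return [f"{path} does not contain a JSON object"]
    # Report every required key that is absent.
    return [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
```

Calling check_workspace_config() in the notebook folder returns an empty list when the file looks usable, and a list of messages otherwise.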
%matplotlib inline
import matplotlib.pyplot as plt
import sklearn
from sklearn import preprocessing, metrics, model_selection
from sklearn.preprocessing import MinMaxScaler, StandardScaler, LabelEncoder, OneHotEncoder, LabelBinarizer
from sklearn.model_selection import KFold, StratifiedKFold, GridSearchCV, train_test_split
from datetime import datetime, date, timezone, timedelta
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import os, gc
PATH = '/Users/〇〇/Desktop/jupyter/〇〇/'
## Read the file
data = pd.read_csv(PATH+'〇〇.csv')
## Split into training, validation, and test data
from sklearn.model_selection import train_test_split
train_data, test_data = train_test_split(data, test_size=0.4 ,shuffle=False)
train_data, validation_data = train_test_split(train_data, train_size=0.66 ,shuffle=False)
## Drop the unneeded column
no_label = "CO"
train_data = train_data.drop(no_label,axis=1)
test_data = test_data.drop(no_label,axis=1)
validation_data = validation_data.drop(no_label,axis=1)
##Select target column
label = "NOX"
test_labels = test_data.pop(label).values
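Because shuffle=False is used in both calls to train_test_split, the rows keep their original order: the first block becomes the training data, the next block the validation data, and the last block the test data. The sketch below works out the resulting sizes in plain Python. The function names are made up for illustration, and the rounding (test fraction rounded up, train fraction rounded down) is my assumption about how scikit-learn behaves, so treat the exact counts as approximate.

```python
import math


def chronological_split_sizes(n_rows, test_size=0.4, train_size=0.66):
    """Return (n_train, n_validation, n_test) for the two-step, unshuffled split.

    Rounding is an assumption meant to mirror scikit-learn: the test fraction
    is rounded up and the train fraction is rounded down.
    """
    n_test = math.ceil(test_size * n_rows)
    n_remaining = n_rows - n_test
    n_train = math.floor(train_size * n_remaining)
    n_validation = n_remaining - n_train
    return n_train, n_validation, n_test


def chronological_split(rows, test_size=0.4, train_size=0.66):
    """Slice an ordered sequence into train, validation, and test parts without shuffling."""
    n_train, n_validation, _ = chronological_split_sizes(len(rows), test_size, train_size)
    train = rows[:n_train]
    validation = rows[n_train:n_train + n_validation]
    test = rows[n_train + n_validation:]
    return train, validation, test


train, validation, test = chronological_split(list(range(1000)))
print(len(train), len(validation), len(test))  # 396 204 400
```

In other words, the settings in this article give roughly 40% training, 20% validation, and 40% test data.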
The code that creates the model is shown below.
This time I wanted a regression task, so I specify task='regression'
('classification' for classification tasks, 'forecasting' for time series forecasting).
Another attraction is that you can change settings that cannot be changed from the Azure GUI.
In fact, I am running this from Jupyter Notebook because I wanted to specify the data split myself.
(By default, cross-validation is used, with the number of folds chosen according to the number of rows.)
from azureml.train.automl import AutoMLConfig
automl_config = AutoMLConfig(task='regression',
                             primary_metric='r2_score',
                             experiment_timeout_minutes=60,
                             training_data=train_data,
                             label_column_name=label,
                             validation_data=validation_data,
                             debug_log='automated_ml_errors.log')
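Passing validation_data is only one way to control the data split. The Microsoft documentation on data splits listed in the references also describes n_cross_validations and validation_size. As a rough sketch, the hypothetical helper below (validation_settings is a name I made up, not part of the SDK) builds the validation-related keyword arguments and rejects obviously inconsistent input. Allowing only one option at a time is a simplification for this example, and the parameter names should be checked against the SDK version you have installed.

```python
def validation_settings(validation_data=None, n_cross_validations=None, validation_size=None):
    """Build the validation-related keyword arguments for AutoMLConfig.

    Hypothetical helper, not part of the azureml SDK. It only allows one
    strategy at a time, which is a simplification made for this sketch.
    """
    chosen = {
        name: value
        for name, value in (
            ("validation_data", validation_data),
            ("n_cross_validations", n_cross_validations),
            ("validation_size", validation_size),
        )
        if value is not None
    }
    if len(chosen) > 1:
        raise ValueError("choose only one of: " + ", ".join(sorted(chosen)))
    if n_cross_validations is not None and n_cross_validations < 2:
        raise ValueError("n_cross_validations must be at least 2")
    if validation_size is not None and not 0 < validation_size < 1:
        raise ValueError("validation_size must be a fraction between 0 and 1")
    # An empty dict means: leave the choice to AutoML's default behavior.
    return chosen


# Assumed usage (not executed here):
#   AutoMLConfig(task='regression', training_data=train_data,
#                label_column_name=label, **validation_settings(n_cross_validations=5))
print(validation_settings(n_cross_validations=5))  # {'n_cross_validations': 5}
```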
Start the experiment (model creation and validation)
from azureml.core.experiment import Experiment
experiment = Experiment(ws, "〇〇")
local_run = experiment.submit(automl_config, show_output=True)
Replace 〇〇 with any experiment name you like.
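The test_labels array separated earlier is not used by the experiment itself. One way to use it is to score the finished model on the test data with the same metric as primary_metric. The sketch below computes the coefficient of determination in plain Python. The function name r_squared is made up for this article, and the commented usage assumes that local_run.get_output() returns the best run together with a fitted model that has a predict method.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: how far the predictions are from the true values.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: how far the true values are from their own mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# Assumed usage after the run finishes (not executed here):
#   best_run, fitted_model = local_run.get_output()
#   predictions = fitted_model.predict(test_data)
#   print(r_squared(list(test_labels), list(predictions)))
```

A perfect prediction gives 1.0, and always predicting the mean of the true values gives 0.0.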
There are few sites that explain the settings used when creating a model, and it took me some time to work them out, so I wrote this article with example code so that anyone who wants to do the same thing can get started right away. I also had a little trouble with library versions and with setting up the Python import path. There is still a lot I don't understand as an engineer, but I will keep studying little by little. ^ ^
Microsoft documentation
・ https://docs.microsoft.com/ja-jp/azure/machine-learning/how-to-configure-cross-validation-data-splits
・ https://docs.microsoft.com/ja-jp/azure/machine-learning/how-to-auto-train-forecast
Blog article
・ https://www.simpletraveler.jp/2019/12/08/tried-azuremachinelearning-on-local-jupyter/