Deploy and use the prediction model created in Python on SQL Server

content

[About this sample](#About this sample)

[Step 1. Environment construction](# step-1-Environment construction)

[Step 2. Create a predictive model using Python](# step-2-Create a predictive model using Python)

[Step 3. Deploying and using the predictive model on SQL Server](# step-3-Deploying and using the predictive model on sql-server)

About this sample

This sample runs machine learning on SQL Server 2017. The scenario is a forecast of future rentals in the ski rental business. The past rental history dataset shown below is used.

Year Month Day WeekDay Holiday Snow RentalCount
Year Month Day 曜Day 祝Dayフラグ Snowfall flag Number of rentals

** Software requirements: **

** Sample code: **

** Sample data: **

** Source: **

This article is based on Build a predictive model using Python and SQL Server ML Services.

Step 1. Environment construction

Build the environment to execute this sample.

1-1. Installation of SQL Server 2017

I tried using SQL Server 2017 In-Database Python for SQL Server 2017 Install Database Engine Services and Machine Learning Services (In-Database).

1-2. Restoring the sample DB

Sample DB [Backup File (TutorialDB.bak)](https://github.com/gho9o9/sql-server-samples/raw/master/samples/features/machine-learning-services/python/getting-started/rental -Download (prediction / TutorialDB.bak) and restore it by running Setup.sql. Setup.sql assumes that you have downloaded the backup file to C: \ Program Files \ Microsoft SQL Server \ MSSQL13.MSSQLSERVER \ MSSQL \ Backup. Change the path as appropriate according to your environment.

Setup.sql


USE master;
GO
RESTORE DATABASE TutorialDB
   FROM DISK = 'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Backup\TutorialDB.bak'
   WITH
   MOVE 'TutorialDB' TO 'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\TutorialDB.mdf'
   ,MOVE 'TutorialDB_log' TO 'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\TutorialDB.ldf';
GO

Check the data created by the restore.

T-SQL


USE TutorialDB;
SELECT * FROM [dbo].[rental_data];

1-3. Enabling the external script execution function

To run Python (and R) in SQL Server 2017, you need to change the setting of external scripts enabled in sp_configure. Also, the external scripts enabled parameter requires a restart of SQL Server 2017 for the setting changes to take effect.

T-SQL


EXEC sp_configure 'external scripts enabled', 1;

cmd


net stop "SQL Server Launchpad (MSSQLSERVER)"
net stop "SQL Server (MSSQLSERVER)"
net start "SQL Server (MSSQLSERVER)"
net start "SQL Server Launchpad (MSSQLSERVER)"

(*) Change the instance name according to the environment. Also, if there is a service that depends on the SQL Server service, such as the SQL Server Agent service, restart it explicitly.

Step 2. Creating a prediction model using Python

First, create a prediction model by Python in the Python IDE.

2-1. Importing the library

Import the required libraries.

rental_prediction.py


import os
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from revoscalepy import RxSqlServerData
from revoscalepy import rx_import

Imported library

Library Use
scikit-learn Used for machine learning
RevoScalePy UsedtoaccessSQLServer(Alsoavailableformachinelearning)

2-2. Loading data

Connect to SQL Server & get data and load it into panas dataframe.

rental_prediction.py


sql_server = os.getenv('PYTEST_SQL_SERVER', '.')
conn_str = 'Driver=SQL Server;Server=' + sql_server + ';Database=TutorialDB;Trusted_Connection=True;'
column_info = { 
        "Year" : { "type" : "integer" },
        "Month" : { "type" : "integer" }, 
        "Day" : { "type" : "integer" }, 
        "RentalCount" : { "type" : "integer" }, 
        "WeekDay" : { 
            "type" : "factor", 
            "levels" : ["1", "2", "3", "4", "5", "6", "7"]
        },
        "Holiday" : { 
            "type" : "factor", 
            "levels" : ["1", "0"]
        },
        "Snow" : { 
            "type" : "factor", 
            "levels" : ["1", "0"]
        }
    }

data_source = RxSqlServerData(sql_query="SELECT RentalCount, Year, Month, Day, WeekDay, Snow, Holiday FROM dbo.rental_data",
                              connection_string=conn_str, column_info=column_info)
    
# import data source and convert to pandas dataframe
df = pd.DataFrame(rx_import(data_source))
print("Data frame:", df)
# Get all the columns from the dataframe.
columns = df.columns.tolist()
# Filter the columns to remove ones we don't want.
columns = [c for c in columns if c not in ["Year"]]

(*) Change the connection destination server (variable: sql_server) according to the environment.

Results


Rows Processed: 453 
Data frame:      RentalCount  Year  Month  Day WeekDay Snow Holiday
0            445  2014      1   20       2    0       1
1             40  2014      2   13       5    0       0
2            456  2013      3   10       1    0       0
...
450           29  2015      3   24       3    1       0
451           50  2014      3   26       4    1       0
452          377  2015     12    6       1    1       0
[453 rows x 7 columns]

2-3. Model training

This sample uses a linear regression algorithm for model training. This model training is finding the function (model) that best describes the correlation of variables in the dataset.

rental_prediction.py


# Store the variable we'll be predicting on.
target = "RentalCount"
# Generate the training set.  Set random_state to be able to replicate results.
train = df.sample(frac=0.8, random_state=1)
# Select anything not in the training set and put it in the testing set.
test = df.loc[~df.index.isin(train.index)]
# Print the shapes of both sets.
print("Training set shape:", train.shape)
print("Testing set shape:", test.shape)
# Initialize the model class.
lin_model = LinearRegression()
# Fit the model to the training data.
lin_model.fit(train[columns], train[target])

Results


Training set shape: (362, 7)
Testing set shape: (91, 7)

2-4. Forecast

Predict with the model created earlier using the test data.

rental_prediction.py


# Generate our predictions for the test set.
lin_predictions = lin_model.predict(test[columns])
print("Predictions:", end="")
print(['{:.15f}'.format(n) for n in lin_predictions])
# Compute error between our test predictions and the actual values.
lin_mse = mean_squared_error(lin_predictions, test[target])
print("Computed error:", lin_mse)

Results


Predictions:['40.000000000000007', '38.000000000000007', '240.000000000000000', '39.000000000000000', '514.000000000000000', '48.000000000000007', '297.000000000000000', '24.999999999999993',
...
'432.000000000000000', '24.999999999999993', '39.000000000000007', '28.000000000000004', '325.000000000000000', '46.000000000000014', '36.000000000000014', '50.000000000000007', '63.000000000000007']
Computed error: 6.85182043392e-29

Step 3. Deploy and use the predictive model on SQL Server

SQL Server Machine Learning Services allows you to train and test predictive models in the context of SQL Server. Create a T-SQL program that contains embedded Python scripts that the SQL Server database engine handles. These Python codes run in SQL Server, which simplifies interaction with the data stored in the database.

3-1. Table definition

Define a table to store the model and forecast results.

rental_prediction.sql


-- 3-1.table definition
--Setup model table
DROP TABLE IF EXISTS rental_py_models;
GO
CREATE TABLE rental_py_models (
                model_name VARCHAR(30) NOT NULL DEFAULT('default model') PRIMARY KEY,
                model VARBINARY(MAX) NOT NULL
);
GO

--Create a table to store the predictions in
DROP TABLE IF EXISTS [dbo].[py_rental_predictions];
GO
CREATE TABLE [dbo].[py_rental_predictions](
	[RentalCount_Predicted] [int] NULL,
	[RentalCount_Actual] [int] NULL,
	[Month] [int] NULL,
	[Day] [int] NULL,
	[WeekDay] [int] NULL,
	[Snow] [int] NULL,
	[Holiday] [int] NULL,
	[Year] [int] NULL
) ON [PRIMARY]
GO

3-2. Modeling stored procedure definition

The model creation code created in Python is diverted and implemented in SQL Server. This stored procedure uses the data in SQL Server to create a linear regression model.

rental_prediction.sql


-- 3-2.Modeling stored procedure definition
-- Stored procedure that trains and generates an R model using the rental_data and a decision tree algorithm
DROP PROCEDURE IF EXISTS generate_rental_py_model;
go
CREATE PROCEDURE generate_rental_py_model (@trained_model varbinary(max) OUTPUT)
AS
BEGIN
    EXECUTE sp_execute_external_script
      @language = N'Python'
    , @script = N'

df = rental_train_data

# Get all the columns from the dataframe.
columns = df.columns.tolist()

# Store the variable well be predicting on.
target = "RentalCount"

from sklearn.linear_model import LinearRegression

# Initialize the model class.
lin_model = LinearRegression()

# Fit the model to the training data.
lin_model.fit(df[columns], df[target])

import pickle
#Before saving the model to the DB table, we need to convert it to a binary object
trained_model = pickle.dumps(lin_model)
'
    , @input_data_1 = N'select "RentalCount", "Year", "Month", "Day", "WeekDay", "Snow", "Holiday" from dbo.rental_data where Year < 2015'
    , @input_data_1_name = N'rental_train_data'
    , @params = N'@trained_model varbinary(max) OUTPUT'
    , @trained_model = @trained_model OUTPUT;
END;
GO

3-3. Model creation execution

Model Creation Registers the model created by executing the stored procedure in SQL Server as VARBINARY data.

rental_prediction.sql


-- 3-3.Model creation execution
TRUNCATE TABLE rental_py_models;

DECLARE @model VARBINARY(MAX);
EXEC generate_rental_py_model @model OUTPUT;

INSERT INTO rental_py_models (model_name, model) VALUES('linear_model', @model);

SELECT * FROM rental_py_models;

Results


model_name                     model
------------------------------ ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
linear_model                   0x800363736B6C6561726E2E6C696E6561725F6D6F64656C2E626173650A4C696E65617252656772657373696F6E0A7100298171017D71022858060000006E5F6A6F627371034B01580500000072616E6B5F71044B0758100000005F736B6C6561726E5F76657273696F6E71055806000000302E31382E31710658090000006E

(One line processed)

3-4. Predictive stored procedure definition

Predict the test data registered in SQL Server with the model registered in SQL Server.

rental_prediction.sql


-- 3-4.Predictive stored procedure definition
DROP PROCEDURE IF EXISTS py_predict_rentalcount;
GO
CREATE PROCEDURE py_predict_rentalcount (@model varchar(100))
AS
BEGIN
	DECLARE @py_model varbinary(max) = (select model from rental_py_models where model_name = @model);

	EXEC sp_execute_external_script 
					@language = N'Python'
				  , @script = N'

import pickle
rental_model = pickle.loads(py_model)

df = rental_score_data

# Get all the columns from the dataframe.
columns = df.columns.tolist()

# Store the variable well be predicting on.
target = "RentalCount"

# Generate our predictions for the test set.
lin_predictions = rental_model.predict(df[columns])

# Import the scikit-learn function to compute error.
from sklearn.metrics import mean_squared_error
# Compute error between our test predictions and the actual values.
lin_mse = mean_squared_error(lin_predictions, df[target])

import pandas as pd
predictions_df = pd.DataFrame(lin_predictions)  
OutputDataSet = pd.concat([predictions_df, df["RentalCount"], df["Month"], df["Day"], df["WeekDay"], df["Snow"], df["Holiday"], df["Year"]], axis=1)
'
	, @input_data_1 = N'Select "RentalCount", "Year" ,"Month", "Day", "WeekDay", "Snow", "Holiday"  from rental_data where Year = 2015'
	, @input_data_1_name = N'rental_score_data'
	, @params = N'@py_model varbinary(max)'
	, @py_model = @py_model
	with result sets (("RentalCount_Predicted" float, "RentalCount" float, "Month" float,"Day" float,"WeekDay" float,"Snow" float,"Holiday" float, "Year" float));
END;
GO

3-5. Prediction execution

Execute a predictive stored procedure to make a prediction and register the result in SQL Server.

rental_prediction.sql


--3-5.Predictive execution
TRUNCATE TABLE py_rental_predictions;
--Insert the results of the predictions for test set into a table
INSERT INTO py_rental_predictions
EXEC py_predict_rentalcount 'linear_model';
-- Select contents of the table
SELECT * FROM py_rental_predictions;

Results


RentalCount_Predicted RentalCount_Actual Month       Day         WeekDay     Snow        Holiday     Year
--------------------- ------------------ ----------- ----------- ----------- ----------- ----------- -----------
41                    42                 2           11          4           0           0           2015
360                   360                3           29          1           0           0           2015
19                    20                 4           22          4           0           0           2015
...
25                    26                 3           18          4           0           0           2015
28                    29                 3           24          3           1           0           2015
377                   377                12          6           1           1           0           2015

(151 lines processed)

Visualize the result registered in SQL Server with Power BI etc.

Recommended Posts

Deploy and use the prediction model created in Python on SQL Server
Deploy Python face recognition model on Heroku and use it from Flutter ②
Deploy Python face recognition model on Heroku and use it from Flutter ①
How to use the model learned in Lobe in Python
Install django on python + anaconda and start the server
Access the host SQL Server with python27 / pyodbc on the container
Sakura Use Python on the Internet
[In-Database Python Analysis Tutorial with SQL Server 2017] Step 6: Using the model
How to use is and == in Python
Use the LibreOffice app in Python (2) Manipulate calc (from macros and externals)
How to use the C library in Python
Run the task in the background on the sshed server
I want to use Python in the environment of pyenv + pipenv on Windows 10
Launch and use IPython notebook on the network
Use libsixel to output Sixel in Python and output a Matplotlib graph to the terminal.
About the difference between "==" and "is" in python
Use AppSync on the front and back ends
Solving the Lorenz 96 model with Julia and Python
Use the LibreOffice app in Python (3) Add library
Use python on Raspberry Pi 3 and turn on the LED when it gets dark!
[Python3] Take a screenshot of a web page on the server and crop it further
In SQLite3, add just two lines and use the extension library (extended SQL function)!
[CGI] Run the Python program on the server with Vue.js + axios and get the output data
Set up an FTP server that can be created and destroyed immediately (in Python)
Notes on how to use StatsModels that can use linear regression and GLM in python
It is easy to execute SQL with Python and output the result in Excel
The simplest Python memo in Japan (classes and objects)
[Azure] Deploy locally created ML model to the cloud
Carefully understand the exponential distribution and draw in Python
Dry-run sql query using psycopg2 on Redshift in Python
Include and use external Kv files in Python Kivy
Plot and understand the multivariate normal distribution in Python
Use the CASA Toolkit in your own Python environment
Carefully understand the Poisson distribution and draw in Python
% And str.format () in Python. Which one do you use?
Find the Hermitian matrix and its eigenvalues in Python
Run CGI written in python on Sakura's rental server
Ubuntu 20.04 on raspberry pi 4 with OpenCV and use with python
Let's use the open data of "Mamebus" in Python
Write a log-scale histogram on the x-axis in python
Execute the command on the web server and display the result
I want to use the R dataset in python
Use config.ini in Python
Use Valgrind in Python
Use profiler in Python
DNS server in Python ....
Use Python to monitor Windows and Mac and collect information on the apps you are working on
[C / C ++] Pass the value calculated in C / C ++ to a python function to execute the process, and use that value in C / C ++.
[Python] When you want to import and use your own package in the upper directory
[In-Database Python Analysis Tutorial with SQL Server 2017] Step 5: Training and saving a model using T-SQL
Put Ubuntu in Raspi, put Docker on it, and control GPIO with python from the container
Save the pystan model and results in a pickle file
Display "Hello World" created in the local environment on the web
Survey on the use of machine learning in real services
How to use the __call__ method in a Python class
[Hyperledger Iroha] Notes on how to use the Python SDK
Get the MIME type in Python and determine the file format
Sort and output the elements in the list as elements and multiples in Python.
How to deploy the easiest python textbook pybot on Heroku
Install pyenv on MacBook Air and switch python to use
Manipulate the clipboard in Python and paste the table into Excel