I've built and deployed models several times with ML Studio (classic). This time I went back to the official documentation to relearn it, but I didn't understand it well and went back and forth a lot. I eventually got it working, so I'm writing this article so I don't forget. Also, since this is written for first-time users, I've left out some details; please forgive me.
Python is used along the way, so here are the environment and the packages involved.
| name |
| --- |
| urllib |
| azure-storage-blob |
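For reference, a possible setup command is below. `urllib` is part of the Python standard library, so only the blob SDK needs installing. Note that `BlockBlobService`, used later in the retraining script, lives in the legacy 2.x series of `azure-storage-blob`, not in 12.x; the pinned version here is an assumption on my part.

```console
>pip install azure-storage-blob==2.1.0
```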
Let's start with the data used this time. Since I want to check whether the retraining worked, I'll use deliberately monotonous data. The training data and the retraining data are below (the test data uses the same format).
```csv:train1.csv
id,target
1,1
2,1
3,1
4,1
5,1
```

```csv:remodel0.csv
id,target
1,0
2,0
3,0
4,0
5,0
```
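If you would rather generate these files than type them by hand, here is a quick sketch (file names match the ones above):

```python
import csv

# train1.csv: target is always 1 / remodel0.csv: target is always 0
for name, label in [("train1.csv", 1), ("remodel0.csv", 0)]:
    with open(name, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "target"])
        for i in range(1, 6):
            writer.writerow([i, label])
```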
The trained model should always return 1 no matter what you feed it, so even the test data should all come back as 1. After retraining on data that is all 0, we expect 0 to be returned instead. It's not a good model at all, so it's only for testing.
Now let's create a model.
After logging in to ML Studio, select `EXPERIMENTS` and click `NEW` at the bottom of the screen.
Then click `Blank Experiment`.
Now you are ready to create a predictive model.
Let's actually build the model. Simply search for the required block in the search window and add it.
Since this is binary classification (for now), search for "two class" and place the resulting box. This time I used `Two-Class Boosted Decision Tree`.
Search for other blocks in the same way and arrange them as shown in the image.
Next, configure each block. Click a block and its settings appear on the right side of the page. Parameters can be set on the algorithm block; this time the defaults are fine.
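For reference, the defaults of `Two-Class Boosted Decision Tree` were roughly as follows when I used it (from memory, so double-check on your own screen):

```text
Maximum number of leaves per tree:       20
Minimum number of samples per leaf node: 10
Learning rate:                           0.2
Number of trees constructed:             100
```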
Next is `Import Data`. Here, specify the data to be used for training: enter your blob account name, key, and file path.
The data format is CSV, and since the file has a header row, check `File has header row`.
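Filled in, the `Import Data` settings look something like this (the field labels are approximate and the container and account names are placeholders for your own):

```text
Data source:                          Azure Blob Storage
Authentication type:                  Account
Account name:                         <your storage account>
Account key:                          <your storage key>
Path to container, directory or blob: mycontainer/train1.csv
Blob file format:                     CSV
File has header row:                  checked
```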
Finally, `Train Model`. Here we specify the column name of the target to learn.
This time I want to learn `target`, so I enter `target` with `Launch column selector`.
Once you've gotten this far, connect the blocks with lines.
Please note that the left and right input ports of `Train Model` are not interchangeable.
Once everything is arranged as below, click `RUN` at the bottom of the screen to execute it.
When every box has a check mark, you're done. If it stops partway, something is wrong; the blob file name may be incorrect (speaking from experience).
If all are checked, the model is complete!
Next, let's make something that returns an answer when you send it data, using the model we just created.
Click `Predictive Web Service` from `SET UP WEB SERVICE` next to `RUN`.
The boxes will then rearrange themselves as shown below. Bring in `Export Data` from the search window and fill in the blob account name and so on; the test results will be written to the path you enter here.
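The `Export Data` settings mirror those of `Import Data` (again, the field labels are approximate and the names are placeholders):

```text
Data destination:                      Azure Blob Storage
Authentication type:                   Account
Account name:                          <your storage account>
Account key:                           <your storage key>
Path to blob beginning with container: mycontainer/output.csv
File format for blob file:             CSV
Write blob header row:                 checked
```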
Let's `RUN` it again.
If it is checked as before, it is complete.
Click `DEPLOY WEB SERVICE` when it's done.
After a while, the screen changes and the API key and related information are displayed.
Clicking `REQUEST/RESPONSE` brings up the API documentation page, which shows how to use it.
You can also do a simple test on this page.
Let's experiment by clicking the blue `TEST` button and entering 3 for id and 1 for target.
After a while, the result will be returned at the bottom of the page.
The column name, column type, and value are returned in a list.
```json:return
Result: {"Results":{"output1":{"type":"table","value":{"ColumnNames":["id","target","Scored Labels","Scored Probabilities"],"ColumnTypes":["Int32","Int32","Int32","Double"],"Values":[["3","1","1","0.142857149243355"]]}}}}
```
`Scored Labels` is the prediction, and here it is 1, which is as expected.
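For reference, the same test can also be done in code rather than from the `TEST` button. Below is a minimal sketch of calling the Request/Response API from Python; the URL and key are placeholders for the values shown on your own web service page, and the payload format follows the sample on the API help page:

```python
import urllib.request
import json

url = "Request/Response API URL"  # placeholder: copy from the web service page
api_key = "API Key"               # placeholder: copy from the web service page

# Same input as the TEST button: id = 3, target = 1
payload = {
    "Inputs": {
        "input1": {
            "ColumnNames": ["id", "target"],
            "Values": [["3", "1"]]
        }
    },
    "GlobalParameters": {}
}

body = str.encode(json.dumps(payload))
headers = {"Content-Type": "application/json", "Authorization": "Bearer " + api_key}

req = urllib.request.Request(url, body, headers)
response = urllib.request.urlopen(req)
result = json.loads(response.read())

# The returned table's columns end with "Scored Labels" and "Scored Probabilities"
print(result["Results"]["output1"]["value"]["Values"][0])
```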
Now click `View latest` to return to the previous screen.
Then go to the `Training experiment` tab, add `Web service input` and `Web service output`, and `RUN` it.
Then click `Deploy Web Service` from `SET UP WEB SERVICE` below.
Deployment is now complete.
You will need the API Key and API URL later.
The API Key is the one shown on the screen. The URL appears when you click `BATCH EXECUTION`.
The API URL to use is the part before `?api-version…`, that is, the URL ending with `jobs`.
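Concretely, a batch execution URL looks something like the following (the region and IDs are placeholders); keep only the part up to `jobs`:

```text
shown   : https://<region>.services.azureml.net/workspaces/<workspace-id>/services/<service-id>/jobs?api-version=2.0
use this: https://<region>.services.azureml.net/workspaces/<workspace-id>/services/<service-id>/jobs
```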
I stumbled here.
Now, let's retrain. I did the retraining itself in Python. A C# sample is also provided, but the Python sample may not work without changes in a few places (at least that was the case with Python 3.7).
First, the program you actually use.
```python:retrain.py
# Python 3.7: use urllib.request instead of urllib2 (the official sample targets Python 2)
import urllib
import urllib.request
import json
import time
from azure.storage.blob import *


def printHttpError(httpError):
    print(f"The request failed with status code: {str(httpError.code)}")
    print(json.loads(httpError.read()))
    return


def processResults(result):
    results = result["Results"]
    for outputName in results:
        result_blob_location = results[outputName]
        sas_token = result_blob_location["SasBlobToken"]
        base_url = result_blob_location["BaseLocation"]
        relative_url = result_blob_location["RelativeLocation"]
        print(f"The results for {outputName} are available at the following Azure Storage location:")
        print(f"BaseLocation: {base_url}")
        print(f"RelativeLocation: {relative_url}")
        print(f"SasBlobToken: {sas_token}")
    return


def uploadFileToBlob(input_file, input_blob_name, storage_container_name, storage_account_name, storage_account_key):
    # BlobService from the sample no longer exists, so use BlockBlobService instead
    blob_service = BlockBlobService(account_name=storage_account_name, account_key=storage_account_key)
    print("Uploading the input to blob storage...")
    blob_service.create_blob_from_path(storage_container_name, input_blob_name, input_file)


def invokeBatchExecutionService():
    storage_account_name = "blob account name"
    storage_account_key = "blob key"
    storage_container_name = "blob container name"
    connection_string = f"DefaultEndpointsProtocol=https;AccountName={storage_account_name};AccountKey={storage_account_key}"
    api_key = "retraining API key"
    url = "API URL"

    uploadFileToBlob("local path of the file to upload",
                     "blob path after upload",
                     storage_container_name, storage_account_name, storage_account_key)

    payload = {
        "Inputs": {
            "input1": {
                "ConnectionString": connection_string,
                "RelativeLocation": f"/{storage_container_name}/blob path of the retraining data"
            },
        },
        "Outputs": {
            "output1": {
                "ConnectionString": connection_string,
                "RelativeLocation": f"/{storage_container_name}/output blob path for the retrained model.ilearner"
            },
        },
        "GlobalParameters": {
        }
    }

    body = str.encode(json.dumps(payload))
    headers = {"Content-Type": "application/json", "Authorization": ("Bearer " + api_key)}
    print("Submitting the job...")
    req = urllib.request.Request(url + "?api-version=2.0", body, headers)
    response = urllib.request.urlopen(req)
    result = response.read()
    job_id = result[1:-1]  # strip the surrounding quotes from the returned JSON string
    # job_id is bytes, not str, which caused an error, so decode it
    job_id = job_id.decode('utf-8')
    print(f"Job ID: {job_id}")

    print("Starting the job...")
    headers = {"Authorization": ("Bearer " + api_key)}
    req = urllib.request.Request(f"{url}/{job_id}/start?api-version=2.0", headers=headers, method="POST")
    response = urllib.request.urlopen(req)

    url2 = url + "/" + job_id + "?api-version=2.0"
    while True:
        print("Checking the job status...")
        req = urllib.request.Request(url2, headers={"Authorization": ("Bearer " + api_key)})
        response = urllib.request.urlopen(req)
        result = json.loads(response.read())
        status = result["StatusCode"]
        if (status == 0 or status == "NotStarted"):
            print(f"Job: {job_id} not yet started...")
        elif (status == 1 or status == "Running"):
            print(f"Job: {job_id} running...")
        elif (status == 2 or status == "Failed"):
            print(f"Job: {job_id} failed!")
            print("Error details: " + result["Details"])
            break
        elif (status == 3 or status == "Cancelled"):
            print(f"Job: {job_id} cancelled!")
            break
        elif (status == 4 or status == "Finished"):
            print(f"Job: {job_id} finished!")
            processResults(result)
            break
        time.sleep(1)  # wait one second
    return


invokeBatchExecutionService()
```
It's almost the same as the official sample, but with a few changes (noted in the comments). Rewrite the URL, key, paths, and so on for your environment, then prepare the data and run it.
```console
>python retrain.py
Uploading the input to blob storage...
Submitting the job...
Job ID: ID
Starting the job...
Checking the job status...
JobID not yet started...
Checking the job status...
JobID running...
Checking the job status...
JobID running...
Checking the job status...
JobID running...
Checking the job status...
JobID running...
Checking the job status...
JobID running...
Checking the job status...
JobID running...
Checking the job status...
JobID running...
Checking the job status...
JobID finished!
The results for output1 are available at the following Azure Storage location:
BaseLocation: URL
RelativeLocation: PATH
SasBlobToken: KEY
```
You should get output like the above. The last three values (BaseLocation, RelativeLocation, SasBlobToken) will be used later.
Now, the retraining job has run, but it is not yet reflected in the deployed web service.
Open `New Web Services Experience` from this screen.
At this point, if `[Predictive Exp.]` is not appended to the model name at the top of the page, click `Experiment created on … [Predictive Exp.]` to switch over.
You could probably overwrite the existing endpoint, but most of the time I keep it, so let's create a new endpoint.
Press the left button on this screen…
(You lose if you let the messy arrows bother you.)
Click `+ NEW` and save the endpoint under the name you want to use.
Clicking the created endpoint name takes you to a screen like the one above, so open the `Consume` tab.
When you open it, various keys and URLs are displayed.
This time we will use `Primary Key` and `Patch`.
Use these to run the following code.
```python:update.py
import urllib
import urllib.request
import json

data = {
    "Resources": [
        {
            "Name": "Model name",
            "Location": {
                "BaseLocation": "Result URL",
                "RelativeLocation": "Result PATH",
                "SasBlobToken": "Result KEY"
            }
        }
    ]
}

body = str.encode(json.dumps(data))
url = "Patch value"
api_key = "Primary Key value"  # Replace this with the API key for the web service
headers = {'Content-Type': 'application/json', 'Authorization': ('Bearer ' + api_key)}

req = urllib.request.Request(url, body, headers)
req.get_method = lambda: 'PATCH'
response = urllib.request.urlopen(req)
result = response.read()
print(result)
```
That's the whole script. If you open `API help` under `Patch` on the previous page, there is a sample, so you can mostly write it as-is. Let's run it.
`b''` is returned.
Huh?
But this seems to be fine: apparently a successful update just returns an empty body.
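If you want explicit confirmation rather than staring at `b''`, you can check the HTTP status code instead; a small sketch, meant to be appended to the end of update.py (it reuses the `response` object from there):

```python
# An empty body with a 2xx status means the PATCH was accepted
status = response.getcode()
print(status)
if 200 <= status < 300:
    print("Endpoint updated successfully")
```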
Finally, deploy this. The deployment procedure is the same as before.
Let's test by opening the `Test` tab at the top of the page.
(You may need to reload.)
With id = 3 and target = 1, the result now comes back as 0, as intended.
What a relief…
I managed to deploy a retrained model. If you suspect the accuracy of a model you're using is degrading, retrain it and keep it going!