SageMaker is a service that covers a complete set of machine learning workloads. Using data stored in S3 and elsewhere, it provides all the functions a machine learning project needs, such as model development with Jupyter notebooks, code management with Git repositories, creation of training jobs, and hosting of inference endpoints.
I read "Deploying a PyTorch model for large-scale inference using TorchServe" on the Amazon Web Services blog and tried hosting the model using Amazon SageMaker. Below, we will introduce the procedure and the story around it.
Please see this article for the transformation part of the model.
First, create a bucket in S3. This time I created a bucket named torchserve-model. The region is "Asia Pacific (Tokyo)", and everything except the name is left at the default.
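For reference, the same bucket can also be created with boto3 instead of the console; this is just a minimal sketch using the bucket name and region above.
import boto3

# Create the bucket in the Tokyo region (outside us-east-1, a LocationConstraint is required)
s3 = boto3.client('s3', region_name='ap-northeast-1')
s3.create_bucket(
    Bucket='torchserve-model',
    CreateBucketConfiguration={'LocationConstraint': 'ap-northeast-1'}
)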
When you open the Amazon SageMaker console, you'll see a menu in the left pane.
Select Notebook Instance from the Notebook menu and click Create Notebook Instance. Set the following items for instance settings, and set the others as default.
- Notebook instance settings
  - Notebook instance name: sagemaker-sample
- Permissions and encryption
  - IAM Role: Create a new role
On the IAM role creation screen, specify the S3 bucket you created earlier.
After entering the settings, click Create notebook instance. You will be returned to the notebook instance list, so click the name of the created instance to open its details screen. From the IAM role ARN link, open the IAM screen, click "Attach policies", and attach the "AmazonEC2ContainerRegistryFullAccess" policy. This is the policy you will need later to work with ECR.
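If you prefer to attach the policy with code rather than in the console, a rough boto3 sketch looks like this (the role name is the one created above; substitute your own execution role name):
import boto3

iam = boto3.client('iam')
# Attach the ECR full-access policy to the notebook execution role
iam.attach_role_policy(
    RoleName='AmazonSageMaker-ExecutionRole-20200716T140377',  # replace with your execution role name
    PolicyArn='arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess'
)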
When the status becomes In service, start JupyterLab with "Open JupyterLab".
First, start a Terminal from the Other section of the Launcher.
sh-4.2$ ls
anaconda3 Nvidia_Cloud_EULA.pdf sample-notebooks tools
examples README sample-notebooks-1594876987 tutorials
LICENSE SageMaker src
sh-4.2$ ls SageMaker/
lost+found
The explorer on the left side of the screen displays the files under SageMaker/.
Git is also installed.
sh-4.2$ git --version
git version 2.14.5
In the following we will create a notebook and host the model, but you can do the same with the tutorial notebook. You can clone the sample code into SageMaker/ as follows.
sh-4.2$ cd SageMaker
sh-4.2$ git clone https://github.com/shashankprasanna/torchserve-examples.git
All the steps are described in deploy_torchserve.ipynb. When you open the notebook, you will be asked which Python kernel to use, so select conda_pytorch_p36.
First, create a new folder from the folder button in the left pane, and double-click to enter the created folder. Then create a notebook.
Select the notebook with conda_pytorch_p36. Rename the notebook to deploy_torchserve.ipynb.
In a cell, install the library that converts the PyTorch model into the format for deployment.
deploy_torchserve.ipynb
!git clone https://github.com/pytorch/serve.git
!pip install serve/model-archiver/
This time we will host the densenet161 model. Download the trained weights file. The sample model class is included in the repository cloned earlier, so use the weights file and the class to convert the model into the hosting format.
deploy_torchserve.ipynb
!wget -q https://download.pytorch.org/models/densenet161-8d451a50.pth
deploy_torchserve.ipynb
model_file_name = 'densenet161'
!torch-model-archiver --model-name {model_file_name} \
--version 1.0 --model-file serve/examples/image_classifier/densenet_161/model.py \
--serialized-file densenet161-8d451a50.pth \
--extra-files serve/examples/image_classifier/index_to_name.json \
--handler image_classifier
When executed, densenet161.mar will be output to the current directory.
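As an optional sanity check (not in the original steps), you can confirm from a cell that the archive was created:
import os

# densenet161.mar should appear in the current directory after archiving
print([f for f in os.listdir('.') if f.endswith('.mar')])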
Store the created file in S3.
deploy_torchserve.ipynb
#Create a boto3 session to get region and account information
import boto3, time, json
sess = boto3.Session()
sm = sess.client('sagemaker')
region = sess.region_name
account = boto3.client('sts').get_caller_identity().get('Account')
import sagemaker
role = sagemaker.get_execution_role()
sagemaker_session = sagemaker.Session(boto_session=sess)
#By the way, the contents are as follows.
# print(region, account, role)
# ap-northeast-1
# xxxxxxxxxxxx
# arn:aws:iam::xxxxxxxxxxxx:role/service-role/AmazonSageMaker-ExecutionRole-20200716T140377
deploy_torchserve.ipynb
#Specify the Amazon SageMaker S3 bucket name
bucket_name = 'torchserve-model'
prefix = 'torchserve'
# print(bucket_name, prefix)
# torchserve-model torchserve
deploy_torchserve.ipynb
#Amazon SageMaker assumes the model is packaged as a tar.gz file, so create a tar.gz file by compressing the densenet161.mar file.
!tar cvfz {model_file_name}.tar.gz densenet161.mar
deploy_torchserve.ipynb
#Upload your model to an S3 bucket under your model's directory.
!aws s3 cp {model_file_name}.tar.gz s3://{bucket_name}/{prefix}/models/
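Instead of the AWS CLI, the upload can also be done with the sagemaker_session created above; a minimal sketch using the same bucket and prefix:
#Upload the tar.gz with the SageMaker SDK session instead of the CLI; returns the S3 URI of the uploaded object
model_artifact = sagemaker_session.upload_data(
    path=f'{model_file_name}.tar.gz',
    bucket=bucket_name,
    key_prefix=f'{prefix}/models')
print(model_artifact)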
Then create the container registry with ECR.
deploy_torchserve.ipynb
registry_name = 'torchserve'
!aws ecr create-repository --repository-name torchserve
# {
# "repository": {
# "repositoryArn": "arn:aws:ecr:ap-northeast-1:xxxxxxxxxxxx:repository/torchserve",
# "registryId": "xxxxxxxxxxxx:repository",
# "repositoryName": "torchserve",
# "repositoryUri": "xxxxxxxxxxxx:repository.dkr.ecr.ap-northeast-1.amazonaws.com/torchserve",
# "createdAt": 1594893256.0,
# "imageTagMutability": "MUTABLE",
# "imageScanningConfiguration": {
# "scanOnPush": false
# }
# }
# }
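The same repository could also be created with boto3 instead of the CLI; a sketch (it raises an error if the repository already exists):
ecr = boto3.client('ecr', region_name=region)
#Equivalent to the "aws ecr create-repository" call above
response = ecr.create_repository(repositoryName=registry_name)
print(response['repository']['repositoryUri'])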
Leaving the notebook for a moment, click the "+" button in the left pane and select "Text File" from the Launcher to create a Dockerfile.
Dockerfile
FROM ubuntu:18.04
ENV PYTHONUNBUFFERED TRUE
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y \
fakeroot \
ca-certificates \
dpkg-dev \
g++ \
python3-dev \
openjdk-11-jdk \
curl \
vim \
&& rm -rf /var/lib/apt/lists/* \
&& cd /tmp \
&& curl -O https://bootstrap.pypa.io/get-pip.py \
&& python3 get-pip.py
RUN update-alternatives --install /usr/bin/python python /usr/bin/python3 1
RUN update-alternatives --install /usr/local/bin/pip pip /usr/local/bin/pip3 1
RUN pip install --no-cache-dir psutil \
--no-cache-dir torch \
--no-cache-dir torchvision
ADD serve serve
RUN pip install ../serve/
COPY dockerd-entrypoint.sh /usr/local/bin/dockerd-entrypoint.sh
RUN chmod +x /usr/local/bin/dockerd-entrypoint.sh
RUN mkdir -p /home/model-server/ && mkdir -p /home/model-server/tmp
COPY config.properties /home/model-server/config.properties
WORKDIR /home/model-server
ENV TEMP=/home/model-server/tmp
ENTRYPOINT ["/usr/local/bin/dockerd-entrypoint.sh"]
CMD ["serve"]
The settings in the Dockerfile are as follows.
- `PYTHONUNBUFFERED TRUE` prevents stdout and stderr from being buffered.
- Setting `DEBIAN_FRONTEND=noninteractive` suppresses interactive prompts during package installation.
- `--no-install-recommends` skips recommended packages that are not required.
- `update-alternatives` [changes the priority](https://codechacha.com/en/change-python-version/) of the python and pip commands to use.
Create dockerd-entrypoint.sh and config.properties in the same way.
dockerd-entrypoint.sh
#!/bin/bash
set -e
if [[ "$1" = "serve" ]]; then
shift 1
printenv
ls /opt
torchserve --start --ts-config /home/model-server/config.properties
else
eval "$@"
fi
# prevent docker exit
tail -f /dev/null
The shell script does the following:
- `set -e`: if a command fails, the shell script stops there.
- `$1`: the first argument.
- `shift 1`: shifts the arguments, so the remaining arguments can be passed to the next command as if they were given from the beginning.
- `printenv`: prints the environment variables. (This output goes to the CloudWatch logs introduced later.)
- `eval "$@"`: expands the arguments as a command and executes it. Used when a command other than serve is given.
- `tail -f /dev/null`: a dummy command to keep the container running.
config.properties
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
number_of_netty_threads=32
job_queue_size=1000
model_store=/opt/ml/model
load_models=all
Some supplementary notes about the settings. See the TorchServe configuration documentation for more information.
- `number_of_netty_threads`: the total number of frontend threads; defaults to the number of logical processors available to the JVM.
- `job_queue_size`: the number of inference jobs the frontend queues before the backend serves them; defaults to 100.
- `model_store`: the model storage location. (With SageMaker, the model is placed from S3 into /opt/ml/model/.)
- `load_models`: same effect as the `--models` startup option; specifies the models to deploy. With `all`, every model stored in `model_store` is deployed.
Build the container image and push it to the registry. `v1` is the image tag, and `image` is the image name including the tag. When using ECR, the image name must follow the rule `<registry name>/<image name>:<tag>`, where `<registry name>/<image name>` corresponds to the `repositoryUri` returned when the registry was created.
The build took about 15 minutes.
deploy_torchserve.ipynb
image_label = 'v1'
image = f'{account}.dkr.ecr.{region}.amazonaws.com/{registry_name}:{image_label}'
# print(image_label, image)
# v1 xxxxxxxxxxxx.dkr.ecr.ap-northeast-1.amazonaws.com/torchserve:v1
deploy_torchserve.ipynb
!docker build -t {registry_name}:{image_label} .
!$(aws ecr get-login --no-include-email --region {region})
!docker tag {registry_name}:{image_label} {image}
!docker push {image}
# Sending build context to Docker daemon 399.7MB
# Step 1/16 : FROM ubuntu:18.04
# 18.04: Pulling from library/ubuntu
# 5296b23d: Pulling fs layer
# 2a4a0f38: Pulling fs layer
# ...
# 9d6bc5ec: Preparing
# 0faa4f76: Pushed 1.503GB/1.499GB
# v1: digest: sha256:bb75ec50d8b0eaeea67f24ce072bce8b70262b99a826e808c35882619d093b4e size: 3247
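To check that the push succeeded, you can also list the images now stored in the repository with boto3; a small sketch (the tag should include v1):
ecr = boto3.client('ecr', region_name=region)
#List the image tags stored in the torchserve repository
images = ecr.describe_images(repositoryName=registry_name)
print([detail.get('imageTags') for detail in images['imageDetails']])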
It's finally time to host the inference endpoint. Create a model to deploy with the following code.
deploy_torchserve.ipynb
import sagemaker
from sagemaker.model import Model
from sagemaker.predictor import RealTimePredictor
role = sagemaker.get_execution_role()
model_data = f's3://{bucket_name}/{prefix}/models/{model_file_name}.tar.gz'
sm_model_name = 'torchserve-densenet161'
torchserve_model = Model(model_data = model_data,
image = image,
role = role,
predictor_cls=RealTimePredictor,
name = sm_model_name)
Deploy the endpoint with the following code. It took about 5 minutes to deploy.
deploy_torchserve.ipynb
endpoint_name = 'torchserve-endpoint-' + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
predictor = torchserve_model.deploy(instance_type='ml.m4.xlarge',
initial_instance_count=1,
endpoint_name = endpoint_name)
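While waiting, you can also poll the endpoint status from the notebook with the sm client created earlier; a minimal sketch (the status should move from Creating to InService):
#Poll the endpoint status until it leaves the Creating state
status = sm.describe_endpoint(EndpointName=endpoint_name)['EndpointStatus']
while status == 'Creating':
    time.sleep(30)
    status = sm.describe_endpoint(EndpointName=endpoint_name)['EndpointStatus']
print(status)  # InService if the deployment succeeded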
You can see the progress of the deployment in the CloudWatch logs. Open the CloudWatch console, click Log groups in the left pane, and type /aws/sagemaker/Endpoints in the search bar to see the list of endpoint log groups.
You can see the deployment log by clicking a log group to open its details screen and checking the log streams.
If the deployment does not succeed, an error should be output here. Note that when an error occurs, SageMaker keeps retrying the deployment for about an hour, so if you think something is wrong, check the logs as soon as possible.
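If you do not want to open the console, the same logs can be pulled from the notebook with boto3; a rough sketch, assuming the log group follows the /aws/sagemaker/Endpoints/<endpoint name> convention:
logs = boto3.client('logs', region_name=region)
group = f'/aws/sagemaker/Endpoints/{endpoint_name}'
#Fetch the most recent log stream of the endpoint and print its last events
streams = logs.describe_log_streams(logGroupName=group, orderBy='LastEventTime', descending=True)
stream_name = streams['logStreams'][0]['logStreamName']
for event in logs.get_log_events(logGroupName=group, logStreamName=stream_name)['events'][-20:]:
    print(event['message'])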
Make a request to see if it's working properly.
deploy_torchserve.ipynb
!wget -q https://s3.amazonaws.com/model-server/inputs/kitten.jpg
file_name = 'kitten.jpg'
with open(file_name, 'rb') as f:
payload = f.read()
payload = payload
response = predictor.predict(data=payload)
print(*json.loads(response), sep = '\n')
# {'tiger_cat': 0.4693359136581421}
# {'tabby': 0.4633873701095581}
# {'Egyptian_cat': 0.06456154584884644}
# {'lynx': 0.001282821292988956}
# {'plastic_bag': 0.00023323031200561672}
If you have the predictor instance, you can make a request as above, but to make a request from outside you need the SDK. Open a Python interactive shell on an external PC and make a request using boto3.
$ wget -q https://s3.amazonaws.com/model-server/inputs/kitten.jpg
$ python
>>> import json
>>> import boto3
>>> endpoint_name = 'torchserve-endpoint-2020-07-16-13-16-12'
>>> file_name = 'kitten.jpg'
>>> with open(file_name, 'rb') as f:
... payload = f.read()
... payload = payload
>>> client = boto3.client('runtime.sagemaker',
aws_access_key_id='XXXXXXXXXXXXXXXXXXXX',
aws_secret_access_key='XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',
region_name='ap-northeast-1')
>>> response = client.invoke_endpoint(EndpointName=endpoint_name,
... ContentType='application/x-image',
... Body=payload)
>>> print(*json.loads(response['Body'].read()), sep = '\n')
{'tiger_cat': 0.4693359136581421}
{'tabby': 0.4633873701095581}
{'Egyptian_cat': 0.06456154584884644}
{'lynx': 0.001282821292988956}
{'plastic_bag': 0.00023323031200561672}
I was able to confirm that the response was returned correctly.
You can also check the deployed model, deployment settings, and endpoint information from the console.
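The same information can also be retrieved with the SageMaker client instead of the console; a small sketch using the names defined above:
#The deployed model and endpoint can also be inspected via the API
print(sm.describe_model(ModelName=sm_model_name)['PrimaryContainer']['Image'])
print(sm.describe_endpoint(EndpointName=endpoint_name)['EndpointStatus'])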
How was it? SageMaker is very convenient. If you just want to host some inference on the backend, it should make things a lot easier. If you want to customize the interface, more flexible customization seems possible, but since TorchServe can also be served outside SageMaker (see the previous article), it seems better to develop models in the TorchServe format so they can be reused on AWS.