Introduction

Hello, this is Yabuki from NTT DoCoMo. My team builds a system on AWS that stream-processes large-scale data from DoCoMo, so even the development environment alone costs a fair amount. Starting and stopping servers and databases frequently to save money, however, is tedious and easy to forget. It would be convenient to control them through Alexa, as in **"Alexa, stop the database in the development environment"**. As a bonus, if this saves money, you can earn some goodwill from a boss who is holding his head and groaning, "Costs have been far higher than expected lately, this is bad...".

So this time, as an amateur who has never touched Lambda or the Alexa Skills Kit, and partly as a learning exercise, I will work on **developing a skill that starts and stops AWS resources through Alexa**. Objections such as "Just write a cron job that starts things at the beginning of the workday and stops them at the end" or "Develop serverless" are perfectly reasonable, but I will not be accepting them here.

Goal

"Alexa, start a web server in the development environment" "Alexa, shut down the database in the development environment" With such a feeling, AWS resource start / stop control is performed only by voice.

Things to prepare

- Amazon Developer Account
- AWS Account

Target audience

- People who want a rough idea of the flow of Alexa skill development
- People who want to control AWS resources (start / stop EC2 and RDS) with Alexa

Referenced materials

I relied heavily on the following materials.

- Alexa Skills Kit (ASK) documentation
- Alexa Skills Kit SDK for Python
- Grasp the basic concepts needed to develop skills for Amazon Echo (Alexa)
- Implement Alexa skills in Python/Lambda

Implementation

To create an Alexa skill, you need to build a voice input interface and implement a backend that processes each request according to its content. The interface is built through the web UI of the Alexa Developer Console. The backend will be implemented in Python and run on Lambda.

Creating an interface

First, let's create the interface in the Alexa Developer Console. This time I want a Japanese skill hosted on Lambda in my own AWS account, so I made the selections shown in the figure below and created the skill.

[Screenshot: 2020-12-14 23.33.22.png]

Select Scratch for the template.

[Screenshot: 2020-12-14 23.42.11.png]

Now that the basic template has been created, set the invocation name, Intents, Slots, and so on.

Invocation Name

Set the Invocation Name, the keyword that makes Alexa respond with the skill you created. I set "development environment" as the keyword because I want to call the skill with phrases like "stop the web server in the development environment".

[Screenshot: 2020-12-14 23.46.49.png]

Intent

Next, create an Intent. The glossary in the documentation (https://developer.amazon.com/ja-JP/docs/alexa/ask-overviews/alexa-skills-kit-glossary.html#i) defines what an Intent is.

The definition is a little hard to follow, but after trying things out myself, I came to understand an Intent as **a feature that recognizes voice requests made for a particular purpose**. For an Intent to recognize voice requests correctly, you need to define sample utterances. Utterances such as "stop the web server" and "start the database" come to mind, but it is hard to enumerate them all. It would therefore be nice to give the sample utterance arguments, like this:

{action} the {resource}

Such an argument is called a Slot. This time, I created an Intent named ResourceControl, as shown in the figure below. Registering many plausible variations of the sample utterances improves the recognition rate.

[Screenshot: 2020-12-15 9.46.20.png]

Slot

Next, create the Slots described earlier. First, go to the Slot Types tab and define `resource` as follows. In the backend implementation later, these values are mapped to resource IDs so that each resource can be controlled.

[Screenshot: 2020-12-15 10.15.18.png]

Then define `action`. Since I want to start and stop resources this time, I defined it as follows. Registering synonyms as well makes the skill more flexible.

[Screenshot: 2020-12-15 10.16.04.png]

Then, go back to the Intent tab again and associate the Slot Type you just defined with the Intent Slot.

This completes the interface implementation. Click the Save Model and Build Model buttons at the top of the page to save and build the model. It's very easy.
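
The model built in the console can also be expressed as JSON. The following is only a rough sketch of what such a model might look like, written as a Python dict; the sample utterances, synonyms, and slot type names are assumptions made for illustration and are not copied from the actual skill. The small helper checks that every `{placeholder}` used in a sample utterance is declared as a Slot of its Intent.

```python
import json
import re

# Hypothetical values: utterances, synonyms and type names are illustrative only.
interaction_model = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "development environment",
            "intents": [
                {
                    "name": "ResourceControl",
                    "slots": [
                        {"name": "resource", "type": "resource"},
                        {"name": "action", "type": "action"},
                    ],
                    "samples": [
                        "{action} the {resource}",
                        "please {action} the {resource}",
                    ],
                },
                {"name": "AMAZON.HelpIntent", "samples": []},
                {"name": "AMAZON.CancelIntent", "samples": []},
                {"name": "AMAZON.StopIntent", "samples": []},
            ],
            "types": [
                {
                    "name": "resource",
                    "values": [
                        {"name": {"value": "Web server"}},
                        {"name": {"value": "api server"}},
                        {"name": {"value": "Database"}},
                    ],
                },
                {
                    "name": "action",
                    "values": [
                        {"name": {"value": "Start-up", "synonyms": ["launch", "boot"]}},
                        {"name": {"value": "Stop", "synonyms": ["shut down"]}},
                    ],
                },
            ],
        }
    }
}


def undefined_placeholders(model: dict) -> set:
    """Return placeholders used in sample utterances but not declared as slots."""
    missing = set()
    for intent in model["interactionModel"]["languageModel"]["intents"]:
        declared = {slot["name"] for slot in intent.get("slots", [])}
        for sample in intent.get("samples", []):
            missing |= set(re.findall(r"\{(\w+)\}", sample)) - declared
    return missing


if __name__ == "__main__":
    print(json.dumps(interaction_model, indent=2, ensure_ascii=False))
    print("undefined placeholders:", undefined_placeholders(interaction_model))
```

Keeping the slot values identical to the strings the backend compares against (for example `Database` and `Stop`) is what lets the handler branch on them later.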

Backend implementation

Next, we will implement the backend. This time around, I'd like to try the recently announced Lambda container image support. The folder structure is as follows.

alexa/
├── Dockerfile
├── app.py
└── resource.json

I use the Python base image for Lambda provided by AWS and additionally install only the library for Alexa skill development (ask-sdk). The CMD line names the handler so that it is called once the container starts.

`Dockerfile`


FROM public.ecr.aws/lambda/python:3.8

RUN pip3 install --upgrade pip && \
    pip3 install ask-sdk==1.15.0

COPY app.py resource.json ./

CMD ["app.handler"]

Enter the names and IDs of the resources you want to control, as shown below. Look up each resource's ID in the AWS console or elsewhere and fill it in. The logic part reads this file.

`resource.json`


{
  "Web server": "your_web_server_id" ,
  "api server": "your_api_server_id" ,
  "Database": "your_db_cluster_id"
}
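
As a side note, the lookup against this file could be made a little more forgiving. The following is a minimal sketch, not the code used in this article: `load_resources` and `find_resource_id` are hypothetical helper names, and the idea is simply to read the file once and return `None` for an unknown name instead of raising `KeyError`.

```python
import json
from functools import lru_cache
from typing import Optional


@lru_cache(maxsize=None)
def load_resources(path: str = "resource.json") -> dict:
    """Read the name -> ID mapping once and reuse it on later calls."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def find_resource_id(resource_name: Optional[str],
                     path: str = "resource.json") -> Optional[str]:
    """Return the ID for a spoken resource name, or None when it is unknown."""
    if resource_name is None:
        # The slot was not filled, so there is nothing to look up.
        return None
    return load_resources(path).get(resource_name)
```

With this shape, the handler can ask the user to repeat the request when the result is `None` rather than relying on the catch-all exception handler.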

Next is the logic part, which is based on code copied from the official documentation. The flow is to implement, for each kind of request, the processing to run and the words Alexa should speak: LaunchRequest (a request containing only the invocation name), IntentRequest (a request carrying an Intent, such as the custom Intent defined earlier or the built-in AMAZON.CancelIntent and AMAZON.StopIntent), SessionEndedRequest (a request that ends the conversation), and so on.
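
To make the request types concrete before looking at the real code, here is a simplified sketch in plain Python, without the SDK. It assumes a stripped-down request envelope (real requests carry many more fields) and mimics the idea that handlers are tried in registration order and the first one whose `can_handle` returns True wins. The names `HANDLERS` and `dispatch` and the returned strings are made up for illustration.

```python
from typing import Callable, List, Optional, Tuple


def request_type(envelope: dict) -> str:
    """Return the request type, e.g. LaunchRequest or IntentRequest."""
    return envelope["request"]["type"]


def intent_name(envelope: dict) -> Optional[str]:
    """Return the intent name for an IntentRequest, otherwise None."""
    if request_type(envelope) != "IntentRequest":
        return None
    return envelope["request"]["intent"]["name"]


# (can_handle, handle) pairs, checked in registration order.
HANDLERS: List[Tuple[Callable[[dict], bool], Callable[[dict], str]]] = [
    (lambda e: request_type(e) == "LaunchRequest",
     lambda e: "ask what to do"),
    (lambda e: intent_name(e) == "ResourceControl",
     lambda e: "start or stop a resource"),
    (lambda e: intent_name(e) in ("AMAZON.CancelIntent", "AMAZON.StopIntent"),
     lambda e: "say goodbye"),
    (lambda e: request_type(e) == "SessionEndedRequest",
     lambda e: "clean up"),
]


def dispatch(envelope: dict) -> str:
    """Run the first handler that accepts the request."""
    for can_handle, handle in HANDLERS:
        if can_handle(envelope):
            return handle(envelope)
    raise LookupError("no handler accepted the request")


# A simplified IntentRequest with both slots filled.
sample_request = {
    "version": "1.0",
    "request": {
        "type": "IntentRequest",
        "intent": {
            "name": "ResourceControl",
            "slots": {
                "resource": {"name": "resource", "value": "Database"},
                "action": {"name": "action", "value": "Stop"},
            },
        },
    },
}

if __name__ == "__main__":
    print(dispatch(sample_request))
```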

`app.py`


import json
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.dispatch_components import AbstractExceptionHandler
from ask_sdk_core.handler_input import HandlerInput
from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.utils import get_slot_value_v2, is_intent_name, is_request_type
from ask_sdk_model import Response
from ask_sdk_model.ui import SimpleCard
import boto3


sb = SkillBuilder()


def get_resource_id(resource_name):
    with open('resource.json') as f:
        resource_list = json.load(f)
    return resource_list[resource_name]


class LaunchRequestHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        # type: (HandlerInput) -> bool
        return is_request_type('LaunchRequest')(handler_input)

    def handle(self, handler_input):
        # type: (HandlerInput) -> Response
        speech_text = 'Which AWS resource do you want to start or stop?'

        handler_input.response_builder.speak(speech_text).set_card(
            SimpleCard('AWS', speech_text)).set_should_end_session(
            False)
        return handler_input.response_builder.response


class ResourceControlHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):  # type: (HandlerInput) -> bool
        return is_intent_name('ResourceControl')(handler_input)

    def handle(self, handler_input):  # type: (HandlerInput) -> Union[None, Response]
        action = get_slot_value_v2(handler_input=handler_input, slot_name='action').value
        resource_name = get_slot_value_v2(handler_input=handler_input, slot_name='resource').value
        print(f'action: {action}')
        print(f'resource_name: {resource_name}')

        start_message = f'Started the {resource_name}'
        already_started_message = f'The {resource_name} is already running'
        stop_message = f'Stopped the {resource_name}'
        already_stopped_message = f'The {resource_name} is already stopped'
        end_session = True

        if resource_name in ['Web server', 'api server']:
            ec2 = boto3.client('ec2')
            ec2_status = ec2.describe_instances(InstanceIds=[get_resource_id(resource_name)])\
                ["Reservations"][0]["Instances"][0]['State']['Name']
            if action == 'Start-up':
                if ec2_status == 'running' or ec2_status == 'pending':
                    speech_text = already_started_message
                else:
                    ec2.start_instances(InstanceIds=[get_resource_id(resource_name)])
                    speech_text = start_message
            elif action == 'Stop':
                if ec2_status == 'stopping' or ec2_status == 'stopped':
                    speech_text = already_stopped_message
                else:
                    ec2.stop_instances(InstanceIds=[get_resource_id(resource_name)])
                    speech_text = stop_message
            else:
                speech_text = f'What should I do with the {resource_name}? Please say it again'
                end_session = False
        elif resource_name == 'Database':
            rds = boto3.client('rds')
            if action == 'Start-up':
                print('Start RDS')
                try:
                    rds.start_db_cluster(DBClusterIdentifier=get_resource_id('Database'))
                    speech_text = start_message
                except Exception as e:
                    print(e)
                    speech_text = 'Failed to start. The database may already be up.'
            elif action == 'Stop':
                try:
                    rds.stop_db_cluster(DBClusterIdentifier=get_resource_id('Database'))
                    speech_text = stop_message
                except Exception as e:
                    print(e)
                    speech_text = 'Failed to stop. The database may already be down.'
            else:
                speech_text = f'What should I do with the {resource_name}? Please say it again'
                end_session = False
        else:
            speech_text = "Sorry, I don't quite understand what you are saying."
            end_session = False

        handler_input.response_builder.speak(speech_text).set_card(
            SimpleCard('Control AWS Resource', speech_text)).set_should_end_session(end_session)
        return handler_input.response_builder.response


class HelpIntentHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        # type: (HandlerInput) -> bool
        return is_intent_name('AMAZON.HelpIntent')(handler_input)

    def handle(self, handler_input):
        # type: (HandlerInput) -> Response
        speech_text = 'For example, say "start the web server"'

        handler_input.response_builder.speak(speech_text).ask(speech_text).set_card(
            SimpleCard('Control AWS Resource', speech_text))
        return handler_input.response_builder.response


class CancelAndStopIntentHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        # type: (HandlerInput) -> bool
        return is_intent_name('AMAZON.CancelIntent')(handler_input) or is_intent_name('AMAZON.StopIntent')(handler_input)

    def handle(self, handler_input):
        # type: (HandlerInput) -> Response
        speech_text = 'goodbye'

        handler_input.response_builder.speak(speech_text).set_card(
            SimpleCard('Control AWS Resource', speech_text))
        return handler_input.response_builder.response


class SessionEndedRequestHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        # type: (HandlerInput) -> bool
        return is_request_type('SessionEndedRequest')(handler_input)

    def handle(self, handler_input):
        # type: (HandlerInput) -> Response
        # Add cleanup logic here

        return handler_input.response_builder.response


class AllExceptionHandler(AbstractExceptionHandler):

    def can_handle(self, handler_input, exception):
        # type: (HandlerInput, Exception) -> bool
        return True

    def handle(self, handler_input, exception):
        # type: (HandlerInput, Exception) -> Response
        # Log exceptions to CloudWatch Logs
        print(exception)

        # Double quotes are required here because the text contains apostrophes
        speech = "I'm sorry, I didn't understand. Say that once more please."
        handler_input.response_builder.speak(speech).ask(speech)
        return handler_input.response_builder.response


sb.add_request_handler(LaunchRequestHandler())
sb.add_request_handler(ResourceControlHandler())
sb.add_request_handler(HelpIntentHandler())
sb.add_request_handler(CancelAndStopIntentHandler())
sb.add_request_handler(SessionEndedRequestHandler())
sb.add_exception_handler(AllExceptionHandler())

handler = sb.lambda_handler()

As the code shows, the main implementation is the class that handles the ResourceControl Intent (ResourceControlHandler); most of the rest is copied. This class fetches the action and resource Slot values from the request and branches on them. For example, if resource is the web server or the API server, it creates an EC2 client and starts or stops the instance according to the value of action, setting the text to be spoken in speech_text. The value of end_session controls whether to end the conversation because the request completed normally, or to ask again and continue the conversation because the request did not make sense. Finally, it assembles the response from values such as speech_text and end_session and returns it so that Alexa speaks. This part is easy too.

Once the implementation is complete, build the container image and push it to ECR (steps omitted).
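
The EC2 branching can also be viewed as a small decision table. The following sketch pulls that decision out into a pure function so it can be checked without calling AWS; `plan_ec2_action` is a hypothetical name that does not appear in app.py, and the action strings are the slot values the handler compares against.

```python
from typing import Optional, Tuple

# Slot values as compared in app.py.
START = "Start-up"
STOP = "Stop"


def plan_ec2_action(state: str, action: Optional[str]) -> Tuple[Optional[str], bool]:
    """Decide what to do for an EC2 instance state and a spoken action.

    Returns (api_call, end_session). api_call is "start", "stop", or None
    when no API call is needed. end_session is False only when the action
    was not understood and the skill should ask again.
    """
    if action == START:
        if state in ("running", "pending"):
            return None, True  # already started, just say so
        return "start", True
    if action == STOP:
        if state in ("stopping", "stopped"):
            return None, True  # already stopped, just say so
        return "stop", True
    return None, False  # unknown action: keep the session open


if __name__ == "__main__":
    for state in ("running", "stopped"):
        for action in (START, STOP, None):
            print(state, action, plan_ec2_action(state, action))
```

Separating the decision from the boto3 calls is a design choice of this sketch only; the handler in app.py makes the same decisions inline.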

Lambda settings

Next, let's create the Lambda function. Since the runtime is a container this time, select the container image option and specify the function name and the URI of the image pushed to ECR earlier. For permissions, create and attach an appropriate IAM role so that Lambda can operate resources such as EC2 and RDS.
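
As a rough guide to what "appropriate" might mean, the sketch below builds a policy document that allows only the API calls made in app.py. This is an assumption about a reasonable minimum, not the policy actually used here; `REGION` and `ACCOUNT_ID` are placeholders, and the statement IDs are made up. The function's role would also need the usual CloudWatch Logs permissions for the `print` output to be recorded.

```python
import json

# Placeholders: replace REGION and ACCOUNT_ID, and narrow the ARNs as needed.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DescribeEc2",
            "Effect": "Allow",
            "Action": "ec2:DescribeInstances",
            # Describe calls are granted on all resources.
            "Resource": "*",
        },
        {
            "Sid": "StartStopEc2",
            "Effect": "Allow",
            "Action": ["ec2:StartInstances", "ec2:StopInstances"],
            "Resource": "arn:aws:ec2:REGION:ACCOUNT_ID:instance/*",
        },
        {
            "Sid": "StartStopRdsCluster",
            "Effect": "Allow",
            "Action": ["rds:StartDBCluster", "rds:StopDBCluster"],
            "Resource": "arn:aws:rds:REGION:ACCOUNT_ID:cluster:*",
        },
    ],
}


def allowed_actions(document: dict) -> set:
    """Collect every action named in the policy's statements."""
    actions = set()
    for statement in document["Statement"]:
        value = statement["Action"]
        actions.update([value] if isinstance(value, str) else value)
    return actions


if __name__ == "__main__":
    print(json.dumps(policy, indent=2))
    print(sorted(allowed_actions(policy)))
```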

After creating the function, copy the Lambda ARN and return to the Alexa developer console again to configure the endpoint as shown below.

Return to the Lambda setting screen and set the trigger as shown below.

This completes the implementation. Let's go back to the Alexa developer console and see how the skill we created works.

Operation check

Operation check by text input

You can go to the test tab and check the operation of the skill as shown below. It looks like it's working properly. When I checked it in the AWS console, the database was started properly.

[Screenshot: 2020-12-16 12.58.40.png]

Operation check by voice input

Let's say "Stop the API server in the development environment".

[Screenshot: 2020-12-16 13.13.10.png]

...I had forgotten that my enunciation is poor.

Target audience (revised)

- People who want a rough idea of the flow of Alexa skill development
- People who want to control AWS resources (start / stop EC2 and RDS) with Alexa
- People with clear enunciation

Conclusion

I went through Alexa skill development from start to finish, and my impression is that once you understand concepts such as Intent and Slot, building a skill is surprisingly easy. I was also reminded that voice interfaces are hard to use for people with poor enunciation. Having built all this, I think I will just write a shell script and run that instead of using this skill.
