[Several features were added](https://aws.amazon.com/jp/blogs/aws/aws-lambda-update-python-vpc-increased-function-duration-scheduling-and-more/) to Lambda at AWS re:Invent 2015 in October 2015. Personally, I find the following three the most significant.
I previously wrote an article about starting AWS Lambda on a schedule using Data Pipeline, but scheduled execution is now supported as a standard Lambda feature. In this article, I write a Lambda function in Python and explain the steps for running it on a schedule.
The Lambda + Python + scheduled execution environment added this time comes with some restrictions to be aware of. The major ones are as follows.
As a first case study, we take the simplest case: running a Python script that just prints, on a schedule of every 5 minutes. We then check the log to see whether the output of the scheduled execution is correct.
From Lambda in the AWS Console, select `Create a Lambda function`.
There are many templates for Lambda functions; search for and select `lambda-canary`. It is a template for running a Python script on a schedule as a Lambda function.
Since we are running on a schedule this time, select `Scheduled Event` as the Event source (I think it is selected by default).
For `Name` and `Description`, any suitable text is fine.
The `Schedule expression` should be `rate(5 minutes)` by default, so you can leave it as it is.
Set the Python script and IAM role.
The Python code is as follows. It does nothing but print.
import json, datetime, commands

def lambda_handler(event, context):
    # CPU and memory of the execution environment
    print commands.getoutput('cat /proc/cpuinfo | grep -e "processor" -e "model name"')
    print commands.getoutput('cat /proc/meminfo | grep MemTotal')
    print commands.getoutput('cat /proc/meminfo | grep MemFree')
    # Execution time
    print datetime.datetime.now().strftime('%Y/%m/%d %H:%M')
    print '-------------------------------'
    # Contents of the event and context arguments
    print event
    print event['account']
    print context.__dict__
    print context.memory_limit_in_mb
Besides the Python script, the following settings are required. Everything else can be left at its default.
`Name` is the name of the Lambda function. Any string is fine (as long as it does not clash with another Lambda function).
`Role` is the IAM role assigned to the Lambda function. This time we do not access other AWS resources, so you can select `lambda_basic_execution` (if you don't have an IAM role called `lambda_basic_execution`, one will be created with this name).
This is the final confirmation. Select `Enable now` to enable scheduled execution with these settings, then press `Create function` to finish.
The output of the Lambda function is saved in CloudWatch Logs.
From CloudWatch Logs in the AWS Console, select `Lambda_Test`. A new execution log should be added every 5 minutes, and the printed contents should appear in it.
Rather than `main`, a function called `lambda_handler` is called as the entry point (with the arguments `event` and `context`). Which function serves as the entry point can be changed with `Handler` under `Configuration`.
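A minimal sketch of how the entry point is called, so that a handler can be tried locally before it is uploaded. The sample event is an assumption about the general shape of a scheduled event (only the `account` key is confirmed by the script above), and `FakeContext` with all of its values is a hypothetical stand-in for the real context object. It is written with `print()` so that it also runs on current Python versions.

```python
import json


class FakeContext(object):
    """Hypothetical stand-in for the context object Lambda passes in."""
    function_name = "Lambda_Test"
    memory_limit_in_mb = 128  # assumed value, for illustration only


def lambda_handler(event, context):
    # Same signature as the entry point in this article: (event, context).
    summary = {
        "account": event["account"],
        "memory_limit_in_mb": context.memory_limit_in_mb,
    }
    print(json.dumps(summary, sort_keys=True))
    return summary


# Assumed sample of a scheduled event; the real payload may carry other keys.
sample_event = {
    "account": "123456789012",
    "detail-type": "Scheduled Event",
    "source": "aws.events",
    "detail": {},
}

result = lambda_handler(sample_event, FakeContext())
```

Calling the handler this way only checks the Python logic; the real `event` and `context` should still be confirmed from the CloudWatch log.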
The contents of `event` and `context`, the arguments of the entry point, differ depending on the event source. The script prints the contents of `event` and `context`, so check them in the log.
From Lambda in the AWS Console, go to the `Event sources` tab of the `Lambda_Test` function you created earlier. Delete the schedule set to `rate(5 minutes)` with the `x`, and select `Add event source`.
In the `Add event source` dialog, select `Scheduled Event` for `Event source type` and fill in the dialog as follows.
For `Name` and `Description`, any suitable text is fine.
Select `cron` for `Schedule expression` and write `cron(0/10 * * * ? *)`.
The point to note here is that the Lambda cron syntax is slightly different from the usual Linux cron syntax.
Refer to this article and write `cron(0/10 * * * ? *)`.
If you select `Enable now`, the schedule runs with this setting. Check that an execution log appears in CloudWatch Logs every 10 minutes.
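To make the difference from Linux cron concrete, here is a small sketch that splits a Lambda schedule expression into named fields. It assumes, as I understand the AWS format, six fields (minutes, hours, day of month, month, day of week, year) with `?` in one of the two day fields. The function name `parse_schedule_cron` is hypothetical, and this is only a rough shape check, not a full validator.

```python
FIELD_NAMES = ("minutes", "hours", "day_of_month", "month", "day_of_week", "year")


def parse_schedule_cron(expression):
    """Split 'cron(...)' into its six named fields."""
    expression = expression.strip()
    if not (expression.startswith("cron(") and expression.endswith(")")):
        raise ValueError("expected the form cron(...)")
    fields = expression[len("cron("):-1].split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError("expected 6 fields, got %d" % len(fields))
    parsed = dict(zip(FIELD_NAMES, fields))
    # One of the two day fields is given as '?', which plain Linux cron lacks.
    if "?" not in (parsed["day_of_month"], parsed["day_of_week"]):
        raise ValueError("one of day_of_month / day_of_week should be '?'")
    return parsed


fields = parse_schedule_cron("cron(0/10 * * * ? *)")
print(fields)
```

A typical five-field Linux entry such as `*/10 * * * *` is rejected by this check, which is the mistake the note above warns about.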
If you want to use a library that is not part of the Python 2.7 standard library, such as `numpy`, `pandas` or `requests`, with Lambda + Python, you need to zip the library together with the script file itself and upload it. For details, see Creating a Deployment Package (Python) on the official AWS website.
Save the following Python script locally (we'll assume you save it with the filename `LambdaTest.py`).
The script sends an HTTP GET to `google.co.jp` using the `requests` library and prints the status code and response body.
LambdaTest.py
import requests, datetime

def lambda_handler(event, context):
    target_URL = 'https://www.google.co.jp'
    # HTTP GET with the bundled requests library
    r = requests.get(target_URL)
    print datetime.datetime.now().strftime('%Y/%m/%d %H:%M')
    print '-------------------------------'
    print r.status_code
    print '-------------------------------'
    print r.text
This code needs the `requests` library, which is not part of the Python 2.7 standard library, so registering the code in Lambda as it is results in an error. Instead, put the `requests` library files in the same folder as the script, zip them together, and register the zip with Lambda.
If you run `pip install` with the `-t` option pointing at the folder that contains LambdaTest.py, the library files are placed in that folder. For example, if LambdaTest.py is in `/home/hoge`, you can run `pip install requests -t /home/hoge`.
The folder should now contain 2 folders and 1 file. Zip these and give the archive a suitable name (let's say `LambdaLibraryZip.zip`).
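The zipping step can also be scripted. The following is a minimal sketch using only the standard `zipfile` module; the function name `build_package` is hypothetical. It stores every path relative to the source folder so that LambdaTest.py ends up at the root of the archive rather than under `home/hoge/`.

```python
import os
import zipfile


def build_package(source_dir, zip_path):
    """Zip the contents of source_dir so that files sit at the archive root."""
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _dirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(folder, name)
                if os.path.abspath(full_path) == zip_abs:
                    continue  # do not put the archive inside itself
                # Store paths relative to source_dir, not the absolute path.
                archive.write(full_path, os.path.relpath(full_path, source_dir))
    return zip_path
```

With the folder from the example above, the call would be `build_package("/home/hoge", "LambdaLibraryZip.zip")`.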
When you upload the Lambda function code as a zip, the entry point name (the Python main function) takes the form: file name of the `*.py` file (without the `.py` extension) + `.` + name of the entry point function in the script. In this example, that is `LambdaTest.lambda_handler`. Set `Handler` on the `Configuration` tab to `LambdaTest.lambda_handler`.
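The naming rule can be illustrated with a short sketch that turns a handler string into the function it names. This only illustrates the `file.function` convention and is not how Lambda itself loads code; `resolve_handler` and the throwaway script it writes are hypothetical.

```python
import importlib
import os
import sys
import tempfile


def resolve_handler(handler, search_dir):
    """Turn 'module.function' into the function object it names."""
    module_name, function_name = handler.rsplit(".", 1)
    if search_dir not in sys.path:
        sys.path.insert(0, search_dir)
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# Write a throwaway LambdaTest.py so the sketch runs on its own.
work_dir = tempfile.mkdtemp()
with open(os.path.join(work_dir, "LambdaTest.py"), "w") as script:
    script.write("def lambda_handler(event, context):\n    return 'called'\n")

entry_point = resolve_handler("LambdaTest.lambda_handler", work_dir)
print(entry_point({}, None))
```

If the file were renamed or the function were called something else, the `Handler` setting would have to change to match.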
From the `Code` tab, upload the `LambdaLibraryZip.zip` created earlier with `Upload a .ZIP file`, click `Save and test`, and check that the Python script runs correctly.
As a Python lover, I find Lambda's Python support and scheduled execution support very good news. I think they will be a big stepping stone toward Amazon's 2-tier architecture or serverless architecture (and we will be locked in to Amazon more and more...).
However, I think there are still some areas that Lambda + Python + scheduled execution cannot replace.
The minimum schedule interval for Lambda is 5 minutes. I think this covers many needs, but there are also requirements such as "I want to check the status every minute for liveness monitoring". I don't know of an AWS service that runs short-cycle schedules without a server (please let me know if you do).
I think Amazon's answer to this is the [EC2 t2.nano Instance](http://aws.typepad.com/aws_japan/2015/10/ec2-instance-update-x1-sap-hana-t2-nano-websites.html): prepare an instance even cheaper than the current t2.micro and run this kind of small odd job on it.
Lambda's maximum processing time has now been increased to 5 minutes (previously 1 minute). When it was 1 minute, I used a Lambda function to pull logs from S3 to local storage, parse the contents, raise an alarm when an error was found, and write the result to RDS. Back then, timeouts sometimes occurred when the log size exceeded several hundred MB, but I think the extension to 5 minutes will cover many needs. However, there are also requirements that need more processing time, such as "daily ETL batch processing of a large amount of data".
As far as I know, requirements like this, where computation and data transfer take a long time, are better handled by something other than Lambda.
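For jobs that only occasionally come close to the time limit, one possible mitigation is to stop before the timeout and leave the remainder for the next scheduled run. This is a minimal sketch, assuming the context object's `get_remaining_time_in_millis()` method; `FakeContext`, the safety margin and the per-item work are hypothetical placeholders.

```python
class FakeContext(object):
    """Hypothetical stand-in for the Lambda context object, for local runs."""

    def __init__(self, budget_ms, cost_per_call_ms):
        self._remaining = budget_ms
        self._cost = cost_per_call_ms

    def get_remaining_time_in_millis(self):
        # Pretend each check consumes a fixed slice of the time budget.
        self._remaining -= self._cost
        return max(self._remaining, 0)


def process_until_deadline(items, context, safety_margin_ms=10000):
    """Process items in order and stop early when time is running out."""
    done = []
    for item in items:
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            break  # leave the rest for the next scheduled run
        done.append(item.upper())  # placeholder for the real per-item work
    return done, items[len(done):]


done, left = process_until_deadline(
    ["a.log", "b.log", "c.log", "d.log"],
    FakeContext(budget_ms=40000, cost_per_call_ms=10000),
)
print(done)
print(left)
```

This does not help with a single step that cannot be split, which is where the time limit remains a real constraint.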