I got stuck on how to build the layers when trying to drive a browser with selenium × chrome from AWS Lambda, so I'm writing up what worked.
I want to run selenium x chrome on AWS Lambda.
- Windows 10 Pro
- Python 3.7
- chromedriver 2.37
- headless-chromium 64.0.3282.167
- How to create and invoke an AWS Lambda function
- How to write a Python program that drives a browser with selenium webdriver
First of all, AWS Lambda Layers work like shared libraries that several Lambda functions can use in common. The Lambda function body references the layer. By moving part of the code into a layer, you make the function body lighter, which avoids problems such as the console being unable to display the code because the deployment package is too large.
This time I wanted to drive chromeDriver from Python with selenium, so I create the following two layers.
**1. Layer to store the selenium library**
**2. Layer to store chromeDriver**
↓ Layer configuration
How to build each layer is described below.
Run the following command in any folder. You have probably already run pip install selenium, but this step prepares the copy of the library that goes into the layer, so do it in a location separate from your execution module.
Preparation of selenium module
pip install -t ./python/lib/python3.7/site-packages selenium
Zip the result starting from the python folder, so that python is the top-level folder inside the zip file.
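If no zip tool is at hand, the archive can also be built with Python's standard zipfile module. The following is only a sketch: the function name zip_layer and the output file name are made up for illustration, and it assumes the pip command above was run in the current folder.

```python
import os
import zipfile


def zip_layer(src_root, zip_path, top_dir="python"):
    """Zip src_root/top_dir so that every entry starts with 'python/'."""
    base = os.path.join(src_root, top_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for folder, _dirs, files in os.walk(base):
            for name in files:
                full = os.path.join(folder, name)
                # Store paths relative to src_root, always with forward slashes.
                arcname = os.path.relpath(full, src_root).replace(os.sep, "/")
                zf.write(full, arcname)
    return zip_path


# Hypothetical usage from the folder that contains ./python:
# zip_layer(".", "selenium-layer.zip")
```

The point of computing arcname relative to src_root is that the python folder itself ends up at the top of the archive, which is the layout the layer needs.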
Create a new layer from the Lambda console. When creating it, specify the zip file you created above.
Reference the layer you created from the Lambda function. If you select **Custom Layer** on the layer settings screen, it appears among the options.
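I did these two steps in the console, but they can presumably be scripted with boto3 as well. The sketch below is an assumption-laden alternative, not what this article used: the helper names, the layer name, and the function name are hypothetical, and it uploads the zip directly, which only works while the file stays within Lambda's direct-upload size limit.

```python
def publish_layer(client, layer_name, zip_path, runtime="python3.7"):
    """Upload a layer zip and return the ARN of the new layer version."""
    with open(zip_path, "rb") as f:
        response = client.publish_layer_version(
            LayerName=layer_name,
            Content={"ZipFile": f.read()},
            CompatibleRuntimes=[runtime],
        )
    return response["LayerVersionArn"]


def attach_layers(client, function_name, layer_arns):
    """Replace the function's layer list with layer_arns."""
    return client.update_function_configuration(
        FunctionName=function_name,
        Layers=list(layer_arns),
    )


# Hypothetical usage (requires boto3 and AWS credentials):
# import boto3
# client = boto3.client("lambda")
# arn = publish_layer(client, "selenium-layer", "selenium-layer.zip")
# attach_layers(client, "my-function", [arn])
```

Note that Layers replaces the whole list, so every layer the function needs has to be passed in one call.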
Import it from the Lambda function.
import statement
from selenium import webdriver
If the above configuration is incorrect, the following error will occur.
error statement
[ERROR] Runtime.ImportModuleError: Unable to import module 'lambda_function': No module named 'selenium'
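When this error shows up, it can help to look at what Python actually sees. The sketch below uses only the standard library; the function name layer_status is made up, and it assumes the layer contents are mounted under /opt and that the runtime puts the layer's python folders on sys.path.

```python
import importlib.util
import sys


def layer_status(module_name, layer_root="/opt"):
    """Report whether a top-level module is importable and which sys.path entries sit under layer_root."""
    spec = importlib.util.find_spec(module_name)
    return {
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
        "layer_paths": [p for p in sys.path if p.startswith(layer_root)],
    }


# Hypothetical usage inside a handler, before importing selenium:
# print(layer_status("selenium"))
```

If found is False and layer_paths is empty, the layer is probably not attached; if layer_paths is populated but the module is still missing, the folder layout inside the zip is the more likely culprit.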
Prepare the two officially distributed files.
chromedriver: https://chromedriver.storage.googleapis.com/index.html?path=2.37/
headless-chromium: https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-37/stable-headless-chromium-amazonlinux-2017-03.zip
Place the two files in the same folder and zip them in a Linux environment.
If you zip them in a Windows environment, you get the following error when you run the Lambda function. Even in a Linux environment, the same error occurs if the files lack execute permission (I set each file to 777).
error statement
[ERROR] WebDriverException: Message: 'chromedriver' executable may have wrong permissions. Please see https://sites.google.com/a/chromium.org/chromedriver/home
On a Windows PC, there are the following ways to create the zip in a Linux environment. Either one worked for me.
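Separately from those methods, the permission bits can also be written into the zip from Python, since a zip entry stores its Unix mode in the upper 16 bits of external_attr. This is a sketch only and is not the approach tested in this article: the function name zip_executables is made up, 0o755 is my choice of mode, and it assumes Lambda honors the mode stored in the archive when it unpacks the layer.

```python
import os
import stat
import zipfile


def zip_executables(paths, zip_path, mode=0o755):
    """Write each file at the zip root with its Unix permission bits set to mode."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in paths:
            info = zipfile.ZipInfo(os.path.basename(path))
            info.create_system = 3  # 3 = Unix, so the mode bits are meaningful
            info.compress_type = zipfile.ZIP_DEFLATED
            # The upper 16 bits of external_attr hold the Unix file mode.
            info.external_attr = (stat.S_IFREG | mode) << 16
            with open(path, "rb") as f:
                zf.writestr(info, f.read())
    return zip_path


# Hypothetical usage:
# zip_executables(["chromedriver", "headless-chromium"], "chrome-layer.zip")
```

Because the files are written at the root of the archive, they would land directly under the layer's mount point, which matches the paths used later in this article.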
Reference the layer you created from the Lambda function.
Call it from the Lambda function by specifying the path below. Under the AWS Lambda layers specification, layer contents are placed under **/opt**, so specify the path as follows.
import statement
driver = webdriver.Chrome(executable_path ="/opt/chromedriver", chrome_options=options)
If the path is wrong, for example if /opt is missing from it, an error occurs.
error statement
[ERROR] WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home
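Both this error and the permission error above can be narrowed down before selenium is even started. The following sketch checks the two files with the standard library only; the function names are made up, and the default paths assume both files sit directly under /opt as in this article.

```python
import os


def check_executable(path):
    """Return None if path is a runnable file, otherwise a short reason string."""
    if not os.path.isfile(path):
        return "not found: " + path
    if not os.access(path, os.X_OK):
        return "not executable: " + path
    return None


def preflight(paths=("/opt/chromedriver", "/opt/headless-chromium")):
    """Collect problems for every required binary; an empty list means OK."""
    return [msg for msg in map(check_executable, paths) if msg]


# Hypothetical usage at the top of the handler:
# problems = preflight()
# if problems:
#     raise RuntimeError("; ".join(problems))
```

A "not found" result points at the path or the zip layout, while "not executable" points at how the zip was created.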
Both Lambda and selenium are slow, so set the Lambda timeout to a longer value. The default is 3 seconds, so the function almost always times out and the following error occurs.
error statement
Task timed out after XX.XX seconds
Setting the timeout value
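The same setting can presumably be changed from code with boto3 instead of the console. This is a sketch under assumptions: the helper name set_timeout and the function name in the usage comment are hypothetical, and the 60 seconds is only an example value, not a measured recommendation.

```python
def set_timeout(client, function_name, seconds):
    """Set the function timeout; Lambda accepts values from 1 to 900 seconds."""
    if not 1 <= seconds <= 900:
        raise ValueError("timeout must be between 1 and 900 seconds")
    return client.update_function_configuration(
        FunctionName=function_name,
        Timeout=seconds,
    )


# Hypothetical usage (requires boto3 and AWS credentials):
# import boto3
# set_timeout(boto3.client("lambda"), "my-function", 60)
```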
The final code for the Lambda function is below.
lambda_function.py
from selenium import webdriver
from selenium.webdriver.chrome.options import Options


def lambda_handler(event, context):
    LINE_NOTIFY_URL = "https://notify-api.line.me/api/notify"
    options = Options()
    # headless-chromium comes from the chromeDriver layer, which is placed under /opt
    options.binary_location = '/opt/headless-chromium'
    options.add_argument('--headless')
    options.add_argument('--no-sandbox')
    options.add_argument('--single-process')
    options.add_argument('--disable-dev-shm-usage')
    # chromedriver is read from the same layer
    driver = webdriver.Chrome(executable_path='/opt/chromedriver', chrome_options=options)
    driver.get("https://xxxxxxxxxxx")
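The code above never closes the browser. Since Lambda can reuse an execution environment between invocations, it may be worth quitting the driver explicitly. The sketch below shows one way to do that with try/finally; fetch_title and make_driver are names I made up, and returning the page title is just an example of doing something with the page.

```python
def fetch_title(make_driver, url):
    """Open url with a driver from make_driver and always quit the driver."""
    driver = make_driver()
    try:
        driver.get(url)
        return driver.title
    finally:
        # quit() ends the browser session even if get() raised.
        driver.quit()


# Hypothetical usage inside lambda_handler, reusing the options built above:
# title = fetch_title(
#     lambda: webdriver.Chrome(executable_path='/opt/chromedriver', chrome_options=options),
#     "https://xxxxxxxxxxx",
# )
```

Passing a factory instead of a driver keeps the creation and the cleanup in one place.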
Looking at other articles, I found that most of them use serverless.yml or CloudFormation. I knew little about those, so I took the approach above. I hope this helps anyone running selenium in a serverless setup for the first time.