Selenium is a framework for automating web browsers. It was created at ThoughtWorks in 2004 to automate UI testing of web applications. https://selenium.dev/history/
Although it was originally developed for UI testing and JavaScript testing of web applications, it is now used for many purposes beyond testing, such as task automation and website crawling.
This article describes how to set up an environment and how to drive Chrome from Python using Selenium.
TL;DR

- Environment setup is very easy with the official Docker image
- You write code describing how the browser should be operated
- It can also be used as a crawler
To drive a browser automatically with Selenium, you need to install the following.

- A web browser (Chrome, Firefox, IE, Opera, etc.)
- The WebDriver matching that browser
- The Selenium bindings for your language
Here, we introduce two ways to set up Selenium with Python: using Docker, and creating an environment directly on your local PC.
It's very easy to set up using the Docker image officially published by Selenium. https://github.com/SeleniumHQ/docker-selenium
With this method, the configuration is as follows: the browser and Remote WebDriver run inside a Docker container, and Selenium connects to the Remote WebDriver over the network from another host.
Personally, I find this method the easiest to set up, and it is the one I recommend most.
Simply run the following command to start a Chrome environment that can be driven from Selenium.
$ docker run -d -p 4444:4444 -v /dev/shm:/dev/shm selenium/standalone-chrome:3.141.59-xenon
WebDriver has the slightly annoying problem that it does not work unless its version matches the browser's version, but the official Docker image ships with matching versions of both the browser and WebDriver, so it works out of the box.
Install the library for using Selenium on the machine that runs Selenium's Python code. Python's Selenium bindings can be installed with pip.
$ pip install selenium
You can run Selenium from Python with code like this:
from selenium import webdriver

# Set Chrome options
options = webdriver.ChromeOptions()
options.add_argument('--headless')

# Connect to Selenium Server
driver = webdriver.Remote(
    command_executor='http://localhost:4444/wd/hub',
    desired_capabilities=options.to_capabilities(),
    options=options,
)

# Operate the browser via Selenium
driver.get('https://qiita.com')
print(driver.current_url)

# Quit the browser
driver.quit()
Next, I will describe how to set up an environment to run Selenium locally on a Mac.
With this method, the browser and WebDriver all run locally, and Selenium connects to the local driver.
Many people probably already have Chrome installed; just install Chrome in the usual way.
Then check the installed Chrome version to decide which version of WebDriver to install. In my environment, 78.0.3904.108 was installed.
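Since ChromeDriver follows Chrome's major version (from ChromeDriver 73 onward), the compatibility check boils down to comparing major versions. A small illustrative helper (hypothetical, not part of Selenium or chromedriver-binary) could look like this:

```python
def major_version(version: str) -> int:
    """Extract the major version from a dotted version string like '78.0.3904.108'."""
    return int(version.split('.')[0])


def versions_compatible(browser_version: str, driver_version: str) -> bool:
    """ChromeDriver is considered compatible when its major version matches Chrome's."""
    return major_version(browser_version) == major_version(driver_version)


print(versions_compatible('78.0.3904.108', '78.0.3904.70'))   # True
print(versions_compatible('78.0.3904.108', '77.0.3865.40'))   # False
```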
Download the Chrome WebDriver binary.
For Python, someone has kindly published a handy library called chromedriver-binary that downloads the WebDriver binary and sets up the path for you, so we will use it. https://github.com/danielkaiser/python-chromedriver-binary
Since the installed WebDriver must correspond to your version of Chrome, install it with pip, pinning only the major version, as follows.
$ pip install chromedriver-binary==78.*
The Python bindings for Selenium are installed with pip.
$ pip install selenium
In the local environment, you can run Selenium with the following code.
import chromedriver_binary  # noqa
from selenium import webdriver

# Set WebDriver options
options = webdriver.ChromeOptions()
options.add_argument('--headless')

print('connecting to browser...')
driver = webdriver.Chrome(options=options)

driver.get('https://qiita.com')
print(driver.current_url)

# Quit the browser
driver.quit()
Compared to the previous Docker example, the difference is that the `Chrome` class is used for the WebDriver instead of `Remote`.
Running the above code will launch Chrome on your PC. If you comment out the `options.add_argument('--headless')` line, you can watch the browser window open and move.
Now that the environment is ready, let's see how to actually use Selenium.
Let's run Chrome through Selenium and try the following operations.

1. Visit Qiita's Chanmoro profile page https://qiita.com/Chanmoro
2. Go to the second page of the article list displayed under "Recent Articles"
3. Get the URL and title of the article displayed at the very top of the second page
Here, we will use the Selenium Server with Docker introduced in the environment setup at the beginning. In a local environment, the only difference is the part that sets up the WebDriver; the same code can be used for all subsequent operations.
First, here is the whole code.
from selenium import webdriver
from selenium.webdriver.common.by import By

# x. Set Chrome launch options
options = webdriver.ChromeOptions()
options.add_argument('--headless')

# x. Open a new browser window
print('connecting to remote browser...')
driver = webdriver.Remote(
    command_executor='http://localhost:4444/wd/hub',
    desired_capabilities=options.to_capabilities(),
    options=options,
)

# 1. Visit Qiita's Chanmoro profile page
driver.get('https://qiita.com/Chanmoro')
print(driver.current_url)
# > https://qiita.com/Chanmoro

# 2. Go to the second page of the article list displayed under "Recent Articles"
driver.find_element(By.XPATH, '//a[@rel="next" and text()="2"]').click()
print(driver.current_url)
# > https://qiita.com/Chanmoro?page=2

# 3. Get the URL and title of the article displayed at the very top of the second page
article_links = driver.find_elements(By.XPATH, '//div[@class="ItemLink__title"]/a')
print(article_links[0].text)
# > Python - Dynamically call a function from a string
print(article_links[0].get_attribute('href'))
# > https://qiita.com/Chanmoro/items/9b0105e4c18bb76ed4e9

# x. Quit the browser
driver.quit()
Let's explain step by step.
First, set Chrome startup options before launching Chrome.
The options classes are separate for each browser: there are browser-specific classes such as `ChromeOptions` for Chrome and `FirefoxOptions` for Firefox.
In Chrome, the `--headless` option launches the browser without displaying a window.
You will probably run in headless mode most of the time, but if you want to visually check how the screen is being operated, for example while debugging, you can simply omit this option.
options = webdriver.ChromeOptions()
options.add_argument('--headless')
Then open a new window from Selenium. If the browser is not running at this point, it will be launched.
If you are using the Selenium Server introduced in the environment setup at the beginning, use the `Remote` class and specify the browser type with the `desired_capabilities` argument.
In this example, `options` holds a `ChromeOptions` object, so Chrome will be launched.
# NOTE: To run Selenium remotely, specify the Remote WebDriver as follows:
driver = webdriver.Remote(
    command_executor='http://localhost:4444/wd/hub',
    desired_capabilities=options.to_capabilities(),
    options=options,
)
Call the `get()` method of the WebDriver object to access the specified URL.
driver.get('https://qiita.com/Chanmoro')
You can get the URL currently displayed in the window with `current_url`, and the displayed HTML with `page_source`.
print(driver.current_url)
print(driver.page_source)
Click the link to the second page shown at the bottom of "Recent Articles" on the profile screen to move to that page.
In Selenium, you get the target element with `find_element` as shown below, and click it by calling `click()` on that element.
Here, we locate and click the `a` tag that has the attribute `rel="next"` and contains the text `2`.
driver.find_element(By.XPATH, '//a[@rel="next" and text()="2"]').click()
In addition to XPath, Selenium lets you specify the target element in various ways, such as by CSS selector, ID, name attribute, or class. Personally, I like XPath the most because it can pinpoint almost any element in a single expression, although it is a little tricky at first until you get used to writing it.
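As an aside, you can experiment with matching like the XPath above without a browser: Python's standard library xml.etree.ElementTree supports a small XPath subset (attribute predicates such as `[@rel='next']` work, but a `text()="2"` predicate does not), which is enough to sketch the idea on a made-up pagination snippet:

```python
import xml.etree.ElementTree as ET

# A tiny pagination snippet, made up for illustration
html = """
<div>
  <a rel="prev" href="?page=1">1</a>
  <a rel="next" href="?page=2">2</a>
  <a rel="next" href="?page=3">3</a>
</div>
"""

root = ET.fromstring(html)

# All <a> elements with rel="next", like //a[@rel="next"] in Selenium
next_links = root.findall(".//a[@rel='next']")
print([a.text for a in next_links])  # ['2', '3']

# ElementTree has no text()="2" predicate, so filter the text in Python instead
page_two = [a for a in next_links if a.text == '2'][0]
print(page_two.get('href'))  # ?page=2
```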
In this example, the XPath is specified by passing `By.XPATH` to `find_element()`, but the same thing can be done with `find_element_by_xpath()`.
Check this document for details on the methods for specifying elements. https://selenium-python.readthedocs.io/locating-elements.html
This time, we get multiple elements with `find_elements()`.
The `find_element()` used for the click returns only the first element that matches the condition, while `find_elements()` returns a list of all matching elements.
On the profile page, the class `ItemLink__title` is set on the article-title elements in the list, so we rely on it to get the list of article titles.
article_links = driver.find_elements(By.XPATH, '//div[@class="ItemLink__title"]/a')
For a retrieved element, you can get the text inside the tag with `text`, and attributes such as `href` with `get_attribute()`.
print(article_links[0].text)
print(article_links[0].get_attribute('href'))
Finally, when processing is complete, call `quit()` to close the browser.
driver.quit()
If an error occurs during Selenium processing and `quit()` is never called, the browser keeps running and memory consumption grows steadily, so handle errors with a try/finally block (or similar) and make sure `quit()` is always called at the end of the program.
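As a minimal sketch of this always-quit pattern (using a stand-in object instead of a real WebDriver so it runs anywhere), a small context manager guarantees `quit()` is called even when an exception is raised:

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(driver):
    """Yield the driver and always call quit(), even when an error occurs."""
    try:
        yield driver
    finally:
        driver.quit()


# A stand-in object so the pattern can be shown without a real browser;
# with Selenium you would pass a webdriver.Remote or webdriver.Chrome here.
class FakeDriver:
    def __init__(self):
        self.quit_called = False

    def quit(self):
        self.quit_called = True


driver = FakeDriver()
try:
    with managed_driver(driver):
        raise RuntimeError('something went wrong mid-crawl')
except RuntimeError:
    pass

print(driver.quit_called)  # True: quit() ran despite the error
```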
Selenium can also be used for crawling. Use Selenium to render JavaScript and perform button clicks, then apply an HTML parser to the displayed page to extract elements.
You can extract elements with Selenium itself, but I recommend using a library such as BeautifulSoup, which offers convenient parsing functions and allows a more flexible implementation.
Specifically, as in the code below, the HTML obtained from `driver.page_source` is parsed to extract the data.
from bs4 import BeautifulSoup
from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument('--headless')

driver = webdriver.Remote(
    command_executor='http://localhost:4444/wd/hub',
    desired_capabilities=options.to_capabilities(),
    options=options,
)

driver.get('https://qiita.com/Chanmoro')

# Create a BeautifulSoup object from the HTML displayed in the browser and parse it
soup = BeautifulSoup(driver.page_source, 'html.parser')
articles = soup.select('.ItemLink')
for article in articles:
    # Print the article titles displayed on the profile page
    print(article.select_one('.ItemLink__title a').get_text())

driver.quit()
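To illustrate the same extract-from-`page_source` idea without a live browser or BeautifulSoup, here is a sketch using only the standard library's html.parser on a static snippet that mimics the class structure assumed above (the HTML and class names are made up for illustration):

```python
from html.parser import HTMLParser

# A static snippet mimicking the profile-page structure assumed above
HTML = """
<div class="ItemLink">
  <div class="ItemLink__title"><a href="/items/1">First article</a></div>
</div>
<div class="ItemLink">
  <div class="ItemLink__title"><a href="/items/2">Second article</a></div>
</div>
"""


class TitleExtractor(HTMLParser):
    """Collect the text of <a> tags inside div.ItemLink__title."""

    def __init__(self):
        super().__init__()
        self.in_title_div = False
        self.in_link = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'div' and 'ItemLink__title' in attrs.get('class', ''):
            self.in_title_div = True
        elif tag == 'a' and self.in_title_div:
            self.in_link = True

    def handle_endtag(self, tag):
        if tag == 'a':
            self.in_link = False
        elif tag == 'div':
            self.in_title_div = False

    def handle_data(self, data):
        if self.in_link:
            self.titles.append(data)


parser = TitleExtractor()
parser.feed(HTML)
print(parser.titles)  # ['First article', 'Second article']
```

In a real crawler you would feed `driver.page_source` into the parser instead of the static string; BeautifulSoup just makes the same traversal far more concise.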
For more information on how to use BeautifulSoup, please read the wonderful article "Beautiful Soup in 10 minutes"! (a shameless plug) https://qiita.com/Chanmoro/items/db51658b073acddea4ac
In this article, I introduced how to set up Selenium and its basic usage. Building an environment with Docker is very easy, and building one on a local PC is not too difficult either, so you can try either method right away.
As mentioned at the end, Selenium can be used not only to automate UI testing and browser operations, but also to crawl SPA-style sites that are rendered with JavaScript.
Also, for UI testing, there is a very useful mechanism called Selenium Grid that lets you run tests in multiple browsers at the same time. Selenium Grid pools multiple browser types such as Chrome, Firefox, and IE, as well as multiple browser versions, and runs the same test in parallel on all of them via a Hub.
Selenium Grid is also easy to set up with Docker, so see the Selenium Grid documentation and the docker-selenium README for details. https://selenium.dev/documentation/en/grid/ https://github.com/SeleniumHQ/docker-selenium
While writing this article, I read the History of Selenium and learned that ThoughtWorks engineers first created the concept and core functionality of Selenium. Personally, knowing it was made by people from ThoughtWorks somehow makes it feel even cooler.
There is also a very useful browser extension called Selenium IDE that records your manual browser operations and plays them back; it seems it was developed by Shinya Kasatani (@shinya) from Japan.
Selenium has had a revolutionary impact on web application development, and it is really amazing that such software was created, released as OSS, and is now widely used all over the world.
Now, let's all master Selenium, stand on the shoulders of giants, and enjoy a fun browser-automation life starting today!