I tried installing Chrome on the Anaconda image, but the build still took a long time, so I wondered whether the Chrome part could be split out into a separate image. I then learned about Selenium Grid, a mechanism for driving WebDriver through a REST API, and tried it.
An official Docker image for Selenium Grid is available, so I used that.
SeleniumHQ/docker-selenium: Docker images for Selenium Grid Server (Standalone, Hub, and Nodes).
If you want to use multiple browsers, you need to launch a Hub plus a Node for each browser, but this time I only wanted to try Chrome, so I used the Standalone image.
docker-compose.yml
version: "3"
services:
  chrome:
    image: selenium/standalone-chrome
    ports:
      - 4444:4444
    volumes:
      - /dev/shm:/dev/shm
The REST API is exposed on port 4444.
The API for driving WebDriver appears to live under /wd/hub.
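To make "driving WebDriver through a REST API" concrete, here is a minimal sketch of the HTTP requests involved, using only the Python standard library. It assumes the server accepts W3C WebDriver-style requests under the /wd/hub prefix; the helper names (new_session_request, delete_session_request, call) are made up for illustration and are not part of Selenium.

```python
import json
import urllib.request

# Base URL taken from the compose file: port 4444, API under /wd/hub.
BASE_URL = "http://localhost:4444/wd/hub"


def new_session_request(base_url=BASE_URL, browser="chrome"):
    """Build (but do not send) the POST that asks for a new browser session."""
    # Assumption: the server understands the W3C "capabilities" payload.
    payload = {"capabilities": {"alwaysMatch": {"browserName": browser}}}
    return urllib.request.Request(
        f"{base_url}/session",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def delete_session_request(session_id, base_url=BASE_URL):
    """Build the DELETE that closes a session and frees the browser."""
    return urllib.request.Request(f"{base_url}/session/{session_id}", method="DELETE")


def call(request):
    """Send a request and return the "value" field of the JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["value"]


# With the container running, a session could be opened and closed like this:
#   session = call(new_session_request())
#   call(delete_session_request(session["sessionId"]))
```

In practice the selenium package issues these requests for you, so this is only meant to show what travels over port 4444.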
$ docker-compose up
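The server may need a few seconds after the container starts before it accepts sessions. The following is a rough sketch of a readiness check, assuming the server exposes the WebDriver status endpoint at /wd/hub/status and replies with JSON whose value.ready field is true when it can take sessions. The function names is_ready and wait_for_grid are hypothetical helpers, not Selenium APIs.

```python
import json
import time
import urllib.request

# Assumption: the status endpoint sits under the same /wd/hub prefix.
STATUS_URL = "http://localhost:4444/wd/hub/status"


def is_ready(status):
    """Return True if a decoded status reply reports the server as ready."""
    value = status.get("value")
    return isinstance(value, dict) and value.get("ready") is True


def wait_for_grid(url=STATUS_URL, timeout=30.0, interval=1.0):
    """Poll the status URL until it reports ready or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if is_ready(json.load(response)):
                    return True
        except (OSError, ValueError):
            # Connection refused, timed out, or the reply was not valid JSON yet.
            pass
        time.sleep(interval)
    return False
```

Calling wait_for_grid() before creating a driver would let a script start right after docker-compose up without failing on a refused connection.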
To use Selenium Grid from Python, use selenium.webdriver.Remote.
$ pip3 install selenium
main.py
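import sys

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities


def search(driver, query):
    """Search Google for `query` and return (title, url) of the top result."""
    # Note: the find_element_by_* helpers and desired_capabilities come from
    # the Selenium 3 API; later Selenium 4 releases removed them.
    driver.get('https://google.com/')
    # Type the query into the search box (name="q") and submit the form.
    q = driver.find_element_by_name('q')
    q.send_keys(query)
    q.submit()
    # Take the first result block and read its heading and link.
    r = driver.find_element_by_class_name('g').find_element_by_class_name('r')
    title = r.find_element_by_tag_name('h3').text
    url = r.find_element_by_tag_name('a').get_attribute('href')
    return title, url


if __name__ == '__main__':
    query = ' '.join(sys.argv[1:])
    options = {
        # Address of the Selenium Grid container started above.
        'command_executor': 'http://localhost:4444/wd/hub',
        'desired_capabilities': DesiredCapabilities.CHROME,
    }
    # The context manager quits the remote browser session on exit.
    with webdriver.Remote(**options) as driver:
        title, url = search(driver, query)
        print(f'{title}\n{url}')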
This script runs a Google search and prints the title and URL of the top result.
$ python3 main.py qiita
Qiita
https://qiita.com/