This time I will try operating Headless Chrome with Selenium.
The code is available on GitHub.
Headless Chrome is a mode, introduced in Google Chrome 59, that runs Chrome without displaying a window. This makes it usable for automated testing and in server environments that have no UI.
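To make the idea concrete, headless mode can also be started straight from the command line, without Selenium. The sketch below only builds the argument list and does not launch anything. The `--headless`, `--disable-gpu`, `--screenshot` and `--window-size` flags are the ones described in Google's Headless Chrome article linked at the end of this post; the binary name `google-chrome` and the helper name `headless_screenshot_command` are assumptions made for illustration.

```python
def headless_screenshot_command(url, binary="google-chrome", width=1280, height=1696):
    """Build the argv for taking a screenshot with Chrome in headless mode.

    `binary` is an assumption: the executable name differs per platform.
    """
    return [
        binary,
        "--headless",      # run without opening a window
        "--disable-gpu",   # recommended for early headless releases
        "--screenshot",    # writes screenshot.png to the working directory
        f"--window-size={width},{height}",
        url,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so this works without Chrome installed.
    print(" ".join(headless_screenshot_command("https://www.google.co.jp/")))
```

Passing the returned list to `subprocess.run` would launch Chrome, provided the binary exists under that name on the machine.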
This time I will use the libraries imported in the code below: selenium, selene and webdriver_manager.
Note: drivers installed by webdriver_manager are saved under ~/.wdm/ by default.
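To check what has been downloaded so far, the cache directory can simply be walked. This is a minimal sketch using only the standard library; the helper name `list_cached_drivers` is hypothetical, and nothing is assumed about the layout inside `~/.wdm/` beyond the default location mentioned above.

```python
from pathlib import Path


def list_cached_drivers(cache_dir=None):
    """Return every file under the driver cache, sorted by path.

    The default location ~/.wdm/ comes from the note above; the internal
    layout is not assumed, so the tree is walked recursively.
    """
    root = Path(cache_dir) if cache_dir is not None else Path.home() / ".wdm"
    if not root.is_dir():
        return []  # nothing has been downloaded yet
    return sorted(p for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    for path in list_cached_drivers():
        print(path)
```

Deleting the directory should force a fresh download on the next run, which can help when a cached driver no longer matches the installed Chrome.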
Install the latest pre-release version of selene. The libraries above are installed along with it as dependencies. selene also depends on six, but for some reason it was not installed automatically, so I install it separately.
pip install selene --pre
pip install six
The sample runs a simple Google search and takes a screenshot of the results.
sample_selene.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from selenium import webdriver
from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
from selene.driver import SeleneDriver
from webdriver_manager.chrome import ChromeDriverManager
# run chrome headless
options = Options()
options.add_argument('--headless')
# install chromedriver if not found and start chrome
driver = SeleneDriver.wrap(webdriver.Chrome(executable_path=ChromeDriverManager().install(), chrome_options=options))
# search 'python' in google
driver.get('https://www.google.co.jp/')
input = driver.find_element_by_name('q')
input.send_keys('Python')
input.send_keys(Keys.RETURN)
# save screen shot
driver.save_screenshot('result.png')
driver.quit()
Running the script saves a screenshot of the search results page as result.png.
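One thing to watch with headless runs: if an exception is raised before `driver.quit()` is reached, the Chrome process stays alive, and since no window is shown it is easy to miss. A possible guard, sketched here with `contextlib`, is a small context manager that always calls `quit()`. The names `quitting` and `FakeDriver` are hypothetical; `FakeDriver` only stands in for a real driver so that the example runs without Chrome.

```python
from contextlib import contextmanager


@contextmanager
def quitting(driver):
    """Yield the driver and always call driver.quit() afterwards."""
    try:
        yield driver
    finally:
        driver.quit()


class FakeDriver:
    """Hypothetical stand-in for a real WebDriver, used only for this demo."""

    def __init__(self):
        self.closed = False

    def quit(self):
        self.closed = True


if __name__ == "__main__":
    fake = FakeDriver()
    try:
        with quitting(fake) as driver:
            raise RuntimeError("simulated failure while browsing")
    except RuntimeError:
        pass
    print(fake.closed)  # True: quit() ran despite the exception
```

In the script above, the wrapped Chrome driver would be passed to `quitting` in place of `FakeDriver`, with the search and screenshot steps inside the `with` block.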
sample_selene_parallel.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from selenium import webdriver
from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
from selene.driver import SeleneDriver
from webdriver_manager.chrome import ChromeDriverManager
from joblib import Parallel, delayed
# search 'keyword' in google
def google(url, keyword):
    # run chrome headless
    options = Options()
    options.add_argument('--headless')
    driver = SeleneDriver.wrap(webdriver.Chrome(executable_path=ChromeDriverManager().install(), chrome_options=options))
    driver.get(url)
    input = driver.find_element_by_name('q')
    input.send_keys(keyword)
    input.send_keys(Keys.RETURN)
    # save screen shot
    driver.save_screenshot(keyword + '.png')
    driver.quit()

url = 'https://www.google.co.jp/'
keywords = ['Python', 'Google', 'Selenium']
# n_jobs=-1 means use all available CPU cores
Parallel(n_jobs=-1)(delayed(google)(url,keyword) for keyword in keywords)
Running the script saves one screenshot per keyword (Python.png, Google.png and Selenium.png). The Python result is omitted here.
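The `Parallel(n_jobs=-1)(delayed(google)(url, keyword) ...)` line issues one call per keyword and spreads the calls over workers. As a rough illustration that runs without Chrome or joblib, the sketch below swaps in the standard library's `concurrent.futures.ThreadPoolExecutor` and a placeholder task. It is not how joblib is implemented; `screenshot_name` and `run_all` are hypothetical names, and the placeholder only computes the file name that `google()` would write.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def screenshot_name(keyword):
    """Placeholder for google(): return the file name it would save."""
    return keyword + ".png"


def run_all(keywords, max_workers=None):
    """Run one task per keyword and return the results in input order."""
    # Assumption: mirror n_jobs=-1 by using one worker per CPU core.
    workers = max_workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(screenshot_name, keywords))


if __name__ == "__main__":
    print(run_all(["Python", "Google", "Selenium"]))
    # ['Python.png', 'Google.png', 'Selenium.png']
```

Each real task starts its own Chrome process, so on a small machine it may be worth capping the worker count rather than using every core.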
https://developers.google.com/web/updates/2017/04/headless-chrome
https://github.com/yashaka/selene
https://github.com/SergeyPirogov/webdriver_manager
http://qiita.com/orangain/items/db4594113c04e8801aad