This time, the goal is to use Selenium to control the Chrome browser and download files stored in a web service. A second goal is to do the same download in both PHP and Python and compare the results. The download was originally meant for attaching files to webmail, but that is now mostly possible from PHP code with services such as Slack, Facebook, and Cybozu. An editor who heard about this called it a hack, but it is basically the same as logging in and downloading the file yourself, so it is a legitimate process that can be done openly.
Below, the PHP version uses selenium-webdriver, which now feels more complicated than Python. To run it with PHP, I referred to the following article, which describes the various preparations required: "Automatically operate Chrome with PHP using selenium". After completing the preparation, start Selenium.
java -jar selenium-server-standalone.jar &
Some parts of the code below are not there by design. I arrived at it by trial and error and kept it because it logs in and downloads successfully. For example, I do not need the screenshot, but I left it in because the download worked when I took one. Selenium in PHP may be harder than it looks. I am looking forward to seeing what happens when I do the same thing in Python, but since Selenium has to be installed separately for Python as well, I wonder whether it is safe to do that in the same environment.
download.php
require_once './vendor/autoload.php';
use Facebook\WebDriver\Chrome\ChromeDriver;
use Facebook\WebDriver\Chrome\ChromeOptions;
use Facebook\WebDriver\Remote\RemoteWebDriver;
use Facebook\WebDriver\Remote\DesiredCapabilities;
use Facebook\WebDriver\WebDriverExpectedCondition;
use Facebook\WebDriver\WebDriverBy;
// Where to save the screenshot ($relative is assumed to be defined elsewhere)
$screenPath = $relative.'/g_screenshot.png';
// Specify the path of the downloaded chromedriver
$driverPath = '/usr/local/bin/chromedriver';
putenv("webdriver.chrome.driver=" . $driverPath);
// Options used when starting Chrome
$options = new ChromeOptions();
$options->addArguments([
    '--no-sandbox',
    '--headless', // Start headless. In headless mode the download folder setting is ignored.
    '--disable-gpu', // Provisionally required together with headless
    '--ignore-certificate-errors', // Do not show the SSL certificate error page ("The security certificate for this site is not trusted")
]);
$caps = DesiredCapabilities::chrome();
$caps->setCapability(ChromeOptions::CAPABILITY, $options);
$driver = ChromeDriver::start($caps, null, 1000*60*5, 1000*60*10);
$path = dirname(__FILE__).'/data'; // Download folder on the web server (relative to this program)
$this->setDownloadDir($driver, $path); // setDownloadDir() belongs to the enclosing class, which is not shown here
$driver->manage()->window()->maximize();
// Virtual login. $wtarget is the link to the file to download.
$driver->get($wtarget); // The file at this link will be downloaded.
$element = $driver->findElement(WebDriverBy::name('username'));
$element->sendKeys($wuser);
$element = $driver->findElement(WebDriverBy::name('password'));
$element->sendKeys($wpass);
$element->submit();
$driver->manage()->timeouts()->implicitlyWait(5);
$driver->takeScreenshot($screenPath);
//$driver->manage()->getCookies();
//Virtual login completed
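For comparison, here is a minimal sketch of the same login flow in Python. It is not the code I actually ran: it assumes Selenium 4 on Python 3, and the names `CHROME_ARGS`, `download_behavior_params` and `login_and_download` are made up for illustration. Because the download folder setting is ignored in headless mode, the sketch allows downloads through the DevTools command `Page.setDownloadBehavior`, which is one common way to do it and may or may not match what `setDownloadDir()` does in the PHP version.

```python
import os

# Chrome flags mirroring the PHP example.
CHROME_ARGS = [
    "--no-sandbox",
    "--headless",
    "--disable-gpu",
    "--ignore-certificate-errors",
]


def download_behavior_params(download_dir):
    """Build the parameters for the DevTools command Page.setDownloadBehavior."""
    return {"behavior": "allow", "downloadPath": os.path.abspath(download_dir)}


def login_and_download(target_url, user, password, download_dir, screenshot_path=None):
    """Log in through the form at target_url so that the linked file is downloaded."""
    # Imported here so the helpers above can be used without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.ChromeOptions()
    for arg in CHROME_ARGS:
        options.add_argument(arg)

    driver = webdriver.Chrome(options=options)
    try:
        driver.implicitly_wait(5)
        # Explicitly allow downloads into download_dir (assumption: needed in headless mode).
        driver.execute_cdp_cmd(
            "Page.setDownloadBehavior", download_behavior_params(download_dir)
        )
        driver.get(target_url)
        # The field names "username" and "password" are taken from the PHP code.
        driver.find_element(By.NAME, "username").send_keys(user)
        password_field = driver.find_element(By.NAME, "password")
        password_field.send_keys(password)
        password_field.submit()
        if screenshot_path:
            driver.save_screenshot(screenshot_path)
    finally:
        driver.quit()
```

It would be called as `login_and_download(url, user, password, "data")`. The function quits the browser right after submitting the form, so a real script would probably need to wait until the file appears in the folder first.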
Here is the problem: Python 3.7.5 (default, Nov 1 2019, 19:15:52) still gives an error, while the other interpreter, Python 2.7.17 (default, Oct 25 2019, 10:08:31), works fine. Perhaps the cause lies in the modules that each of the two environments loads.
webdriver.py
# An ongoing memo for myself.
from selenium import webdriver # from <module> import <driver>
# ImportError: cannot import name 'webdriver' from 'selenium' (unknown location)
I think it is most likely a path issue, but Python 3.7.5 still fails.
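One way to check the path theory is to ask each interpreter where it would load `selenium` from. The sketch below uses only the standard library; `describe_module` and `report` are hypothetical helper names. As far as I know, "(unknown location)" often appears when Python finds something called `selenium` that has no `__init__.py`, for example a plain directory on the search path or the remains of a broken install, but that is an assumption about this case, not a confirmed diagnosis.

```python
import importlib.util
import sys


def describe_module(name):
    """Report where the current interpreter would load the top-level module `name` from."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {"found": False, "origin": None, "locations": []}
    return {
        "found": True,
        # origin is typically None (or "namespace" on older versions) when the
        # match is a directory without __init__.py rather than a real package.
        "origin": spec.origin,
        "locations": list(spec.submodule_search_locations or []),
    }


def report(name="selenium"):
    """Print the interpreter in use and where `name` resolves to."""
    info = describe_module(name)
    print("interpreter:", sys.executable)
    print("version    :", sys.version.split()[0])
    print(name, "->", info)
    return info
```

Running `report()` with the Python 3.7.5 interpreter shows which directory it picks up. If `origin` is empty and `locations` points at an unexpected folder, that folder is probably shadowing the real package.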
When I first got Selenium running with PHP, I was also testing with a Python sample, but that previously working sample stopped at the point where it loaded the webdriver. The error message shows the versions of google-chrome and chromedriver, and it looks as if Chrome is crashing. On the PHP side as well, changing the currently installed google-chrome or chromedriver causes google-chrome to crash. There seems to be some dependency between the two versions, which worries me a little.
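Since chromedriver is generally expected to match the major version of the installed Chrome, a quick check of the two versions may help narrow this down. The following is a rough sketch, not a confirmed fix: it assumes both `google-chrome` and `chromedriver` are on the PATH and accept `--version`, and the function names are made up for illustration.

```python
import re
import subprocess


def major_version(version_output):
    """Extract the major version number from `--version` output, or None."""
    match = re.search(r"(\d+)\.\d+\.\d+", version_output)
    return int(match.group(1)) if match else None


def installed_major(command):
    """Run `<command> --version` and return its major version, or None on failure."""
    try:
        result = subprocess.run(
            [command, "--version"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return major_version(result.stdout)


def versions_match(chrome_cmd="google-chrome", driver_cmd="chromedriver"):
    """Return True only if both versions could be read and their major numbers agree."""
    chrome = installed_major(chrome_cmd)
    driver = installed_major(driver_cmd)
    return chrome is not None and chrome == driver
```

If `versions_match()` returns False, updating only one of the two programs is a likely reason for the crash described above.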