When you copy and paste text out of a PDF to translate an English dissertation, stray line breaks and odd characters often get mixed in, depending on the PDF. I was already cleaning that text up with Python, so I thought it would be easier to translate it in the same script. Here is how.
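The cleanup step itself is not shown in this post. As a rough sketch of what it might look like, assuming the usual problems are hard-wrapped lines, words hyphenated across line breaks, and ligature characters such as "ﬁ" (the function name `clean_pdf_text` is made up for this example):

```python
import re
import unicodedata


def clean_pdf_text(raw):
    """Hypothetical helper: tidy text copied from a PDF before translating it."""
    # Fold compatibility characters (e.g. the "fi" ligature) into plain letters.
    text = unicodedata.normalize("NFKC", raw)
    # Normalize Windows / old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words split by a hyphen at a line break: "transla-\ntion" -> "translation".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Treat blank lines as paragraph breaks; inside a paragraph,
    # collapse line breaks and repeated spaces into single spaces.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)
```

Note that the hyphen rule also joins words that are legitimately hyphenated and happen to break at the hyphen (for example "well-" / "known"), so the result may still need a quick look.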
The Google Translate API is a paid service, so instead this drives a headless browser with selenium and uses the Google Translate web page.
pip install selenium
npm install -g phantomjs-prebuilt
from selenium.webdriver import PhantomJS
import time
driver = PhantomJS()
driver.get("https://translate.google.co.jp/?um=1&ie=UTF-8&hl=ja&client=tw-ob#en/ja/")
def eng2jp(eng_text):
    driver.find_element_by_id("source").clear()
    driver.find_element_by_id("source").send_keys(eng_text)
    driver.find_element_by_id("gt-submit").click()
    time.sleep(0.1)  # wait a bit for the result to appear
    return "".join([i.text for i in driver.find_elements_by_xpath('//span[@id="result_box"]//span')])
eng2jp("Element is no longer attached to the DOM")
# returns 'Translate English to Japanese'
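The fixed 0.1 second sleep may be too short on a slow connection, in which case the result box could still be empty when it is read. One possible alternative, sketched here as an assumption rather than something from the original code, is to poll until some text shows up. `wait_for_text` is a made-up helper name; it takes any zero-argument function that returns the current text.

```python
import time


def wait_for_text(get_text, timeout=5.0, interval=0.1):
    """Hypothetical helper: call get_text() until it returns non-empty text.

    Raises TimeoutError if nothing appears within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        text = get_text()
        if text:
            return text
        if time.monotonic() >= deadline:
            raise TimeoutError("no text appeared within %.1f s" % timeout)
        time.sleep(interval)

# Possible use inside eng2jp, replacing the sleep and the return line:
#   return wait_for_text(lambda: "".join(
#       i.text for i in driver.find_elements_by_xpath('//span[@id="result_box"]//span')))
```

This only checks that the text is non-empty, so if the previous translation is still displayed it can be returned by mistake. Keeping a short sleep before polling is one simple way to reduce that risk.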