When I view the page in a browser it is in Japanese, but when I download it with Scrapy I sometimes get the English page instead. This is because the Accept-Language header that Scrapy sends to the web server is `en` by default. You can request the Japanese page by adding the following to settings.py.
DEFAULT_REQUEST_HEADERS = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'ja,en-US;q=0.8,en;q=0.6',
}
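The Accept-Language value is a comma-separated list of language tags, each optionally followed by a `q` weight (a tag with no weight counts as 1.0, and higher weights are preferred). As a rough sketch, the same value can be generated from an ordered list of languages. The helper name `build_accept_language` and its `step` parameter are hypothetical, made up for this illustration; they are not part of Scrapy.

```python
def build_accept_language(languages, step=2):
    """Join language tags into an Accept-Language value.

    Hypothetical helper (not a Scrapy API). The first tag gets the
    implicit weight 1.0; each later tag gets a weight that is `step`
    tenths lower, never going below 0.1.
    """
    parts = []
    for i, tag in enumerate(languages):
        q10 = max(10 - step * i, 1)  # weight in tenths, e.g. 8 means q=0.8
        if q10 == 10:
            parts.append(tag)  # weight 1.0 is the default, so it is omitted
        else:
            parts.append(f"{tag};q=0.{q10}")
    return ",".join(parts)


# Same shape as the settings.py entry, with the language list made explicit.
DEFAULT_REQUEST_HEADERS = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': build_accept_language(['ja', 'en-US', 'en']),
}
```

To ask for a different language, change the list, for example `['fr', 'en']`. If only some requests need a different language, the same string can also be passed for a single request through the `headers` argument of `scrapy.Request`.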
Reference: https://doc.scrapy.org/en/latest/topics/settings.html#std:setting-DEFAULT_REQUEST_HEADERS