I am trying web scraping with urllib and BeautifulSoup in Python 3. Last time, I dealt with a communication error caused by a proxy: "What to do if there is no response due to proxy settings in Python web scraping". Communication over http worked well with that method, but as soon as the target was an https site, the connection failed with an error. That is a real problem, since most websites these days use https. Adding an "https" entry to proxies, as shown below, did not solve it.

proxies={"http":"http://proxy.-----.co.jp/proxy.pac", "https":"http://proxy.-----.co.jp/proxy.pac"}
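For context, the urllib setup from the previous article presumably looked something like the following minimal sketch (the proxy host is masked with "-----" just as above, and the target URL is only an example):

import urllib.request

# Proxy settings for both schemes (placeholder host)
proxies = {
    "http": "http://proxy.-----.co.jp/proxy.pac",
    "https": "http://proxy.-----.co.jp/proxy.pac",
}

# Register the proxy with urllib's global opener
proxy_handler = urllib.request.ProxyHandler(proxies)
opener = urllib.request.build_opener(proxy_handler)
urllib.request.install_opener(opener)

# http:// URLs worked this way; https:// URLs failed with a connection error
with urllib.request.urlopen("http://example.com/") as response:
    print(response.read()[:200])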
While researching the problem, I came across a library called requests. I tried using it instead of urllib, and it solved the problem surprisingly easily.
An example of how to use it is as follows.
requests_sample.py
import requests

# Set the proxy for both http and https (the proxy host is masked here)
proxies = {
    "http": "http://proxy.-----.co.jp/proxy.pac",
    "https": "http://proxy.-----.co.jp/proxy.pac"
}

# Unlike the urllib approach, https works with no extra configuration
r = requests.get('https://github.com/timeline.json', proxies=proxies)
print(r.text)
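As a side note, it can help to verify the response before using it. A small sketch, assuming the same placeholder proxy settings as above:

import requests

proxies = {
    "http": "http://proxy.-----.co.jp/proxy.pac",
    "https": "http://proxy.-----.co.jp/proxy.pac"
}

r = requests.get('https://github.com/timeline.json', proxies=proxies)
r.raise_for_status()  # raises requests.exceptions.HTTPError on 4xx/5xx responses
print(r.status_code, r.headers.get('Content-Type'))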
When using BeautifulSoup, it seems you just pass the content attribute of the Response object returned by requests.get to the BeautifulSoup constructor. Here is a simple sample.
requests_beautifulsoup_sample.py
import requests
from bs4 import BeautifulSoup

# Same proxy settings as before (the proxy host is masked here)
proxies = {
    'http': 'http://proxy.-----.co.jp/proxy.pac',
    'https': 'http://proxy.-----.co.jp/proxy.pac'
}

def getBS(url):
    # Fetch the page through the proxy and parse the HTML
    html = requests.get(url, proxies=proxies)
    bsObj = BeautifulSoup(html.content, "html.parser")
    return bsObj

htmlSource = getBS("https://en.wikipedia.org/wiki/Kevin_Bacon")

# Show the links that exist on the page
for link in htmlSource.findAll("a"):
    if 'href' in link.attrs:
        print(link.attrs['href'])
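A small variation, in case it is useful: find_all (the BeautifulSoup 4 name for findAll) can filter on the attribute directly, which makes the loop shorter. This snippet continues from the sample above and assumes the same htmlSource object:

# Only <a> tags that actually have an href attribute are returned,
# so the explicit 'href' in link.attrs check is no longer needed
for link in htmlSource.find_all("a", href=True):
    print(link["href"])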
The requests library was already included when I installed Python 3.5.2 with Anaconda. You can check the installed packages with Anaconda Navigator; if you installed the GUI version on Windows, you can find it under Windows -> All Programs -> Anaconda3 -> Anaconda Navigator.
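If you would rather check from code than from the Navigator, a quick sanity check is simply to import the library and print its version (just a sketch; this works in any Python interpreter, not only Anaconda's):

# Confirms that requests is importable and shows which version is installed
import requests
print(requests.__version__)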
See the Quickstart of the requests library for more details.