Get the URL of the HTTP redirect destination in Python

Overview

- Get the URL of the HTTP redirect destination in Python 3
- Tested environment: Python 3.8.5 + macOS Catalina

Source code

Save the following contents with the file name get_redirect.py.

get_redirect.py


import sys
import urllib.error
import urllib.request

# Handler class that does not follow redirects
class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
  # Override HTTPRedirectHandler.redirect_request
  def redirect_request(self, req, fp, code, msg, hdrs, newurl):
    self.newurl = newurl  # Keep the redirect destination URL
    return None  # Returning None stops urllib from following the redirect

# Function that returns the redirect destination URL
def get_redirect_url(src_url):
  # Build an opener whose handler does not follow redirects
  no_redirect_handler = NoRedirectHandler()
  opener = urllib.request.build_opener(no_redirect_handler)
  try:
    with opener.open(src_url) as res:
      return None  # The URL did not redirect
  except urllib.error.HTTPError:
    if hasattr(no_redirect_handler, "newurl"):
      return no_redirect_handler.newurl  # Return the redirect destination URL
    else:
      raise  # Re-raise: the error was not caused by a redirect

# Get the command-line argument
src_url = sys.argv[1]

# Get the redirect destination URL
redirect_url = get_redirect_url(src_url)

# Print the redirect destination URL
if redirect_url is not None:
  print(redirect_url)

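Execution example. Each run prints only the next redirect destination; the last command prints nothing because that URL does not redirect.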

$ python get_redirect.py https://bit.ly/3kmTOkc
https://t.co/yITSBp4ino
$ python get_redirect.py https://t.co/yITSBp4ino
https://qiita.com/niwasawa
$ python get_redirect.py https://qiita.com/niwasawa

Simplified version

Save the following contents with the file name get_redirect.py.

get_redirect.py


import sys
import urllib.request

# Function that returns the redirect destination URL
def get_redirect_url(src_url):
  with urllib.request.urlopen(src_url) as res:
    url = res.geturl()  # Get the final URL after all redirects
    if src_url == url:
      return None  # Not redirected: the final URL equals the given URL
    else:
      return url  # Redirected: the final URL differs from the given URL

# Get the command-line argument
src_url = sys.argv[1]

# Get the redirect destination URL
redirect_url = get_redirect_url(src_url)

# Print the redirect destination URL
if redirect_url is not None:
  print(redirect_url)

Execution example. The simplified version also sends requests to each redirect destination, so for a multi-stage redirect it prints the final URL rather than the next hop.

$ python get_redirect.py https://bit.ly/3kmTOkc
https://qiita.com/niwasawa
$ python get_redirect.py https://t.co/yITSBp4ino
https://qiita.com/niwasawa
$ python get_redirect.py https://qiita.com/niwasawa
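The short links in the example may stop resolving over time, so it can help to check the "final URL" behaviour against a throwaway local server. The following is a rough sketch only: `TwoHopHandler`, `final_url`, and `demo` are hypothetical names, and the /a to /b to /c redirect layout on 127.0.0.1 is an assumption made purely for illustration.

```python
import http.server
import threading
import urllib.request


class TwoHopHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical test server: /a -> /b -> /c, where /c answers 200."""

    REDIRECTS = {"/a": "/b", "/b": "/c"}

    def do_GET(self):
        target = self.REDIRECTS.get(self.path)
        if target is not None:
            self.send_response(302)
            self.send_header("Location", target)
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # Keep the demo output quiet


def final_url(src_url):
    """Same idea as the simplified version: let urllib follow every redirect."""
    with urllib.request.urlopen(src_url) as res:
        return res.geturl()


def demo():
    """Start the local server, request /a, and return (base URL, final URL)."""
    server = http.server.HTTPServer(("127.0.0.1", 0), TwoHopHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        base = f"http://127.0.0.1:{server.server_port}"
        return base, final_url(base + "/a")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    base, url = demo()
    # True when urllib followed both hops and ended at /c
    print(url == base + "/c")
```

Because urllib follows both hops internally, the intermediate URL /b never appears in the result, which mirrors how the simplified script skips the t.co hop in the execution example.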

Reference material

- [urllib.request --- extensible library for opening URLs — Python 3.8.5 documentation](https://docs.python.org/ja/3.8/library/urllib.request.html)
- How to get resources on the internet using the urllib package — Python 3.8.5 documentation
- HTTP Redirection - HTTP | MDN
