Text extraction (Read API) with Azure Computer Vision API (Python3.6)

Introduction

I tried extracting text from images with the Read API of Azure Computer Vision.

Development environment

Setup

1. Log in to the Azure portal (https://portal.azure.com/)
2. Create a resource for the Computer Vision API
3. Make a note of the key and endpoint
4. Install the required libraries

pip install matplotlib
pip install pillow
pip install opencv-python
pip install --upgrade azure-cognitiveservices-vision-computervision

5. Enter the key and endpoint you noted and run the code below.

subscription_key = "<your subscription key>"
endpoint = "<your API endpoint>"

The endpoint also seems to work when it is given in the region-based form:

endpoint = "https://<your region>.api.cognitive.microsoft.com/"
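Hard-coding the key in a script makes it easy to leak. Here is a minimal sketch, assuming the key and endpoint are kept in environment variables. The names COMPUTER_VISION_KEY and COMPUTER_VISION_ENDPOINT and both helper functions are made up for this example; the service does not require them. The second helper normalizes the trailing slash so the Read URL comes out the same for either endpoint form.

```python
import os

READ_PATH = "vision/v3.1/read/analyze"  # same path as used in this article


def load_settings(env=os.environ):
    """Return (key, endpoint) read from hypothetical variable names."""
    key = env["COMPUTER_VISION_KEY"]
    endpoint = env["COMPUTER_VISION_ENDPOINT"]
    return key, endpoint


def build_read_url(endpoint):
    """Join the endpoint and the Read path with exactly one slash."""
    return endpoint.rstrip("/") + "/" + READ_PATH


if __name__ == "__main__":
    # Illustrative values only; the real ones come from the Azure portal.
    fake_env = {
        "COMPUTER_VISION_KEY": "<your subscription key>",
        "COMPUTER_VISION_ENDPOINT": "https://<your region>.api.cognitive.microsoft.com",
    }
    key, endpoint = load_settings(fake_env)
    print(build_read_url(endpoint))
```

With this, an endpoint written with or without the final "/" produces the same request URL.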

Extract text from image URL

Quickstart: Extract printed and handwritten text using Computer Vision's REST API and Python (https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/quickstarts/python-hand-text)

import json
import os
import os.path
import sys
import requests
import time
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon
from PIL import Image
from io import BytesIO
# import cv2  # not needed for the URL example

subscription_key = "<your subscription key>"
endpoint = "<your API endpoint>"
# endpoint = "https://japanwest.api.cognitive.microsoft.com/"
text_recognition_url = endpoint + "vision/v3.1/read/analyze"

image_url = "https://raw.githubusercontent.com/MicrosoftDocs/azure-docs/master/articles/cognitive-services/Computer-vision/Images/readsample.jpg"
headers = {'Ocp-Apim-Subscription-Key': subscription_key}
data = {'url': image_url}
response = requests.post(text_recognition_url, headers=headers, json=data)
response.raise_for_status()

# The POST response carries an Operation-Location header; poll that URL for the result.
operation_url = response.headers["Operation-Location"]
analysis = {}
poll = True
while poll:
    response_final = requests.get(operation_url, headers=headers)
    analysis = response_final.json()

    print(json.dumps(analysis, indent=4))

    time.sleep(1)
    # Stop once the result is available or the operation has failed.
    if "analyzeResult" in analysis:
        poll = False
    if "status" in analysis and analysis["status"] == "failed":
        poll = False

polygons = []
if ("analyzeResult" in analysis):
    polygons = [(line["boundingBox"], line["text"])
                for line in analysis["analyzeResult"]["readResults"][0]["lines"]]

image = Image.open(BytesIO(requests.get(image_url).content))
ax = plt.imshow(image)
for polygon in polygons:
    vertices = [(polygon[0][i], polygon[0][i+1])
                for i in range(0, len(polygon[0]), 2)]
    text = polygon[1]
    patch = Polygon(vertices, closed=True, fill=False, linewidth=2, color='y')
    ax.axes.add_patch(patch)
    plt.text(vertices[0][0], vertices[0][1], text, fontsize=20, va="top")
plt.show()

Input image: readsample.jpg
Output figure: Figure_1.png
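The polling loop above runs forever if the operation never finishes. As a rough sketch of one way around that, the helper below gives up after a timeout. The function name, its parameters and the 30-second default are my own assumptions. The status strings are the ones that appear in this article. The HTTP call is passed in as a callable, so the sketch can be tried without network access.

```python
import time


def poll_read_result(fetch, interval=1.0, timeout=30.0, sleep=time.sleep):
    """Call fetch() until the status is 'succeeded' or 'failed'.

    fetch is any zero-argument callable that returns the decoded JSON dict,
    e.g. lambda: requests.get(operation_url, headers=headers).json()
    """
    waited = 0.0
    while True:
        analysis = fetch()
        status = analysis.get("status")
        if status in ("succeeded", "failed"):
            return analysis
        if waited >= timeout:
            raise TimeoutError(
                "Read operation still %r after %.0f s" % (status, waited))
        sleep(interval)
        waited += interval


if __name__ == "__main__":
    # Stand-in replies so the sketch runs without calling the service.
    replies = iter([
        {"status": "notStarted"},
        {"status": "running"},
        {"status": "succeeded", "analyzeResult": {"readResults": []}},
    ])
    result = poll_read_result(lambda: next(replies), sleep=lambda s: None)
    print(result["status"])
```

In the real script, fetch would wrap the requests.get call on the Operation-Location URL.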

Extract text from local image

import json
import os
import os.path
import sys
import requests
import time
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon
from PIL import Image
from io import BytesIO
import cv2

subscription_key = "<your subscription key>"
endpoint = "<your API endpoint>"
# endpoint = "https://japanwest.api.cognitive.microsoft.com/"
text_recognition_url = endpoint + "vision/v3.1/read/analyze"

headers = {'Ocp-Apim-Subscription-Key': subscription_key, 'Content-Type': 'application/octet-stream'}
filename = "readsample.jpg"
root, ext = os.path.splitext(filename)
# image_data = open(filename, "rb").read()
# Load with OpenCV so the same array can be displayed and re-encoded.
color = cv2.imread(filename, cv2.IMREAD_COLOR)
cv2.namedWindow("color", cv2.WINDOW_NORMAL)
cv2.imshow("color", color)
cv2.waitKey(1)
# tobytes() replaces the deprecated ndarray.tostring().
image_data = cv2.imencode(ext, color)[1].tobytes()
response = requests.post(text_recognition_url, headers=headers, data=image_data)
response.raise_for_status()

# The POST response carries an Operation-Location header; poll that URL for the result.
operation_url = response.headers["Operation-Location"]
analysis = {}
poll = True
while poll:
    response_final = requests.get(operation_url, headers=headers)
    analysis = response_final.json()

    print(json.dumps(analysis, indent=4))

    time.sleep(1)
    # Stop once the result is available or the operation has failed.
    if "analyzeResult" in analysis:
        poll = False
    if "status" in analysis and analysis["status"] == "failed":
        poll = False

polygons = []
if ("analyzeResult" in analysis):
    polygons = [(line["boundingBox"], line["text"])
                for line in analysis["analyzeResult"]["readResults"][0]["lines"]]

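# image = Image.open(BytesIO(image_data))
# OpenCV loads images as BGR; convert to RGB before handing the array to PIL.
image = Image.fromarray(cv2.cvtColor(color, cv2.COLOR_BGR2RGB))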
ax = plt.imshow(image)
for polygon in polygons:
    vertices = [(polygon[0][i], polygon[0][i+1])
                for i in range(0, len(polygon[0]), 2)]
    text = polygon[1]
    patch = Polygon(vertices, closed=True, fill=False, linewidth=2, color='y')
    ax.axes.add_patch(patch)
    plt.text(vertices[0][0], vertices[0][1], text, fontsize=20, va="top")
plt.show()

Input image: readsample.jpg
Output figure: Figure_3.png
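Both scripts read only readResults[0], which is fine for a single image. As a sketch of how the same JSON could be flattened for results with several pages, the helper below walks every entry in readResults. The function name is hypothetical. The sample dict is hand-written in the shape the scripts above index into, using the values for "over" from the sample image.

```python
def extract_lines(analysis):
    """Return [(text, [(x, y), ...]), ...] for every page in the result."""
    lines = []
    for page in analysis.get("analyzeResult", {}).get("readResults", []):
        for line in page.get("lines", []):
            box = line["boundingBox"]
            # boundingBox is a flat list [x1, y1, x2, y2, ...]; pair it up.
            vertices = list(zip(box[0::2], box[1::2]))
            lines.append((line["text"], vertices))
    return lines


if __name__ == "__main__":
    # Hand-written sample, not a captured API response.
    sample = {
        "status": "succeeded",
        "analyzeResult": {
            "readResults": [
                {"lines": [
                    {"boundingBox": [184.0, 1053.0, 508.0, 1044.0,
                                     510.0, 1123.0, 184.0, 1128.0],
                     "text": "over"},
                ]},
            ],
        },
    }
    for text, vertices in extract_lines(sample):
        print(text, vertices)
```

The vertex pairs can go straight into matplotlib's Polygon, as in the drawing loop above.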

Use the Computer Vision client library

Quickstart: Use the Computer Vision client library (https://docs.microsoft.com/en-us/azure/cognitive-services/Computer-vision/quickstarts-sdk/client-library?pivots=programming-language-python&tabs=visual-studio)

from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision.models import OperationStatusCodes
from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
from msrest.authentication import CognitiveServicesCredentials

from array import array
import os
from PIL import Image
import sys
import time
import cv2 
from io import BytesIO

subscription_key = "<your subscription key>"
endpoint = "<your API endpoint>"
# endpoint = "https://japanwest.api.cognitive.microsoft.com/"

computervision_client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(subscription_key))

print("===== Batch Read File - remote =====")
remote_image_handw_text_url = "https://raw.githubusercontent.com/MicrosoftDocs/azure-docs/master/articles/cognitive-services/Computer-vision/Images/readsample.jpg"

recognize_handw_results = computervision_client.read(remote_image_handw_text_url, raw=True)
operation_location_remote = recognize_handw_results.headers["Operation-Location"]
operation_id = operation_location_remote.split("/")[-1]

while True:
    get_handw_text_results = computervision_client.get_read_result(operation_id)
    if get_handw_text_results.status not in ['notStarted', 'running']:
        break
    time.sleep(1)

if get_handw_text_results.status == OperationStatusCodes.succeeded:
    for text_result in get_handw_text_results.analyze_result.read_results:
        for line in text_result.lines:
            print(line.text)
            print(line.bounding_box)
print()

Output:

===== Batch Read File - remote =====
The quick brown fox jumps
[38.0, 650.0, 2572.0, 699.0, 2570.0, 854.0, 37.0, 815.0]
over
[184.0, 1053.0, 508.0, 1044.0, 510.0, 1123.0, 184.0, 1128.0]
the lazy dog!
[639.0, 1011.0, 1976.0, 1026.0, 1974.0, 1158.0, 637.0, 1141.0]
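Each bounding box printed above is a flat list of four (x, y) corners, and the quadrilateral can be slightly tilted. If an upright rectangle is needed, for example to crop the line out of the image, one possible sketch is to take the minimum and maximum of the coordinates. The helper name is my own and is not part of the client library.

```python
def bounding_rect(box):
    """Return (left, top, right, bottom) enclosing a flat [x1, y1, ...] box."""
    xs = box[0::2]  # every even index is an x coordinate
    ys = box[1::2]  # every odd index is a y coordinate
    return (min(xs), min(ys), max(xs), max(ys))


if __name__ == "__main__":
    # Box of "The quick brown fox jumps" from the output above.
    box = [38.0, 650.0, 2572.0, 699.0, 2570.0, 854.0, 37.0, 815.0]
    print(bounding_rect(box))
```

The tuple is in the (left, upper, right, lower) order that PIL's Image.crop expects.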

That's all. Thanks for reading.
