This article shows how to use GCP's Vision API while skipping the difficult parts. Even if this is your first time, you can get it working by following the screenshots.
Google Vision API: https://cloud.google.com/vision/docs/ocr/?hl=ja (I was able to proceed without installing the Google Cloud SDK, possibly because I installed google-cloud-vision with pip.)
Environment and prerequisites: Windows 10, Python 3.7 installed, a GCP account created, and the image you want to read prepared.
First, get the Google Vision API library. Enter the following command.
pip install --upgrade google-cloud-vision
After installation, test it. Start Python and run the following.
from google.cloud import vision
If this does not raise an error, the installation is fine. If the library is not installed correctly, an error like the following is displayed.
Python 3.7.7 (default, May 6 2020, 11:45:54) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from google.cloud import vision
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'vision' from 'google.cloud' (unknown location)
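The same check can be scripted so that it prints a short message instead of a traceback. This is only a minimal sketch: the helper name can_import is made up for illustration and is not part of any Google library.

```python
import importlib


def can_import(module_name):
    """Return True if module_name can be imported, False otherwise."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so it is covered too.
        return False
    return True


if can_import("google.cloud.vision"):
    print("google-cloud-vision is installed")
else:
    print("Not found: run 'pip install --upgrade google-cloud-vision' again")
```

If the second message appears, it is worth checking that pip and python belong to the same environment (for example, the same Anaconda environment).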
This time we will set up a new GCP project. Click New Project and give the project a descriptive name.
After creating the project, enable the Vision API. Click in the order of the numbers in the image: enter "google vision api" in the search box, click the item that is displayed, and then click the Enable button.
Next, create a service account. Select the items in the order of the numbers in the image. Enter a suitable service account name; the ID is filled in automatically, so you can leave it as it is. Select an appropriate role. On the next screen, finish without changing anything. The service account has now been added. Click the icon shown in the image and select "Create Key". Select JSON and press the Finish button. Move the downloaded file to any location you like.
Set an environment variable. Variable name: GOOGLE_APPLICATION_CREDENTIALS. Value: the full path of the downloaded JSON file (example: C:\Users\xxxxxx\Desktop\xxxxxxx.json).
The setting procedure is as follows. Proceed in the order of the numbers in the image.
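Before calling the API, it can help to confirm that Python actually sees the variable and that it points at a service account key. The following is a rough sketch, not an official check: the function name check_credentials and its messages are assumptions made for this example. It relies only on the "type" and "client_email" fields that a downloaded service account key contains.

```python
import json
import os


def check_credentials(env=None):
    """Return (ok, message) describing the GOOGLE_APPLICATION_CREDENTIALS setting."""
    env = os.environ if env is None else env
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        return False, "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(key_path):
        return False, "Key file not found: {}".format(key_path)
    try:
        with open(key_path, encoding="utf-8") as key_file:
            key = json.load(key_file)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        return False, "Key file is not valid JSON: {}".format(key_path)
    # Service account keys downloaded from the console carry this type field.
    if not isinstance(key, dict) or key.get("type") != "service_account":
        return False, "Key file is not a service account key"
    return True, "Using key for {}".format(key.get("client_email", "(unknown account)"))


print(check_credentials()[1])
```

If the variable is reported as not set right after you added it, open a new command prompt; windows that were already open keep the old environment.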
Modify the sample source code taken from GitHub as follows.
detext.py
"""Detects text in the file."""
from google.cloud import vision
import io
client = vision.ImageAnnotatorClient()

# [START vision_python_migration_text_detection]
path = "C:\\Users\\xxxx\\Desktop\\gcptest\\xxxxx.png"
with io.open(path, 'rb') as image_file:
    content = image_file.read()

# google-cloud-vision 2.0 and later expose Image directly;
# releases before 2.0 used vision.types.Image instead.
image = vision.Image(content=content)

response = client.text_detection(image=image)
texts = response.text_annotations
print('Texts:')

for text in texts:
    # The detected string.
    print('\n"{}"'.format(text.description))

    # Corner points of the region that contains the string.
    vertices = (['({},{})'.format(vertex.x, vertex.y)
                for vertex in text.bounding_poly.vertices])

    print('bounds: {}'.format(','.join(vertices)))

# Errors for this image are reported in response.error.
if response.error.message:
    raise Exception(
        '{}\nFor more info on error messages, check: '
        'https://cloud.google.com/apis/design/errors'.format(
            response.error.message))
You can change the image by editing the path = "C:\\Users\\xxxx\\Desktop\\gcptest\\xxxxx.png" part on the 7th line of the source code. The following image is used as an example.
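If editing the source for every image becomes tedious, the path could instead be taken from the command line. The following is a sketch under that assumption; the function name read_image_bytes is hypothetical, and the original script does not do this.

```python
import argparse
from pathlib import Path


def read_image_bytes(argv=None):
    """Parse an image path from the command line and return (path, file contents)."""
    parser = argparse.ArgumentParser(description="Read an image file for text detection.")
    parser.add_argument("image", type=Path, help="path to the image file")
    # argv=None makes argparse read sys.argv; a list can be passed for testing.
    args = parser.parse_args(argv)
    return args.image, args.image.read_bytes()
```

Calling read_image_bytes() with no arguments inside a script reads the path given on the command line (for example, python detext.py followed by the image path), and the returned bytes could be passed as the content argument when building the image object.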
Output result

"Do your best Japan Meiji chocolate snack Mountain of mushrooms Fragrant strawberry flavor only TOKTO 0 20 ES Child) ©Tokyo 2020 "
bounds: (1,66),(745,66),(745,954),(1,954)

"Kanbare"
bounds: (72,80),(351,73),(353,152),(74,159)

"Nippon"
bounds: (353,73),(619,66),(621,145),(355,152)

"Meiji"
bounds: (208,151),(305,150),(305,190),(208,191)

"chocolate"
bounds: (307,151),(405,150),(405,189),(307,190)

"snack"
bounds: (394,156),(535,155),(535,184),(394,185)

"mushroom"
bounds: (33,167),(471,157),(475,354),(37,364)

"of"
bounds: (473,157),(569,155),(573,352),(477,354)

"Mountain"
bounds: (571,155),(735,151),(739,348),(575,352)

"Kaoru"
bounds: (131,403),(314,390),(322,502),(139,516)

"Strawberry"
bounds: (315,390),(596,370),(604,482),(323,503)

"taste"
bounds: (598,370),(671,365),(679,476),(606,482)

"only"
bounds: (637,551),(736,527),(745,564),(646,588)

"TOKTO"
bounds: (236,697),(260,690),(263,701),(239,708)

"0"
bounds: (262,691),(277,687),(280,696),(265,701)

"20"
bounds: (1,804),(34,804),(34,829),(1,829)

"ES"
bounds: (1,828),(16,828),(16,844),(1,844)

"Child"
bounds: (1,910),(18,910),(18,932),(1,932)

")"
bounds: (19,912),(24,912),(24,927),(19,927)

"©Tokyo"
bounds: (72,924),(150,921),(151,951),(73,954)

"2020"
bounds: (157,921),(210,919),(211,949),(158,951)
As expected, a mountain of mushrooms
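In the output result, the first entry appears to hold the entire detected text, and the entries after it hold individual words with their own corner points. The following sketch separates the two and turns each word's four corners into a simple left/top/width/height box. The function name split_annotations and the stand-in objects built by _fake are made up for this example so that it runs without calling the API; the "Meiji" and "chocolate" coordinates are copied from the output, while the first stand-in entry is invented.

```python
from types import SimpleNamespace


def split_annotations(annotations):
    """Return (full_text, words) from a sequence of text annotations."""
    if not annotations:
        return "", []
    # Assumption based on the output result: entry 0 is the whole text.
    full_text = annotations[0].description
    words = []
    for annotation in annotations[1:]:
        xs = [v.x for v in annotation.bounding_poly.vertices]
        ys = [v.y for v in annotation.bounding_poly.vertices]
        words.append({
            "text": annotation.description,
            "left": min(xs),
            "top": min(ys),
            "width": max(xs) - min(xs),
            "height": max(ys) - min(ys),
        })
    return full_text, words


def _fake(text, points):
    """Build a stand-in annotation for demonstration; not a Vision API object."""
    vertices = [SimpleNamespace(x=x, y=y) for x, y in points]
    return SimpleNamespace(description=text,
                           bounding_poly=SimpleNamespace(vertices=vertices))


sample = [
    _fake("Meiji chocolate", [(208, 150), (405, 150), (405, 191), (208, 191)]),
    _fake("Meiji", [(208, 151), (305, 150), (305, 190), (208, 191)]),
    _fake("chocolate", [(307, 151), (405, 150), (405, 189), (307, 190)]),
]
full_text, words = split_annotations(sample)
print(full_text)
for word in words:
    print(word)
```

With a real response, response.text_annotations could be passed in place of sample, since the function only reads description and bounding_poly.vertices.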