[Ruby] I tried text extraction (OCR) with Ruby using Vision API (trained machine learning model)


What is the Vision API?

The Google Cloud Vision API provides powerful pre-trained machine learning models via REST and RPC APIs. By assigning labels to images, you can quickly group images into millions of pre-defined categories. It detects objects and faces, reads printed text and handwriting, and creates useful metadata in image catalogs. (Quoted from the official documentation)

Image for analysis

A screenshot of the Assign Navi top page (PNG format)

[Image: anavi.png]

Analysis result

$ bundle exec rake cloud_vision:text['app/assets/images/anavi.png']
assign navi
New way of working
Find a project/personnel
User Guide
About service
Member registration
Meeting IT projects and external human resources
Efficiency with technology

You can extract text with quite high accuracy (^^)

Code created this time


## How to call
## $ bundle exec rake cloud_vision:text[image_file] # image_file is the path to the image file

# Load the Google Cloud client library
require "google/cloud/vision"

namespace :cloud_vision do
  desc 'Run OCR.'
  task :text, [:image_file] do |task, args|
    # Use `next`, not `return`, to leave a rake task block early
    next unless args[:image_file]

    # Instantiate client
    image_annotator = Google::Cloud::Vision.image_annotator

    # Run OCR
    response = image_annotator.text_detection(
      image: args[:image_file],
      max_results: 1 # optional, defaults to 10
    )

    # Display OCR result
    response.responses.each do |res|
      res.text_annotations.each do |text|
        puts text.description
      end
    end
  end
end
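
As a follow-up to the task above: the Vision API typically returns the whole detected text as the first entry of text_annotations, followed by one entry per detected word. If you only want the full text, a minimal sketch could look like the following. The helper name `full_text` is hypothetical (it is not part of the google-cloud-vision gem), and the Struct objects are stand-ins for the real response so that the example runs without credentials.

```ruby
# Hypothetical helper (not part of the google-cloud-vision gem):
# returns the description of the first text annotation of each response,
# skipping responses that contain no text.
def full_text(responses)
  responses.map { |res| res.text_annotations.first&.description }.compact
end

# Stand-ins for the real API response objects, for illustration only.
Annotation = Struct.new(:description)
Response = Struct.new(:text_annotations)

sample = [
  Response.new([
    Annotation.new("assign navi\nUser Guide"), # assumed: full text comes first
    Annotation.new("assign"),                  # then one entry per word
    Annotation.new("navi")
  ]),
  Response.new([]) # an image with no detected text
]

puts full_text(sample)
```

In the rake task, this would presumably be called as `puts full_text(response.responses)` in place of the nested loops.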

Environment variables are defined in a .env file and loaded with dotenv

# .env file
GOOGLE_CLOUD_PROJECT="Enter your project ID here"
GOOGLE_APPLICATION_CREDENTIALS="Enter the path to the authentication key json file to access the Vision API"
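
Because the client library picks up its credentials from these variables, it can help to fail early with a clear message when one of them is missing. Here is a minimal sketch, assuming the two variable names from the .env file above. `missing_vision_env` and `REQUIRED_VISION_ENV` are hypothetical names, not part of dotenv or the Vision client.

```ruby
# Variable names taken from the .env file above.
REQUIRED_VISION_ENV = %w[GOOGLE_CLOUD_PROJECT GOOGLE_APPLICATION_CREDENTIALS].freeze

# Hypothetical helper: returns the names of required variables that are
# unset or blank in the given environment (ENV by default).
def missing_vision_env(env = ENV)
  REQUIRED_VISION_ENV.select { |key| env[key].to_s.strip.empty? }
end

# Example with a plain Hash standing in for ENV.
example_env = { "GOOGLE_CLOUD_PROJECT" => "my-project" }
missing = missing_vision_env(example_env)
puts "Missing: #{missing.join(', ')}" unless missing.empty?
```

At the top of the rake task, something like `abort "Missing: ..." unless missing_vision_env.empty?` would stop the run before any API call is made.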

Setup procedure

The setup procedure will be added soon.


Using the Vision API with Ruby

Your LGTMs and follows encourage me to keep posting every day. Thank you for your kind click. m(_ _)m