I tried text extraction (OCR) in Ruby using the Vision API (a pre-trained machine learning model)

What is the Vision API?

Google Cloud's Vision API provides powerful pre-trained machine learning models through REST and RPC APIs. It can assign labels to images and quickly classify them into millions of predefined categories. It detects objects and faces, reads printed and handwritten text, and builds useful metadata into an image catalog. (Quoted from the official documentation)
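
Since the quote mentions a REST API, here is a minimal sketch of what a text-detection request body could look like when built by hand in Ruby. It assumes the v1 images:annotate endpoint and its TEXT_DETECTION feature type. The helper name build_text_detection_request is hypothetical, and the sketch only builds the JSON; it does not send anything.

```ruby
require "json"

# Hypothetical helper (not part of any gem): builds the JSON body for a
# Vision API v1 images:annotate request that asks for TEXT_DETECTION
# on a single image. image_bytes is the raw binary content of the file.
def build_text_detection_request(image_bytes)
  {
    requests: [
      {
        # pack("m0") produces Base64 without line breaks
        image: { content: [image_bytes].pack("m0") },
        features: [{ type: "TEXT_DETECTION" }]
      }
    ]
  }.to_json
end

# Usage sketch: the resulting body would be POSTed, with credentials, to
# https://vision.googleapis.com/v1/images:annotate
# body = build_text_detection_request(File.binread("app/assets/images/anavi.png"))
```

The rake task later in this article uses the official client library instead, which takes care of encoding and authentication.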

Image analyzed

A screenshot of the Assign Navi top page (PNG format)


Analysis result

$ bundle exec rake cloud_vision:text['app/assets/images/anavi.png']
assign navi
New way of working
Matter/Find talent
User Guide
About service
Member registration
IT projects, encounters with external human resources
Efficiency with technology

The text was extracted with fairly high accuracy (^^)

The code I wrote this time


# How to call:
#   $ bundle exec rake cloud_vision:text[image_file]
#   Replace image_file with the path to the image file.

# Load the Google Cloud client library
require "google/cloud/vision"

namespace :cloud_vision do
  desc 'Run OCR.'
  task :text, [:image_file] do |task, args|
    # Use next here: return inside a task block raises LocalJumpError
    next unless args[:image_file]

    # Instantiate the client
    image_annotator = Google::Cloud::Vision.image_annotator

    # Run OCR
    response = image_annotator.text_detection(
      image: args[:image_file],
      max_results: 1 # optional, defaults to 10
    )

    # Display the OCR results
    response.responses.each do |res|
      res.text_annotations.each do |text|
        puts text.description
      end
    end
  end
end
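
When reading the results, it helps to know that for text detection the first entry in text_annotations holds the whole detected text, and the following entries hold the individual words. The sketch below separates the two. The Annotation struct is only a stand-in for the gem's annotation objects, the helper names full_text and words are hypothetical, and the sample data is made up for illustration.

```ruby
# Stand-in for the client library's annotation objects.
# Only #description is used here.
Annotation = Struct.new(:description)

# Hypothetical helper: the full detected text, or "" when nothing was found.
def full_text(text_annotations)
  first = text_annotations.first
  first ? first.description : ""
end

# Hypothetical helper: the individual words (every entry after the first).
def words(text_annotations)
  text_annotations.drop(1).map(&:description)
end

# Made-up sample shaped like a text detection result
sample = [
  Annotation.new("assign navi\nUser Guide"),
  Annotation.new("assign"),
  Annotation.new("navi"),
  Annotation.new("User"),
  Annotation.new("Guide")
]

puts full_text(sample)
puts words(sample).inspect
```

In the rake task, the same idea would mean printing only res.text_annotations.first instead of looping over every entry.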

Environment variables are set with dotenv

# .env file
GOOGLE_CLOUD_PROJECT="Enter the project ID here"
GOOGLE_APPLICATION_CREDENTIALS="Enter the path to the authentication key JSON file used to access the Vision API"
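
A small check like the following can catch a missing setting before the API is called. This is only a sketch: the helper name missing_vision_env and the constant REQUIRED_VISION_ENV are hypothetical, and it assumes the two variables above are the only ones needed.

```ruby
# Variable names taken from the .env file above
REQUIRED_VISION_ENV = %w[GOOGLE_CLOUD_PROJECT GOOGLE_APPLICATION_CREDENTIALS].freeze

# Hypothetical helper: returns the names of required variables that are
# unset or blank. env defaults to the process environment, but any
# Hash-like object can be passed in.
def missing_vision_env(env = ENV)
  REQUIRED_VISION_ENV.select { |key| env[key].to_s.strip.empty? }
end

# Outside Rails, the .env file would first be loaded with
# require "dotenv/load" (assumption: the dotenv gem is installed).
missing = missing_vision_env
warn "Missing environment variables: #{missing.join(', ')}" unless missing.empty?
```

Taking the environment as an argument keeps the check easy to try with a plain Hash.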

Setup procedure


Using the Vision API with Ruby

