I tried text extraction (OCR) in Ruby using the Vision API (pre-trained machine learning model)

What is the Vision API?

Google Cloud's Vision API provides powerful pre-trained machine learning models through REST and RPC APIs. It can assign labels to images and quickly classify them into millions of predefined categories. It also detects objects and faces, reads printed and handwritten text, and builds useful metadata into your image catalog. (Quoted from the official documentation)
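
To make the "REST API" part a little more concrete, here is a minimal sketch of the JSON body that a text detection request to the REST endpoint (POST https://vision.googleapis.com/v1/images:annotate) would carry, as far as I understand the API. The helper name build_text_detection_body is my own invention for illustration, and the image bytes are a placeholder. Authentication and the HTTP call itself are left out.

```ruby
require "json"

# Build the JSON request body for TEXT_DETECTION.
# build_text_detection_body is a hypothetical helper, not part of any library.
def build_text_detection_body(image_bytes)
  {
    requests: [
      {
        # The image is sent inline as Base64 ("m0" = Base64 without line breaks).
        image: { content: [image_bytes].pack("m0") },
        features: [{ type: "TEXT_DETECTION" }]
      }
    ]
  }.to_json
end

# Placeholder bytes; a real request would pass File.binread(path_to_image).
puts build_text_detection_body("dummy image bytes")
```

The client library used later in this article hides this step, so the sketch is only meant to show what travels over the wire.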

Image analyzed

A screenshot of the Assign Navi top page (PNG format)

anavi.png

Analysis result

$ bundle exec rake cloud_vision:text['app/assets/images/anavi.png']
assign navi
New way of working
Matter/Find talent
User Guide
About service
Login
Member registration
IT projects, encounters with external human resources
Efficiency with technology
00

The text was extracted with fairly high accuracy (^^)

The code I wrote this time

lib/tasks/cloud_vision.rake


# How to call:
#   $ bundle exec rake cloud_vision:text[image_file]
#   (replace image_file with the path to the image file)

# Load the Google Cloud client library
require "google/cloud/vision"

namespace :cloud_vision do
  desc 'Run OCR.'
  task :text, [:image_file] do |_task, args|
    # Use `next` here: `return` inside a task block raises LocalJumpError
    next unless args[:image_file]

    # Instantiate the client
    image_annotator = Google::Cloud::Vision.image_annotator

    # Run OCR
    response = image_annotator.text_detection(
      image: args[:image_file],
      max_results: 1 # optional, defaults to 10
    )

    # Display the OCR results
    response.responses.each do |res|
      res.text_annotations.each do |text|
        puts text.description
      end
    end
  end
end
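
As a small follow-up to the display loop above: as far as I understand, the first entry of text_annotations holds the whole detected text and the later entries hold individual words. The sketch below picks out only that first entry. The Annotation and ImageResponse structs are hypothetical stand-ins so the snippet runs without the gem, and full_text is my own helper name. The sample strings are taken from the result shown earlier.

```ruby
# Hypothetical stand-ins for the response objects returned by the gem.
Annotation = Struct.new(:description)
ImageResponse = Struct.new(:text_annotations)

# Return the full detected text of one image response,
# or nil when no text was detected.
def full_text(res)
  res.text_annotations.first&.description
end

sample = ImageResponse.new(
  [
    Annotation.new("User Guide\nLogin"), # assumed: whole text comes first
    Annotation.new("User"),
    Annotation.new("Guide"),
    Annotation.new("Login")
  ]
)

puts full_text(sample)
```

In the rake task, the same idea would mean printing only the first annotation of each response instead of looping over all of them.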

Environment variables are defined in a .env file (loaded with dotenv)

# .env file
GOOGLE_CLOUD_PROJECT="Enter the project ID here"
GOOGLE_APPLICATION_CREDENTIALS="Enter the path to the authentication key JSON file used to access the Vision API"
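
Since a missing variable tends to surface only as an authentication error at call time, a quick check before running OCR may help. This is a minimal sketch and not part of the original task: REQUIRED_VISION_ENV and missing_vision_env are names I made up, and the snippet assumes the .env file has already been loaded into ENV (for example by dotenv).

```ruby
# The two variables this article sets in .env.
REQUIRED_VISION_ENV = %w[GOOGLE_CLOUD_PROJECT GOOGLE_APPLICATION_CREDENTIALS].freeze

# Return the names of required variables that are unset or blank.
# `env` defaults to ENV but accepts any Hash-like object, which makes it easy to try out.
def missing_vision_env(env = ENV)
  REQUIRED_VISION_ENV.select { |key| env[key].to_s.strip.empty? }
end

missing = missing_vision_env
warn "Missing environment variables: #{missing.join(', ')}" unless missing.empty?
```

Calling this at the top of the rake task and skipping the API call when the list is not empty would be one way to use it.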

Setup procedure

The setup procedure will be added soon.

References

Using the Vision API with Ruby

Follows and LGTMs on my articles encourage me to keep posting every day. Thank you for your kind click. m(_ _)m
