First, what is OCR?

OCR (optical character recognition) is a technology for extracting text from images. When Google Translate translates whatever your smartphone camera is looking at, that is OCR at work: it extracts the text from the captured image and then hands it over to natural language processing.

So what can you do with OCR? Is it useful in everyday life?

You may be thinking, "I get that it pulls text out of images, but what would I actually use it for?" For example, you can scan leaflets you have been handed, or printouts from your company or school, and bring them into a Word file. You can also turn what is written on a blackboard or whiteboard into text without copying it down by hand.

Advance preparation

・ Installation of Python3
・ Installation of pyocr
・ Installation of Pillow
・ Installation of Tesseract OCR

Installing Python3 takes a while to explain, so I will skip it. I'm a Mac user, so I will only cover the Mac side.

Windows users, please refer to the article linked below.

https://qiita.com/henjiganai/items/7a5e871f652b32b41a18

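Now, for Mac. Use whichever of pip or pip3 belongs to your Python3 installation; you only need one command from each pair.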

pip install Pillow

pip3 install pillow

pip install pyocr

pip3 install pyocr

brew install tesseract

That's it. Note that, depending on your environment, the pip commands may not run unless you put sudo in front.
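Before moving on, it can help to confirm that everything is in place. The following is a minimal sketch using only the standard library. The function name check_prerequisites is my own, and it assumes that the packages are importable as PIL and pyocr and that the Tesseract executable is called tesseract and sits on your PATH.

```python
import importlib.util
import shutil


def check_prerequisites(modules=("PIL", "pyocr"), commands=("tesseract",)):
    """Return a dict mapping each requirement to True (found) or False."""
    status = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be found
        status[name] = importlib.util.find_spec(name) is not None
    for name in commands:
        # shutil.which returns None when the command is not on PATH
        status[name] = shutil.which(name) is not None
    return status


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(name, "OK" if found else "missing")
```

Anything reported as missing points back to the corresponding install command above.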

Practice

This targets PNG images only. I have not checked whether other formats work.


import glob

import pyocr
import pyocr.builders
from PIL import Image


# Given the file name of an image, returns the text found in it
class OCRs:

    def __init__(self):
        # res records whether an OCR engine is available
        self.res = False
        self.tools = pyocr.get_available_tools()
        print(self.tools)

        if len(self.tools) != 0:
            # Use the first engine found and its first available language.
            # This must happen after the check: tools[0] fails on an empty list.
            self.tool = self.tools[0]
            self.langs = self.tool.get_available_languages()
            self.lang = self.langs[0]
            self.res = True

    def read(self, file_name):
        # Without an OCR engine there is nothing to do
        if not self.res:
            return 'error'

        txt = self.tool.image_to_string(
            Image.open(file_name),
            lang=self.lang,
            builder=pyocr.builders.TextBuilder()
        )
        return txt

Please don't poke fun at the name OCRs; let's get to the content. First, import the modules to be used.

glob is a module for getting the paths of the files in a directory. pyocr is a module that bridges Python and an engine called Tesseract so that OCR can be done from Python. PIL (installed as Pillow) is the module needed to load images.

__init__ sets up the things that only need to be prepared once, such as tool and lang (you never need to call it yourself). res holds False if no OCR engine is available and True if one is.
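Note that self.langs[0] simply takes whichever language the engine lists first, which may not be the language of your images. Here is a small sketch of an alternative. The helper name pick_language and the preferred codes "jpn" and "eng" are assumptions for illustration (Tesseract identifies languages by codes like these); the list passed in would be the result of get_available_languages().

```python
def pick_language(available, preferred=("jpn", "eng")):
    """Return the first preferred language that is actually available.

    Falls back to the first available language, as the OCRs class does.
    """
    for lang in preferred:
        if lang in available:
            return lang
    if not available:
        raise ValueError("no OCR languages are installed")
    return available[0]


if __name__ == "__main__":
    # A made-up language list, just to show the selection
    print(pick_language(["afr", "eng", "osd"]))
```

In the class, this would stand in for the line that assigns self.lang.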

Next is the heart of the class, the read function. It receives a file name as an argument, runs OCR on the image (extracts the text), and returns the result as text.

First, it checks whether an OCR engine is available. If not, the string 'error' is returned. Otherwise it passes the image, the language, and so on to the engine, receives the text in txt, and returns it.

Now, on to the main routine.
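One caveat with returning the string 'error': an image that contains nothing but the word "error" could produce the same value. Below is a hedged sketch of an alternative that raises an exception instead. OCRUnavailableError and read_or_raise are hypothetical names, and tool is assumed to be either None or a pyocr tool object like self.tool.

```python
class OCRUnavailableError(RuntimeError):
    """Hypothetical exception for the case where no OCR engine was found."""


def read_or_raise(tool, image, lang):
    """Like OCRs.read, but raises instead of returning the string 'error'."""
    if tool is None:
        raise OCRUnavailableError("OCR software was not found.")
    # image is an already opened PIL image; lang is a code such as 'eng'
    return tool.image_to_string(image, lang=lang)
```

The caller would then wrap the call in try/except rather than comparing the result against 'error'.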


if __name__ == '__main__':

    # Creating the instance runs __init__, so no explicit call is needed
    cl = OCRs()

    file_names = glob.glob('/Users/sa/Desktop/programming/target_folder/*')

    for file_name in file_names:
        # Run OCR once per image and reuse the result
        txt = cl.read(file_name)

        if txt == 'error':
            print('OCR software was not found.')
            break
        else:
            print(txt)

Let's take a look. First, create an instance of the class and assign it to cl. Creating the instance runs __init__, so the initial setup is complete at that point. Then, use glob to specify the folder of images you want to run OCR on. For anyone who still finds directory paths tricky, here is a simplified version of that line. (Don't underestimate you, you say?)

# Put in the directory (folder) you want to target.
file_names = glob.glob('hogehoge/*')

# Now file_names holds the paths of all the files in hogehoge.
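Since this article targets PNG images only, the pattern can be narrowed so that other files in the folder are skipped. A minimal sketch, in which the helper name list_png_files is my own and hogehoge is the same placeholder folder as above:

```python
import glob
import os


def list_png_files(folder):
    """Return the sorted paths of the .png files directly inside folder."""
    pattern = os.path.join(folder, "*.png")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # 'hogehoge' is a placeholder; replace it with your own folder
    for path in list_png_files("hogehoge"):
        print(path)
```

The returned list can be used in place of file_names in the loop.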

Then, using a for loop, pass every element to the read function described earlier. If 'error' is returned, no OCR software is installed.

That's all. If you only want to process one particular image, call it as follows.

cl.read(filename)

Try to extract a character string from an image with Python3
