Read text in images with Python OCR

Install Tesseract

$ brew install tesseract

Install PyOCR, the Python library that runs Tesseract

$ pip3 install pyocr

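Set up Japanese recognition

Download the Japanese trained data into the tessdata directory, then confirm that jpn appears in the list of available languages.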

$ curl -L -o /usr/local/share/tessdata/jpn.traineddata 'https://github.com/tesseract-ocr/tessdata/raw/master/jpn.traineddata'

$ tesseract --list-langs

List of available languages (4):
eng
jpn
osd
snum
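The same check can be done from Python before running OCR. The following is a minimal sketch, not part of the original setup: the helper names parse_langs and installed_langs are hypothetical, and it assumes the tesseract command is on your PATH and prints a "List of available languages" header followed by one language code per line, as in the output above.

```python
import subprocess


def parse_langs(output):
    """Extract language codes from `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # The header line looks like "List of available languages (4):";
    # every other non-empty line is a language code.
    return [line for line in lines if not line.startswith("List of")]


def installed_langs():
    """Run the tesseract CLI and return the languages it reports."""
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Some Tesseract versions print the list to stderr instead of stdout,
    # so both streams are parsed.
    return parse_langs(result.stdout + result.stderr)


# Sample output copied from the listing above
sample = """List of available languages (4):
eng
jpn
osd
snum
"""
print("jpn" in parse_langs(sample))  # True
```

If "jpn" is missing from the result of installed_langs(), the trained data file was probably saved to a different tessdata directory than the one your Tesseract installation reads.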

OCR implementation

from PIL import Image
import sys
import pyocr
import pyocr.builders

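# Find the OCR tools installed on this machine (e.g. Tesseract)
tools = pyocr.get_available_tools()
if len(tools) == 0:
    print("No OCR tool found")
    sys.exit(1)
# The tools are returned in the recommended order of usage
tool = tools[0]

txt = tool.image_to_string(
    Image.open('{path}'),  # replace '{path}' with the path to your image
    lang="jpn",  # use the Japanese trained data installed above
    # tesseract_layout=6 selects Tesseract's page segmentation mode 6:
    # treat the image as a single uniform block of text
    builder=pyocr.builders.TextBuilder(tesseract_layout=6)
)
print(txt)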
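To see what PyOCR asks Tesseract to do, it can help to look at a roughly equivalent command line. The sketch below is an illustration under assumptions, not PyOCR's actual implementation: the function names build_tesseract_command and ocr_with_cli are hypothetical, "sample.png" is a placeholder file name, and the --psm flag assumes Tesseract 4 or later (older 3.x releases spell it -psm). The tesseract_layout=6 argument used above corresponds to page segmentation mode 6.

```python
import subprocess


def build_tesseract_command(image_path, lang="jpn", psm=6):
    """Build the tesseract CLI call: read image_path, write text to stdout."""
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]


def ocr_with_cli(image_path, lang="jpn", psm=6):
    """Run tesseract directly and return the recognized text."""
    result = subprocess.run(
        build_tesseract_command(image_path, lang, psm),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout.strip()


# Show the command without running it ("sample.png" is a placeholder)
print(" ".join(build_tesseract_command("sample.png")))
```

Calling ocr_with_cli with the path to a real image should return text comparable to what the PyOCR example prints, which makes it a convenient way to tell whether a problem lies in Tesseract itself or in the Python setup.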
