OCR in Java (character recognition from images)

What to do

Extract text from images using the open-source library tess4j.

Maven

Copy the dependency from mvnrepository and paste it into pom.xml.

<dependency>
    <groupId>net.sourceforge.tess4j</groupId>
    <artifactId>tess4j</artifactId>
    <version>4.3.1</version>
</dependency>

Maven then downloads tess4j-4.3.1.jar. (Screenshot: キャプチャ.PNG)

If you cannot use Maven, download tess4j manually instead.

Japanese recognition file

Download the Japanese trained-data file (jpn.traineddata) from the GitHub repository.
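A missing or misplaced language file is an easy mistake to make at this step, so it may help to check for it before running OCR. The following is a minimal sketch using only the JDK. It assumes the file sits directly in the data directory (C:\work in OcrTrial.java below). The class and method names (TrainedDataCheck, hasTrainedData) are made up for illustration and are not part of tess4j.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of tess4j): checks that the language file
// is present before OCR is attempted.
final class TrainedDataCheck {

    // Returns true when <dataDir>/<language>.traineddata exists as a regular file.
    static boolean hasTrainedData(Path dataDir, String language) {
        return Files.isRegularFile(dataDir.resolve(language + ".traineddata"));
    }

    public static void main(String[] args) {
        // Assumed location, matching the data path used in OcrTrial.java.
        Path dataDir = Paths.get("C:\\work");
        if (hasTrainedData(dataDir, "jpn")) {
            System.out.println("jpn.traineddata found in " + dataDir);
        } else {
            System.out.println("jpn.traineddata is missing from " + dataDir);
        }
    }
}
```

Running this check first gives a clearer message than waiting for the OCR call itself to fail.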

Source

OcrTrial.java


import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;
import net.sourceforge.tess4j.ITesseract;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

public class OcrTrial {
	public static void main(String[] args) throws IOException, TesseractException {
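		// Load the input image
		File file = new File("C:\\work\\INPUT.JPG");
		BufferedImage img = ImageIO.read(file);
		if (img == null) {
			// ImageIO.read returns null when no registered reader can decode the file
			throw new IOException("Unsupported or unreadable image: " + file);
		}

		ITesseract tesseract = new Tesseract();
		tesseract.setDatapath("C:\\work"); // Directory containing the language file (jpn.traineddata)
		tesseract.setLanguage("jpn"); // Use Japanese as the recognition language

		// Run OCR
		String str = tesseract.doOCR(img);

		// Print the result
		System.out.println(str);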
	}
}

Input image file

INPUT.JPG

Output result

キャプチャ.JPG

Summary

There was one misrecognition: the correct text is "pictogram" (〇), but it was recognized as "Pivot Gram" (×).

Recognition accuracy appears to be high when the characters in the image are clearly legible.
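Since the result seems to depend on how clearly the characters stand out, one common preparation step is to reduce the image to pure black and white before recognition. The following is a minimal sketch using only the JDK. The class name OcrPreprocess, the method name binarize, and the threshold of 128 are assumptions for illustration. None of this is from the steps above, and whether it actually improves the result for a given image has not been verified here.

```java
import java.awt.image.BufferedImage;

// Hypothetical preprocessing helper (not part of tess4j).
final class OcrPreprocess {

    // Turns every pixel black or white by comparing its luminance
    // with the given threshold (0-255).
    static BufferedImage binarize(BufferedImage src, int threshold) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of the ITU-R BT.601 luma weights.
                int luma = (299 * r + 587 * g + 114 * b) / 1000;
                out.setRGB(x, y, luma < threshold ? 0x000000 : 0xFFFFFF);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Tiny two-pixel image: one dark gray pixel, one light gray pixel.
        BufferedImage img = new BufferedImage(2, 1, BufferedImage.TYPE_INT_RGB);
        img.setRGB(0, 0, 0x202020);
        img.setRGB(1, 0, 0xE0E0E0);
        BufferedImage bw = binarize(img, 128);
        System.out.printf("%06X %06X%n",
                bw.getRGB(0, 0) & 0xFFFFFF, bw.getRGB(1, 0) & 0xFFFFFF);
    }
}
```

The returned image could be passed to tesseract.doOCR in place of img in OcrTrial.java. A fixed threshold is the simplest choice; unevenly lit photos would likely need something adaptive.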

Next time

- [ ] Try various images
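When trying several images, a number is easier to compare than an impression. One possible measure is the edit distance between the expected text and the OCR output, turned into a rough character accuracy. The following is a minimal sketch using only the JDK. The class OcrAccuracy and its methods are hypothetical, and the sample strings are the English renderings of the misrecognized word from the summary.

```java
// Hypothetical scoring helper (not part of tess4j).
final class OcrAccuracy {

    // Levenshtein distance over UTF-16 code units, using two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // 1 minus the character error rate (edit distance / expected length),
    // clamped so that it never drops below 0.
    static double accuracy(String expected, String actual) {
        if (expected.isEmpty()) {
            return actual.isEmpty() ? 1.0 : 0.0;
        }
        double errors = editDistance(expected, actual);
        return Math.max(0.0, 1.0 - errors / expected.length());
    }

    public static void main(String[] args) {
        String expected = "pictogram";
        String actual = "Pivot Gram";
        System.out.println("edit distance: " + editDistance(expected, actual));
        System.out.println("accuracy: " + accuracy(expected, actual));
    }
}
```

For real runs, expected would be a hand-typed transcription of each image and actual the string returned by doOCR.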
