I tried OCR processing a PDF file with Java part2

Introduction

This is a sequel to "I tried OCR processing a PDF file with Java". This time I mainly write about tess4j 4.1.

Purpose of this article

Searching for tess4j 4.1 turns up very little information, so I will describe how to get it running and what happened when I ran it. If you rely only on the information available online, you will get a run-time error.

Changes

Here are the parts changed from "I tried OCR processing a PDF file with Java".

gradle file

compile group: 'net.sourceforge.tess4j', name: 'tess4j', version: '4.1.1'

Declare the module dependency as shown above.

tessdata/configs/api_config

textord_tabfind_vertical_horizontal_mix T

Add the line above to this file. Without it, a run-time error occurs.

jpn.traineddata

Overwrite this file with the trained data downloaded from GitHub.
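Editing the config file by hand is easy to forget when the tessdata folder is replaced, so the step can also be scripted. The following is only a minimal sketch using the Java standard library: the class name ApiConfigPatcher and its ensureLine method are hypothetical helpers made up for illustration (they are not part of tess4j), and the path to api_config is passed in by the caller.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper (not part of tess4j): makes sure a config file
// contains a given "name value" line, appending it only when it is missing.
class ApiConfigPatcher {

    // The line that the article says must be present in api_config.
    static final String REQUIRED_LINE = "textord_tabfind_vertical_horizontal_mix T";

    /** Appends the line if it is missing. Returns true if the file was changed. */
    static boolean ensureLine(Path configFile, String line) throws IOException {
        String prefix = "";
        if (Files.exists(configFile)) {
            String content = new String(Files.readAllBytes(configFile), StandardCharsets.UTF_8);
            for (String existing : content.split("\\R")) {
                if (existing.trim().equals(line)) {
                    return false; // already present, nothing to do
                }
            }
            // Avoid gluing the new line onto an unterminated last line.
            if (!content.isEmpty() && !content.endsWith("\n")) {
                prefix = System.lineSeparator();
            }
        } else if (configFile.getParent() != null) {
            Files.createDirectories(configFile.getParent());
        }
        String toAppend = prefix + line + System.lineSeparator();
        Files.write(configFile, toAppend.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java ApiConfigPatcher <path to api_config>");
            return;
        }
        boolean changed = ensureLine(Paths.get(args[0]), REQUIRED_LINE);
        System.out.println(changed ? "Added: " + REQUIRED_LINE : "Already present");
    }
}
```

Running it with tessdata/configs/api_config as the argument would leave an existing entry untouched, so it should be safe to call repeatedly before each OCR run.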

Run

Just run it from Gradle with the run task.

Execution result

Processing speed

I compared the execution results of the tess4j 3 series and 4 series by converting the "2016 Spring Information Security Supporter Examination 2 pm" on a Windows 10 Pro machine (Core i5 2.2 GHz, 16 GB memory).

4 series: about 2.5 minutes
3 series: about 8 minutes

The 4 series is overwhelmingly faster.
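To put a number on "overwhelmingly faster", the two approximate timings reported above can be turned into a speedup ratio. This is only a small arithmetic sketch: the class and method names are made up for illustration, and the inputs are the rough figures from this article, not fresh measurements.

```java
import java.util.Locale;

// Worked arithmetic for the approximate timings reported in the article.
class SpeedupEstimate {

    /** How many times faster the new run is: old elapsed time / new elapsed time. */
    static double speedup(double oldMinutes, double newMinutes) {
        if (oldMinutes <= 0 || newMinutes <= 0) {
            throw new IllegalArgumentException("durations must be positive");
        }
        return oldMinutes / newMinutes;
    }

    /** Fraction of the old elapsed time that is saved by the new run. */
    static double fractionSaved(double oldMinutes, double newMinutes) {
        return 1.0 - 1.0 / speedup(oldMinutes, newMinutes);
    }

    public static void main(String[] args) {
        double threeSeriesMinutes = 8.0; // "about 8 minutes"
        double fourSeriesMinutes = 2.5;  // "about 2.5 minutes"
        System.out.println(String.format(Locale.ROOT, "speedup: %.1fx, time saved: %.0f%%",
                speedup(threeSeriesMinutes, fourSeriesMinutes),
                100 * fractionSaved(threeSeriesMinutes, fourSeriesMinutes)));
    }
}
```

With these figures the ratio is 8 / 2.5 = 3.2, so the 4 series finished in a little under a third of the time on this one document.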

Character recognition

In the 3 series, the misconversion rate was extremely high when Japanese and English characters were mixed, but the 4 series improves this dramatically. For example, the 3 series produced

Q-What are the characteristics of Pus S?
By the bell chief,The number of stages is decided.

whereas the 4 series converts the same part to

Q (1) What are the characteristics of AES?
By the key size,The number of stages is decided.

The text is now converted properly into meaningful characters.
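The comparison above is by eye. One common way to make a "misconversion rate" concrete is a character error rate: the edit distance between the OCR output and a known-correct text, divided by the length of the correct text. The sketch below is not what was measured in this article. The class OcrErrorRate is a hypothetical helper, and the sample strings are the English renderings quoted above, with the 4 series output treated as the reference purely for illustration.

```java
import java.util.Locale;

// Hypothetical helper: character error rate (CER) based on Levenshtein distance.
class OcrErrorRate {

    /** Levenshtein distance counted in UTF-16 code units (fine for BMP text). */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int substitution = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int deletion = prev[j] + 1;
                int insertion = curr[j - 1] + 1;
                curr[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Edit distance divided by the reference length. */
    static double characterErrorRate(String reference, String ocrOutput) {
        if (reference.isEmpty()) {
            throw new IllegalArgumentException("reference must not be empty");
        }
        return (double) editDistance(reference, ocrOutput) / reference.length();
    }

    public static void main(String[] args) {
        // Assumption: the 4 series output stands in for the correct text.
        String reference = "What are the characteristics of AES?";
        String threeSeries = "What are the characteristics of Pus S?";
        System.out.println(String.format(Locale.ROOT, "CER of the 3 series sample: %.3f",
                characterErrorRate(reference, threeSeries)));
    }
}
```

For a real evaluation the reference would be a hand-checked transcription of the original Japanese page, and the rate would be computed over the whole document rather than one sentence.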
