Comparison of Japanese conversion modules in Python 3

Ciao ... †

Natural language processing always comes with preprocessing, and preprocessing is rarely fast. So I compared the speed of several Japanese character conversion modules for Python 3.

Comparison items

Two conversions were measured: half-width to full-width, and hiragana to katakana. Each was timed on both a short and a long target string.
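To make the two conversions concrete, here is a minimal pure-Python sketch using only the standard library. It is not the implementation of any module compared in this post: the names `hira_to_kata` and `half_to_full_ascii` are made up for illustration, and the width conversion only covers ASCII (half-width katakana is left out).

```python
# Hypothetical helpers for illustration; not the API of jaconv or any other module.

# Hiragana U+3041..U+3096 map to katakana U+30A1..U+30F6 at a fixed offset of 0x60.
HIRA_TO_KATA = {cp: cp + 0x60 for cp in range(0x3041, 0x3097)}

# Printable ASCII U+0021..U+007E maps to the full-width forms U+FF01..U+FF5E
# (offset 0xFEE0). The space is a special case: U+0020 becomes U+3000.
HALF_TO_FULL = {cp: cp + 0xFEE0 for cp in range(0x21, 0x7F)}
HALF_TO_FULL[0x20] = 0x3000


def hira_to_kata(text: str) -> str:
    """Convert hiragana to katakana; other characters pass through unchanged."""
    return text.translate(HIRA_TO_KATA)


def half_to_full_ascii(text: str) -> str:
    """Convert half-width ASCII (including the space) to full-width forms."""
    return text.translate(HALF_TO_FULL)


print(hira_to_kata("ひらがな"))       # ヒラガナ
print(half_to_full_ascii("abc 123"))  # ａｂｃ　１２３
```

A table-driven `str.translate` is only one possible approach; the modules in the comparison may work quite differently internally.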

Modules compared

- jaconv (a module I wrote; recently renamed from jctconv)
- cnvk
- mojimoji
- zenhan
- rfZenHan
- mohayonao's code
- nkf

Comparison result


Conversion	jaconv	cnvk	mojimoji	zenhan	rfZenHan	mohayonao	nkf
Half-width → full-width (short text)	27.1 µs	96.4 µs	5.04 µs	75.8 µs	222 µs	-	23 µs
Half-width → full-width (long text)	89.9 ms	38.6 ms	23.1 ms	360 ms	237 ms	-	95.4 ms
Hiragana → katakana (short text)	18.1 µs	79.1 µs	-	-	-	25.4 µs	23.2 µs
Hiragana → katakana (long text)	51.6 ms	41.8 ms	-	-	-	246 ms	98.6 ms

mojimoji is fast, as you would expect from a module built with Cython. Among the pure-Python modules, jaconv performs well on short strings, while cnvk appears to do better on long ones.
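The benchmark script itself is not shown here. As a rough sketch of how a short-versus-long measurement like this could be taken, the following uses the standard `timeit` module. The `convert` function is a hypothetical stand-in for whichever module is under test, and the input sizes and repeat counts are assumptions, not the values used for the table above.

```python
import timeit

# Hypothetical stand-in for a converter under test (hiragana -> katakana).
_TABLE = {cp: cp + 0x60 for cp in range(0x3041, 0x3097)}


def convert(text: str) -> str:
    return text.translate(_TABLE)


def seconds_per_call(func, text: str, number: int, repeat: int = 3) -> float:
    """Best-of-`repeat` average time in seconds for one call to func(text)."""
    timings = timeit.repeat(lambda: func(text), number=number, repeat=repeat)
    return min(timings) / number


short_text = "ひらがなのみじかいぶん"  # short input
long_text = short_text * 10_000       # assumed size for the "long" input

short_time = seconds_per_call(convert, short_text, number=10_000)
long_time = seconds_per_call(convert, long_text, number=10)

print(f"short: {short_time * 1e6:.2f} µs per call")
print(f"long:  {long_time * 1e3:.2f} ms per call")
```

Swapping `convert` for a call into a real module would give figures comparable in kind to the table, though absolute numbers depend on the machine and on the input strings.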
