--Indicator to compare two strings ――How many times can you convert from s1 to s2 by performing the following three operations? --Replace --Insert
--Consider converting a string from s1 to s2
s1 = "aaa"
s2 = "aab"
--If the character string is as above, the edit distance is 1. --One replacement
s1 = "aba"
s2 = "cc"
--If the character string is as above, the edit distance is 3 --Replacement twice --One insertion
There is a very useful python module python-Levenshtein
The official documentation is here
Install with pip
$ pip install python-Levenshtein
leven.py
import Levenshtein
import sys
args = sys.argv
with open(args[1], "r") as f_ans:
with open(args[2], "r") as f_ref:
s_ans = f_ans.read()
s_ref = f_ref.read()
print(Levenshtein.distance(s_ans, s_ref))
It's a simple program that just spits out the edit distance by specifying two text files as command line arguments.
Prepare two text files.
tmp1.txt
Helllo worb!!
tmp2.txt
Hello world!
To convert from tmp1.txt to tmp2.txt,
--Deleted one "l" from "Helllo" --About "worb" --Added "l" --Replace "b" with "d" -Delete one "!"
So the edit distance should be 4.
$ python leven.py tmp1.txt tmp2.txt
4
became. happy.
This is an article written by a person who tried to implement it by dynamic programming, but found that the module was missing and it became troublesome to implement. If you have any questions, please leave them in the comments.
Recommended Posts