While looking for articles that use Selenium, I found one about automating the sushi game. The approaches are basically as follows:
・ Once the game starts, keep sending every key
・ Once the game starts, take a screenshot and type in the character string obtained by OCR
This time I tried the OCR part, with simple image processing in OpenCV as pre-processing.
Tesseract is an OCR engine. This time I will run it from Python through the pyocr module. It can be installed with the following command:
$ brew install tesseract
Trained data for Japanese is not included by default, so download it from the following URL: https://github.com/tesseract-ocr/tessdata
Download jpn.traineddata from this URL into /usr/local/share/tessdata/
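To confirm that the language file ended up where Tesseract looks for it, a small check like the following can help. This is only a sketch: the helper name has_traineddata is hypothetical, and the default directory is simply the path mentioned above, which may differ on other systems.

```python
from pathlib import Path


def has_traineddata(lang, tessdata_dir="/usr/local/share/tessdata"):
    """Return True if <lang>.traineddata exists in tessdata_dir.

    tessdata_dir defaults to the path used in this article (an assumption;
    other installations may keep tessdata elsewhere).
    """
    return (Path(tessdata_dir) / f"{lang}.traineddata").is_file()


if __name__ == "__main__":
    for lang in ("eng", "jpn"):
        status = "found" if has_traineddata(lang) else "missing"
        print(f"{lang}.traineddata: {status}")
```

If jpn is reported as missing, the file was probably saved to a different directory.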
Run the following commands in the terminal to install pyocr and OpenCV:
$ pip3 install pyocr
$ pip3 install opencv-python
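As a quick sanity check after installing, the sketch below reports which of the modules used in this article cannot be found. The helper name missing_modules is hypothetical; note that opencv-python is imported as cv2 and Pillow as PIL.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names used by the scripts in this article
    missing = missing_modules(["pyocr", "cv2", "PIL"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All modules found")
```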
For the test image, trim the screenshot and save the trimmed version as test.png.
import sys

import pyocr
from PIL import Image

image = "test.png"

# Find an installed OCR tool (Tesseract in this setup)
tools = pyocr.get_available_tools()
if len(tools) == 0:
    print("No OCR tool found")
    sys.exit(1)
tool = tools[0]

# Run OCR directly on the unprocessed image
res = tool.image_to_string(Image.open(image), lang="eng")
print(res)
Execution result: the text is not recognized correctly at all. It seems that pre-processing is necessary after all.
I want to preprocess the image with OpenCV, but I'm new to OpenCV, so I'll experiment first by processing my own icon image.
import cv2

image = "test_1.png"
name = "test_1"

# Load the original image (BGR channel order)
img = cv2.imread(image)

# Convert to grayscale
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imwrite(f"1_{name}_gray.png", img)

# Smooth with a 5x5 Gaussian blur to reduce noise
img = cv2.GaussianBlur(img, (5, 5), 0)
cv2.imwrite(f"2_{name}_gaussian.png", img)

# Adaptive threshold: 11x11 neighborhood, constant 2 subtracted from the weighted local mean
img = cv2.adaptiveThreshold(
    img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2
)
cv2.imwrite(f"3_{name}_threshold.png", img)
The image at each stage of processing looks like this.
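To get a feel for what the adaptive threshold step does, here is a NumPy-only sketch of the idea. It is a simplification, not OpenCV's implementation: it uses a plain (unweighted) local mean rather than the Gaussian-weighted mean selected by cv2.ADAPTIVE_THRESH_GAUSSIAN_C, and the function name adaptive_threshold_mean is hypothetical. The block size 11 and constant 2 mirror the values used above.

```python
import numpy as np


def adaptive_threshold_mean(gray, block_size=11, c=2):
    """Binarize a 2-D uint8 image against a per-pixel local threshold.

    Each pixel becomes 255 if it is brighter than (local mean - c),
    otherwise 0. Simplified sketch using an unweighted box mean.
    """
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be an odd number >= 3")
    h, w = gray.shape
    pad = block_size // 2
    # Replicate edge pixels so border windows are fully defined
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    # Integral image with a leading row/column of zeros for fast box sums
    integral = np.pad(padded, ((1, 0), (1, 0)), mode="constant")
    integral = integral.cumsum(axis=0).cumsum(axis=1)
    b = block_size
    sums = (
        integral[b:b + h, b:b + w]
        - integral[:h, b:b + w]
        - integral[b:b + h, :w]
        + integral[:h, :w]
    )
    local_mean = sums / (b * b)
    return np.where(gray > local_mean - c, 255, 0).astype(np.uint8)


if __name__ == "__main__":
    demo = np.full((15, 15), 200, dtype=np.uint8)
    demo[7, 7] = 0  # a single dark "ink" pixel on a bright background
    out = adaptive_threshold_mean(demo)
    print(out[7, 6:9])
```

Because the threshold follows the local brightness, a dark stroke stays dark even when the overall lighting of the image is uneven, which is where a single global threshold tends to struggle.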
OpenCV + OCR
Preprocess the image used for OCR with OpenCV and run OCR again. The following script applies grayscale conversion → thresholding → color inversion as preprocessing.
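Before the full script, here is a rough NumPy-only illustration of what those three steps do to pixel values. The function names are hypothetical, and the grayscale weights are the commonly used luma coefficients, so results may differ from OpenCV by rounding. The threshold 140 is the value used in this article.

```python
import numpy as np


def to_gray(bgr):
    """Convert an HxWx3 uint8 BGR image to grayscale with standard luma weights."""
    b = bgr[..., 0].astype(np.float64)
    g = bgr[..., 1].astype(np.float64)
    r = bgr[..., 2].astype(np.float64)
    gray = 0.114 * b + 0.587 * g + 0.299 * r
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)


def binary_threshold(gray, th=140, max_value=255):
    """Pixels strictly brighter than th become max_value, all others 0."""
    return np.where(gray > th, max_value, 0).astype(np.uint8)


def invert(img):
    """Flip every bit of a uint8 image, so 0 becomes 255 and 255 becomes 0."""
    return np.bitwise_not(img)


def preprocess(bgr, th=140):
    """Grayscale -> threshold -> inversion, in that order."""
    return invert(binary_threshold(to_gray(bgr), th))


if __name__ == "__main__":
    # One white pixel and one black pixel
    sample = np.array([[[255, 255, 255], [0, 0, 0]]], dtype=np.uint8)
    print(preprocess(sample))
```

After the inversion, pixels that were bright in the original end up black and dark pixels end up white. The actual script, using the corresponding OpenCV calls, follows.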
import sys

import cv2
import pyocr
from PIL import Image

image = "test.png"
name = "test"

# Load the original image and save a copy for comparison
img = cv2.imread(image)
cv2.imwrite(f"1_{name}_original.png", img)

# Convert to grayscale
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imwrite(f"2_{name}_gray.png", img)

# Global threshold: pixels brighter than th become 255, the rest 0
th = 140
img = cv2.threshold(img, th, 255, cv2.THRESH_BINARY)[1]
cv2.imwrite(f"3_{name}_threshold_{th}.png", img)

# Invert black and white
img = cv2.bitwise_not(img)
cv2.imwrite(f"4_{name}_bitwise.png", img)
cv2.imwrite("target.png", img)

# Run OCR on the preprocessed image
tools = pyocr.get_available_tools()
if len(tools) == 0:
    print("No OCR tool found")
    sys.exit(1)
tool = tools[0]
res = tool.image_to_string(Image.open("target.png"), lang="eng")
print(res)
Execution result: this time the text seems to be recognized well! That's all for this time.