Object extraction in image by pattern matching using OpenCV with Python
While researching OpenCV for the article above, I found that it can do quite a lot, so I decided to try out a few ideas.
This post summarizes the steps for a perspective transformation that corrects a photo of a business card, taken at an angle with a camera, so that it looks as if it were shot from the front.
| Item | Contents |
|---|---|
| Machine | MacBook Air (13-inch, Early 2015) |
| Processor | 2.2 GHz Intel Core i7 |
| Memory | 8 GB 1600 MHz DDR3 |
| Python | 3.6.0 :: Anaconda 4.3.1 (x86_64) |
| Jupyter Notebook | 4.2.1 |
| OpenCV | 3.3.0-rc |
As usual, a bit of self-promotion: see the following articles for how to set up the environment.
-Procedure to quickly create a deep learning environment on Mac with TensorFlow and OpenCV
-Procedure to quickly create a machine learning environment on Ubuntu 16.04
I used the back side of a business card taken with the iPhone camera (IMG_4778.JPG).
```python
import cv2
import numpy as np
from IPython.display import display, Image

def display_cv_image(image, format='.png'):
    # Encode the OpenCV image and display it inline in the notebook
    decoded_bytes = cv2.imencode(format, image)[1].tobytes()
    display(Image(data=decoded_bytes))
```
```python
img = cv2.imread("IMG_4778.JPG")
display_cv_image(img)
```
```python
# Grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Binarization
ret, th1 = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)
display_cv_image(th1)
```
The result of binarization is as follows.
```python
# Contour extraction
image, contours, hierarchy = cv2.findContours(th1, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# Keep only contours with a large area, approximated as polygons
areas = []
for cnt in contours:
    area = cv2.contourArea(cnt)
    if area > 10000:
        epsilon = 0.1 * cv2.arcLength(cnt, True)
        approx = cv2.approxPolyDP(cnt, epsilon, True)
        areas.append(approx)
cv2.drawContours(img, areas, -1, (0, 255, 0), 3)
display_cv_image(img)
```
The outline was extracted correctly and I was able to surround the card with a green frame.
Next, a perspective transformation maps each corner of that frame to its corresponding destination coordinate.
```python
img = cv2.imread("IMG_4778.JPG")
pts1 = np.float32(areas[0])
pts2 = np.float32([[600, 300], [600, 0], [0, 0], [0, 300]])
M = cv2.getPerspectiveTransform(pts1, pts2)
dst = cv2.warpPerspective(img, M, (600, 300))
display_cv_image(dst)
```
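One caveat: the order of the four points in `areas[0]` depends on how the contour was traced, so the hard-coded `pts2` above only lines up if the corners happen to come out in that order. A common trick is to sort the corners explicitly by their coordinate sums and differences; the sketch below uses a hypothetical `order_corners` helper (not part of the original code), after which the destination points must be listed in the same top-left, top-right, bottom-right, bottom-left order.

```python
import numpy as np

def order_corners(pts):
    """Order 4 points as top-left, top-right, bottom-right, bottom-left."""
    pts = pts.reshape(4, 2).astype(np.float32)
    ordered = np.zeros((4, 2), dtype=np.float32)
    s = pts.sum(axis=1)             # x + y: min at top-left, max at bottom-right
    d = np.diff(pts, axis=1)[:, 0]  # y - x: min at top-right, max at bottom-left
    ordered[0] = pts[np.argmin(s)]
    ordered[1] = pts[np.argmin(d)]
    ordered[2] = pts[np.argmax(s)]
    ordered[3] = pts[np.argmax(d)]
    return ordered

# Corners in an arbitrary order, shaped (4, 1, 2) like approxPolyDP output
corners = np.float32([[300, 200], [0, 0], [300, 0], [0, 200]]).reshape(4, 1, 2)
print(order_corners(corners))
```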
The converted result is as follows.
It worked!
Next, I tried to OCR the text on the recognized business card using a library called tesseract-ocr (via pyocr).
Add the following code after the above source code.
```python
import pyocr
from PIL import Image

tools = pyocr.get_available_tools()
tool = tools[0]
print(tool.image_to_string(Image.fromarray(dst), lang="jpn"))
```
The result is...

```
[Accusative case]
ヽ/Dimension Tweer Opening Technology Appetizer
Coastal news ‡
Type 3 Denki Chief Brancher
Otoshiki Sagi 4 Condyle Temari Temari Handling Clothes
Karate first stage
Nisuta
```
... there is room for improvement (sweat)