I tried keystone correction on an image

Motivation

  - The outdoor walkway of a shopping mall in my neighborhood is lined with many plants, and each one has a plate with a description next to it. I was curious about what kinds of plants there are (more than 100, I think), so I decided to collect the information on the plates.
  - Typing the text of every plate into a smartphone is tedious, though, so I decided to take a picture of each plate, correct it, and then run OCR on it.

Image before correction

What I did

  1. Acquisition of plate image
     1. Mask and binarize parts other than the plate
     2. Noise removal
     3. Get and approximate the contour
  2. Plate image correction
     1. Keystone correction

1. Acquisition of plate image

1. Mask and binarize parts other than the plate

The image is converted to HSV. Pixels whose hue falls within the range of the plate's color are output as white and all other pixels as black, which masks and binarizes the image in one step.

preprocess.py


import cv2
hsv_img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)  # img is a BGR image, e.g. loaded with cv2.imread

The mask looks good.

Supplement

HSV is a method of expressing color with three elements.

  - Hue: the type of color (e.g. red, blue, yellow)
  - Saturation: the vividness of the color
  - Value (brightness): the brightness of the color
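
The hue-range masking described above can be sketched without OpenCV to show what happens per pixel. This is only an illustration: `hue_mask` is a hypothetical helper, and the hue range 50 to 70 (greenish) is an assumed value, since the range actually used for the plate is not given here. Hue is scaled to 0-179 to match OpenCV's 8-bit HSV convention; in OpenCV itself the same thresholding is normally done with `cv2.inRange` on the converted image.

```python
import colorsys


def hue_mask(pixels, hue_lo, hue_hi):
    """Return 255 where a pixel's hue is within [hue_lo, hue_hi], else 0.

    pixels is a list of rows of (r, g, b) tuples with values 0-255.
    Hue is scaled to 0-179, the range OpenCV uses for 8-bit HSV images.
    """
    mask = []
    for row in pixels:
        out = []
        for r, g, b in row:
            h, _s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            hue = round(h * 180) % 180
            out.append(255 if hue_lo <= hue <= hue_hi else 0)
        mask.append(out)
    return mask


# Hypothetical 2x2 image: green, red / blue, slightly bluish green.
pixels = [
    [(0, 255, 0), (255, 0, 0)],
    [(0, 0, 255), (10, 200, 30)],
]
mask = hue_mask(pixels, 50, 70)  # assumed hue range for illustration
```

Only the two greenish pixels end up white in `mask`. Thresholding on hue alone is what makes this approach less sensitive to lighting than thresholding on RGB values.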

2. Noise removal

Closing is applied first, then opening. Closing comes first because some pixels inside the plate may be missing from the mask.

Supplement

  - Dilation: if there is even one white pixel around the pixel of interest, replace the pixel of interest with white.
  - Erosion: if there is even one black pixel around the pixel of interest, replace the pixel of interest with black.
  - Closing: dilation followed by erosion, each applied the same number of times.
  - Opening: erosion followed by dilation, each applied the same number of times.
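
The four operations above can be written out directly on a small binary mask. This is a minimal pure-Python sketch, assuming a 3x3 neighborhood and one iteration (the kernel size and iteration count used in preprocess.py are not shown here); the helper names are hypothetical. In OpenCV the equivalent calls are `cv2.morphologyEx` with `cv2.MORPH_CLOSE` and `cv2.MORPH_OPEN`.

```python
def _filter3x3(img, pick):
    """Apply pick (max or min) over each pixel's in-bounds 3x3 neighborhood."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(pick(window))
        out.append(row)
    return out


def dilate(img):
    return _filter3x3(img, max)  # white if any neighbor is white


def erode(img):
    return _filter3x3(img, min)  # black if any neighbor is black


def closing(img):
    return erode(dilate(img))  # fills small holes


def opening(img):
    return dilate(erode(img))  # removes small specks


# A 9x9 mask: a 5x5 plate with a one-pixel hole, plus a stray white speck.
mask = [[0] * 9 for _ in range(9)]
for y in range(2, 7):
    for x in range(2, 7):
        mask[y][x] = 1
mask[4][4] = 0  # missing pixel inside the plate
mask[0][0] = 1  # noise outside the plate

cleaned = opening(closing(mask))
```

Closing fills the hole inside the plate but leaves the speck; the opening that follows removes the speck while restoring the plate to its original size.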

3. Get and approximate the plate contour

Get contour

Detects and draws the contours of objects in the denoised image.

preprocess.py


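# findContours returns three values in OpenCV 3 and two in OpenCV 4;
# taking the last two works with both.
contours, hierarchy = cv2.findContours(img, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)[-2:]
img = cv2.drawContours(img, contours, -1, (0, 0, 255), 30)
cv2.imwrite(output_path, img)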

The contour was drawn like this.

Get the contour of the plate

In this image there is no contour other than the plate, but in case other contours are present, take the contour with the largest area as the plate.

preprocess.py


# Pick the contour with the largest area (the first one if there is a tie).
max_contour = max(contours, key=cv2.contourArea)
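
As a rough illustration of what "largest area" means here, the sketch below computes polygon areas with the shoelace formula and picks the biggest polygon. OpenCV documents `cv2.contourArea` as using Green's formula, which reduces to the same sum for a polygon. The function names and the sample coordinates are hypothetical.

```python
def polygon_area(points):
    """Area of a closed polygon given as a list of (x, y) vertices."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def largest_polygon(polygons):
    """Return the polygon with the largest area."""
    return max(polygons, key=polygon_area)


# Hypothetical contours: a plate-sized rectangle and a small noise blob.
plate = [(0, 0), (700, 0), (700, 500), (0, 500)]
speck = [(10, 10), (20, 10), (20, 20), (10, 20)]
biggest = largest_polygon([speck, plate])
```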

Get the coordinates of the four corners of the plate

Approximate the contour with a small number of points to get the coordinates of the four corners of the plate.

preprocess.py


arc_len = cv2.arcLength(max_contour, True)
approx_contour = cv2.approxPolyDP(max_contour, epsilon=0.1 * arc_len, closed=True)
img = cv2.drawContours(img, approx_contour, -1, (0, 0, 255), 30)
cv2.imwrite(output_path, img)

The coordinates of the four corners came out nicely.
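
To see what the `epsilon` argument controls, here is a simplified sketch of the Douglas-Peucker algorithm, which the OpenCV documentation names as the method behind `cv2.approxPolyDP`. It handles an open polyline only, so it is not a drop-in replacement for the closed-contour call above; the function names and the sample points are made up for illustration.

```python
import math


def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length


def simplify(points, epsilon):
    """Drop points that lie within epsilon of the simplified polyline."""
    if len(points) < 3:
        return list(points)
    dists = [point_line_distance(p, points[0], points[-1]) for p in points[1:-1]]
    idx = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[idx - 1] <= epsilon:
        # Every intermediate point is close enough to the chord: keep the ends.
        return [points[0], points[-1]]
    # Otherwise keep the farthest point and simplify both halves.
    left = simplify(points[: idx + 1], epsilon)
    right = simplify(points[idx:], epsilon)
    return left[:-1] + right


# A slightly wobbly rectangle outline traced from its top-left corner.
outline = [(0, 0), (50, 1), (100, 0), (99, 35), (100, 70), (50, 69), (0, 70)]
corners = simplify(outline, epsilon=5)
```

With `epsilon=5` the one-pixel wobbles along the edges are discarded and only the four corner points remain. A larger epsilon, such as the 10% of the perimeter used above, tolerates rougher edges at the cost of possibly cutting real corners.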

2. Plate image correction

1. Keystone correction

The four corner coordinates are sorted into upper left, lower left, upper right, and lower right, and keystone correction is then applied to produce the corrected image.

preprocess.py


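import numpy as np

# Each element of approx is [[x, y]], so x[0] is the point itself.
approx = approx_contour.tolist()

# Split into the two leftmost and the two rightmost points by x.
by_x = sorted(approx, key=lambda x: x[0])
left, right = by_x[:2], by_x[2:]
# Image y grows downward, so the smaller y is the top corner.
left_top = sorted(left, key=lambda x: x[0][1])[0]
left_bottom = sorted(left, key=lambda x: x[0][1])[1]
right_top = sorted(right, key=lambda x: x[0][1])[0]
right_bottom = sorted(right, key=lambda x: x[0][1])[1]

# Source corners and their destinations, both clockwise from the top left.
perspective_base = np.float32([left_top, right_top, right_bottom, left_bottom])
perspective = np.float32([[0, 0], [700, 0], [700, 500], [0, 500]])

psp_matrix = cv2.getPerspectiveTransform(perspective_base, perspective)
plate_img = cv2.warpPerspective(org_img, psp_matrix, (700, 500))
# Save the corrected plate image, not the contour-drawing image.
cv2.imwrite(output_path, plate_img)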
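
For intuition about what the perspective matrix is, the sketch below solves for it by hand: a 3x3 matrix with the bottom-right entry fixed to 1, determined by eight linear equations from the four corner pairs. This is the same system `cv2.getPerspectiveTransform` is documented to solve, but the code is only an illustration; the helper names and the source corner coordinates are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def perspective_matrix(src, dst):
    """3x3 matrix (last entry 1) mapping each src corner to its dst corner."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def warp_point(h, x, y):
    """Apply the matrix to one point, including the perspective divide."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


# Hypothetical plate corners in the photo, clockwise from the top left.
src = [(120, 80), (610, 140), (580, 470), (90, 400)]
dst = [(0, 0), (700, 0), (700, 500), (0, 500)]
matrix = perspective_matrix(src, dst)
```

Applying `warp_point` with this matrix sends each skewed corner to the matching corner of the 700x500 output, which is what the image warp does for every pixel.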

Keystone correction result

Summary

Where I stumbled

  - At first I ran straight-line detection on the uncorrected image, but it either picked up lines from the plants or returned the plate's edges as broken segments, which was not the result I wanted.
  - After that I tried masking based on RGB values, but I could not detect the plate's contour well: either most of the plate went missing or parts unrelated to the plate were left unmasked.

What went well

  - Since I started out completely in the dark and did not know which method to use, I ended up trying a variety of image processing methods (although my grasp of the theory has not caught up yet):
    - Binarization (fixed threshold, Otsu, adaptive thresholding)
    - Line detection (Hough transform, probabilistic Hough transform)
    - Edge detection (Canny method, LSD)
    - Smoothing (moving average, Gaussian)

Remaining issue

If plants overlap the plate and split its area into pieces, as shown in the image below, the keystone correction does not work. (I suppose I should just make sure the plants do not cover the plate when taking the picture...)

Closing

By the way, with Google Cloud Vision the OCR result is almost the same before and after correction, and the OCR is almost perfect even before correction. That's Google for you!

Other

The source code is here: https://github.com/ChihiroHozono/Plate-Text-Detector
