Hi, I'm Ramu. We will implement Otsu's binarization (discriminant analysis method), which is a method to automatically determine the threshold value used for binarization.
Binarization is the process of converting an image into a monochrome image with only two colors, black and white. Once a threshold value has been chosen, pixels with values below the threshold are set to black (0) and pixels with values at or above the threshold are set to white (255). That much was covered in my previous article on binarization. This time, we will deal with a method for determining the threshold automatically.
In Otsu's binarization, the pixels are divided into two classes by the threshold. The threshold at which the degree of separation between these two classes is largest is the one used for binarizing. The degree of separation is calculated from the following quantities.
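As a quick illustration of binarization with a fixed threshold, here is a minimal, dependency-free sketch on a tiny grayscale image stored as nested lists. The function name `binarize` and the sample values are made up for this example. It follows the same convention as the code later in this article: values below the threshold become 0, everything else becomes 255.

```python
def binarize(image, threshold):
    """Return a copy of `image` with pixels below `threshold` set to 0 (black)
    and all other pixels set to 255 (white)."""
    return [[0 if pixel < threshold else 255 for pixel in row] for row in image]


# Hypothetical 2x3 grayscale image with values in 0..255
sample = [
    [12, 130, 200],
    [127, 128, 255],
]
result = binarize(sample, 128)
print(result)
```

Note that a pixel exactly equal to the threshold (128 here) ends up white, because only values strictly below the threshold are set to black.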
Separation: $X = \dfrac{\sigma_b^2}{\sigma_w^2}$
Between-class variance: $\sigma_b^2 = \dfrac{\omega_0 \omega_1}{(\omega_0 + \omega_1)^2} (M_0 - M_1)^2$
Within-class variance: $\sigma_w^2 = \omega_0 \sigma_0^2 + \omega_1 \sigma_1^2$
Number of pixels belonging to classes 0, 1: $\omega_0, \omega_1$
Variance of pixel values belonging to classes 0, 1: $\sigma_0^2, \sigma_1^2$
Average pixel value of classes 0, 1: $M_0, M_1$
Average pixel value of the entire image: $M$
Total of pixel values belonging to classes 0, 1: $P_0, P_1$
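A short derivation sketch may help explain why this ratio is a sensible thing to maximize. It assumes $\omega_0, \omega_1$ are pixel counts and takes the within-class variance as $\sigma_w^2 = \omega_0 \sigma_0^2 + \omega_1 \sigma_1^2$. The symbols $N = \omega_0 + \omega_1$ and $\sigma_t^2$ (the variance of the whole image) are introduced here for illustration and do not appear in the list above. The total variance splits into a within-class part and a between-class part:

$$
\sigma_t^2 = \dfrac{\omega_0 \sigma_0^2 + \omega_1 \sigma_1^2}{N} + \dfrac{\omega_0 \omega_1}{N^2} (M_0 - M_1)^2 = \dfrac{\sigma_w^2}{N} + \sigma_b^2
$$

Since $\sigma_t^2$ and $N$ do not depend on the threshold, the separation can be rewritten as

$$
X = \dfrac{\sigma_b^2}{\sigma_w^2} = \dfrac{\sigma_b^2}{N (\sigma_t^2 - \sigma_b^2)}
$$

which grows whenever $\sigma_b^2$ grows. So the threshold that maximizes $X$ is also the one that maximizes the between-class variance $\sigma_b^2$ on its own.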
In summary, the degree of separation is calculated for each of the 256 threshold values from 0 to 255, and the threshold that gives the largest degree of separation is selected.
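The search over all thresholds can be sketched without any libraries by working on a 256-bin histogram. This is only an illustrative variant, not the implementation used in this article: the name `otsu_threshold` is made up, and it maximizes the between-class variance alone instead of the ratio. That should pick the same threshold, since the total variance of the image is fixed. The running totals play the role of $\omega_0$ and $P_0$.

```python
def otsu_threshold(pixels):
    """Return the threshold in 0..255 that maximizes the between-class variance.

    `pixels` is a flat iterable of integer gray levels in 0..255.
    Class 0 holds values below the threshold, class 1 the rest.
    """
    # Histogram of gray levels
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total_count = sum(hist)
    total_sum = sum(level * count for level, count in enumerate(hist))

    best_threshold, best_score = 0, 0.0
    count0, sum0 = 0, 0  # running pixel count and pixel-value total of class 0
    for th in range(1, 256):
        # Raising the threshold from th-1 to th moves gray level th-1 into class 0
        count0 += hist[th - 1]
        sum0 += (th - 1) * hist[th - 1]
        count1 = total_count - count0
        if count0 == 0 or count1 == 0:
            continue  # one class is empty, so there is nothing to separate
        mean0 = sum0 / count0
        mean1 = (total_sum - sum0) / count1
        score = count0 * count1 / total_count ** 2 * (mean0 - mean1) ** 2
        if score > best_score:
            best_threshold, best_score = th, score
    return best_threshold


# Hypothetical data: a dark cluster around 10 and a bright cluster around 200
print(otsu_threshold([10] * 50 + [200] * 50))
```

For clearly bimodal data like this, the returned threshold falls between the two clusters. Each pixel is visited once to build the histogram, so the 255 candidate thresholds are evaluated without rescanning the image.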
otsuBinarization.py
import cv2
import matplotlib.pyplot as plt
import numpy as np


def otsuBinarization(img):
    # Work on a grayscale copy so the input image is left untouched
    gray = cv2.cvtColor(img.copy(), cv2.COLOR_BGR2GRAY)
    # Best separation found so far and the threshold that produced it
    Max = 0
    t = 0
    # Average pixel value of the entire image (not used below)
    M = np.mean(gray)
    # Try all 256 threshold values
    for th in range(256):
        # Classification: class 0 is below the threshold, class 1 is at or above it
        g0, g1 = gray[gray < th], gray[gray >= th]
        # Number of pixels
        w0, w1 = len(g0), len(g1)
        # An empty class has no mean or variance, so skip this threshold
        if w0 == 0 or w1 == 0:
            continue
        # Pixel value variance
        s0_2, s1_2 = g0.var(), g1.var()
        # Pixel value average
        m0, m1 = g0.mean(), g1.mean()
        # Pixel value total (not used below)
        p0, p1 = g0.sum(), g1.sum()
        # Within-class variance
        sw_2 = w0 * s0_2 + w1 * s1_2
        # Between-class variance
        sb_2 = ((w0 * w1) / ((w0 + w1) * (w0 + w1))) * ((m0 - m1) * (m0 - m1))
        # Separation (infinite when both classes are perfectly uniform)
        if sw_2 != 0:
            X = sb_2 / sw_2
        else:
            X = float('inf')
        if Max < X:
            Max = X
            t = th
    # Binarization: below the threshold becomes 0, everything else 255
    return np.where(gray < t, 0, 255).astype(np.uint8)


# Image reading
img = cv2.imread('image.jpg')
if img is None:
    raise FileNotFoundError('image.jpg could not be read')
# Binarization of Otsu
mono = otsuBinarization(img)
# Save image
cv2.imwrite('result.jpg', mono)
# Image display
plt.imshow(mono, cmap='gray')
plt.show()
The left image is the input image, the center image is the output when the threshold is manually set to 128, and the right image is the output of this implementation. Even with the threshold determined automatically, the binarized image looks natural. As an aside, my implementation does not use the average pixel value M of the entire image.
If you have any questions, please feel free to contact me. imori_imori's GitHub has the official answer, so please check that as well. Also, I am a Python beginner, so please be patient with me and point out any mistakes in the comments.
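On the aside about M: the between-class variance can also be written using the whole-image mean M, and that form gives the same number as the one built from the two class means only. That is presumably why M can be left out. Below is a small pure-Python check of this; the function names and sample pixel values are made up for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def between_class_variance_with_M(g0, g1):
    """Between-class variance written with the whole-image mean M."""
    w0, w1 = len(g0), len(g1)
    M = mean(g0 + g1)
    return (w0 * (mean(g0) - M) ** 2 + w1 * (mean(g1) - M) ** 2) / (w0 + w1)


def between_class_variance_without_M(g0, g1):
    """Between-class variance written with the two class means only."""
    w0, w1 = len(g0), len(g1)
    return (w0 * w1) / (w0 + w1) ** 2 * (mean(g0) - mean(g1)) ** 2


# Hypothetical pixel values, already split into two classes by some threshold
g0 = [10, 20, 30]
g1 = [200, 220]
a = between_class_variance_with_M(g0, g1)
b = between_class_variance_without_M(g0, g1)
# The two forms agree up to floating-point rounding
print(abs(a - b) < 1e-9)
```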