Understand Fisher's linear discriminant analysis and be able to binarize images using it.
| Software | Version |
|---|---|
| Python | 3.4 or 3.5 |
| Pillow | 3.1 |
| NumPy | 1.10 |
| Matplotlib | 1.5.0 |
Fisher's linear discriminant finds the parameter $ w $ that maximizes the following objective function:
\begin{equation}
J(w) = \frac{w^{T}S_{B}w}{w^{T}S_{W}w}
\end{equation}
Suppose $ N $ data points are classified into $ k $ classes, with $ N_{i} $ $ (i = 0, \ldots, k-1) $ points in class $ i $. The within-class covariance matrix $ S_{W} $ is the weighted mean of the per-class covariance matrices $ S_{i} $:
\begin{equation}
S_{W}=\sum_{i=0}^{k-1}\frac{N_{i}}{N}S_{i}
\end{equation}
The between-class covariance matrix $ S_{B} $ is the covariance matrix of the class mean vectors $ \mu_{i} $ about the overall mean $ \mu $:
\begin{equation}
S_{B}=\sum_{i=0}^{k-1}\frac{N_{i}}{N}(\mu_{i}-\mu)^{T}(\mu_{i}-\mu)
\end{equation}
The sum of the within-class covariance matrix $ S_{W} $ and the between-class covariance matrix $ S_{B} $ equals the total covariance matrix $ S $ of the data:
\begin{align}
S_{W} + S_{B} &=\sum_{i=0}^{k-1}\frac{N_{i}}{N}S_{i}+\sum_{i=0}^{k-1}\frac{N_{i}}{N}(\mu_{i}-\mu)^{T}(\mu_{i}-\mu)\\
&=\frac{1}{N}\sum_{i=0}^{k-1}(N_{i}(\sum_{j \in C_{i}}\frac{x_{j}^{T}x_{j}}{N_{i}}-\mu_{i}^{T}\mu_{i})+N_{i}\mu_{i}^{T}\mu_{i})-\mu^{T}\mu\\
&=\frac{1}{N}\sum_{i=0}^{k-1}(\sum_{j \in C_{i}}x_{j}^{T}x_{j})-\mu^{T}\mu\\
&=\frac{1}{N}\sum_{l=0}^{N-1}x_{l}^{T}x_{l}-\mu^{T}\mu\\
&=S
\end{align}
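This identity can be checked numerically. Below is a minimal sketch with synthetic two-class data (the sample sizes, dimension, and class offset are illustrative assumptions, not from the original article); `bias=True` makes `np.cov` use the $ 1/N $ normalization assumed in the derivation above.

```python
import numpy as np

np.random.seed(0)
# Two synthetic classes of 2-D samples stored as row vectors (illustrative).
X0 = np.random.randn(40, 2)
X1 = np.random.randn(60, 2) + 3.0
X = np.vstack([X0, X1])
N = len(X)
mu = X.mean(axis=0)

S_W = np.zeros((2, 2))
S_B = np.zeros((2, 2))
for Xi in (X0, X1):
    Ni, mui = len(Xi), Xi.mean(axis=0)
    S_W += (Ni / N) * np.cov(Xi, rowvar=False, bias=True)  # (N_i/N) S_i
    S_B += (Ni / N) * np.outer(mui - mu, mui - mu)         # (N_i/N)(mu_i-mu)^T(mu_i-mu)

S = np.cov(X, rowvar=False, bias=True)  # total covariance S
print(np.allclose(S_W + S_B, S))        # True
```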
Replacing $ w $ with $ \alpha w $ in the objective function $ J(w) = \frac{w^{T}S_{B}w}{w^{T}S_{W}w} $ does not change its value, so we may normalize $ w $ so that $ w^{T}S_{W}w = 1 $. The problem of finding the $ w $ that maximizes $ J(w) $ therefore reduces to maximizing $ w^{T}S_{B}w $ under the constraint $ w^{T}S_{W}w = 1 $:
\begin{equation}
\mathop{\mathrm{arg\,max}}_{w}\left(w^{T}S_{B}w\right),
\hspace{3mm}\text{subject to}\hspace{3mm}
w^{T}S_{W}w=1
\end{equation}
Converting this to an equivalent minimization problem:
\begin{equation}
\mathop{\mathrm{arg\,min}}_{w}\left(-\frac{1}{2}w^{T}S_{B}w\right),
\hspace{3mm}\text{subject to}\hspace{3mm}
w^{T}S_{W}w=1
\end{equation}
Then, the Lagrangian becomes as follows.
\begin{equation}
L=-\frac{1}{2}w^{T}S_{B}w+\frac{1}{2}\lambda (w^{T}S_{W}w - 1)
\end{equation}
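Writing out the gradient of $ L $ with respect to $ w $ explicitly (using the identity $ \nabla_{w}(w^{T}Aw) = 2Aw $ for a symmetric matrix $ A $):
\begin{equation}
\frac{\partial L}{\partial w} = -S_{B}w + \lambda S_{W}w
\end{equation}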
Setting this gradient to $ 0 $ yields the following equation, known as a generalized eigenvalue problem.
\begin{equation}
S_{B}w=\lambda S_{W}w
\end{equation}
Assuming $ S_{W} $ is nonsingular, multiplying both sides from the left by its inverse $ S_{W}^{-1} $ gives an ordinary eigenvalue problem.
\begin{equation}
S_{W}^{-1}S_{B}w=\lambda w
\end{equation}
The eigenvector corresponding to the largest eigenvalue is adopted as $ w $: under the constraint $ w^{T}S_{W}w = 1 $, an eigenvector satisfies $ J(w) = w^{T}S_{B}w = \lambda w^{T}S_{W}w = \lambda $, so the largest eigenvalue maximizes the objective. The same selection criterion can also be obtained by considering the dual of this optimization problem.
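As a concrete illustration of this recipe, here is a minimal NumPy sketch (the matrices $ S_{W} $ and $ S_{B} $ below are made-up examples; in practice they are computed from labeled data as above):

```python
import numpy as np

# Illustrative covariance matrices; S_W is assumed symmetric and nonsingular.
S_W = np.array([[2.0, 0.3], [0.3, 1.0]])
S_B = np.array([[1.0, 0.5], [0.5, 0.5]])

# Reduce the generalized problem S_B w = lambda S_W w to an ordinary one.
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W).dot(S_B))

# Adopt the eigenvector of the largest eigenvalue as w.
w = eigvecs[:, np.argmax(eigvals.real)].real
print(w)
```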
Since grayscale pixel values are one-dimensional, the parameter $ w $ of Fisher's discriminant is a scalar and cancels out of the objective. Using the relationship between the within-class variance $ \sigma_{W} $, the between-class variance $ \sigma_{B} $, and the total variance $ \sigma = \sigma_{W} + \sigma_{B} $, the objective function $ J(w) $ derived above becomes:
\begin{equation}
J(w) = \frac{\sigma_{B}}{\sigma_{W}}=\frac{\sigma_{B}}{\sigma -\sigma_{B}}
\end{equation}
Since the total variance $ \sigma $ is constant, maximizing $ J(w) $ is equivalent to finding the threshold $ t $ that maximizes $ \sigma_{B} $. Letting $ N_{0} $ and $ \mu_{0} $ be the number and mean of the pixels below $ t $, and $ N_{1} $ and $ \mu_{1} $ those of the pixels at or above $ t $, $ \sigma_{B} $ can be calculated by the following equation.
\begin{equation}
\sigma_{B}=\frac{N_{0}}{N}(\mu_{0}-\mu)^2 + \frac{N_{1}}{N}(\mu_{1}-\mu)^{2}
\end{equation}
If the minimum and maximum pixel values of the grayscale image are $ x_{min} $ and $ x_{max} $, respectively, then $ x_{min} \leq t \leq x_{max} $. Compute $ \sigma_{B} $ for every such $ t $ and adopt the $ t $ that maximizes $ J(w) $ as the threshold.
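To make the formula concrete before the full implementation below, here is a minimal sketch that evaluates $ \sigma_{B} $ for a single candidate threshold (the pixel values and threshold are toy numbers chosen for illustration):

```python
import numpy as np

pixels = np.array([10, 12, 200, 202])  # toy grayscale values
t = 100                                # one candidate threshold
mu = pixels.mean()
below, above = pixels[pixels < t], pixels[pixels >= t]
sigma_b = (len(below) / len(pixels)) * (below.mean() - mu) ** 2 \
    + (len(above) / len(pixels)) * (above.mean() - mu) ** 2
print(sigma_b)  # 9025.0
```

The full program repeats this computation for every candidate $ t $ and keeps the best one.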
```python
# This file is part of "Junya's self learning project about digital image processing."
#
# "Junya's self learning project about digital image processing"
# is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# "Junya's self learning project about digital image processing"
# is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this project. If not, see <http://www.gnu.org/licenses/>.
#
# (c) Junya Kaneko <[email protected]>
from PIL import Image
import numpy as np
from matplotlib import pyplot as plt


def discriminant_analysis(image):
    # Flatten the image into a 1-D array of pixel values.
    _image = image.reshape(image.shape[0] * image.shape[1])
    ave = np.average(_image)
    t = j0 = 0
    # Try every candidate threshold between the minimum and maximum pixel values.
    for _t in range(np.min(_image) + 1, np.max(_image) + 1):
        blacks = _image[_image < _t]
        whites = _image[_image >= _t]
        # Between-class variance sigma_B for this threshold.
        sigma_b = len(blacks) * np.power(np.average(blacks) - ave, 2) / len(_image) \
            + len(whites) * np.power(np.average(whites) - ave, 2) / len(_image)
        # J = sigma_B / sigma_W, where sigma_W = sigma - sigma_B.
        j1 = sigma_b / (np.var(_image) - sigma_b)
        # Keep the threshold that maximizes J.
        if j0 < j1:
            t = _t
            j0 = j1
    return t


if __name__ == '__main__':
    # Load the image and convert it to 8-bit grayscale.
    image = np.array(Image.open('img/test.jpg').convert('L'), dtype=np.uint8)
    bin_image = np.array(image)
    # Find the threshold by discriminant analysis, then binarize pixel by pixel.
    t = discriminant_analysis(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            bin_image[i, j] = 255 if image[i, j] >= t else 0
    # Show the original and the binarized image side by side.
    plt.subplot(1, 2, 1)
    plt.title('Original')
    plt.imshow(image, cmap='Greys_r')
    plt.subplot(1, 2, 2)
    plt.title('Binary')
    plt.imshow(bin_image, cmap='Greys_r')
    plt.show()
```
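As an aside, the per-pixel double loop can be replaced by a single vectorized NumPy expression with the same result (a sketch, assuming `image` and `t` as in the script above):

```python
bin_image = np.where(image >= t, 255, 0).astype(np.uint8)
```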
The source code can be found on GitHub.
The execution result is shown in the figure below. The original image is from [Dictionary Wiki](http://ja.characters.wikia.com/wiki/%E3%83%95%E3%82%A1%E3%82%A4%E3%83%AB:%E4%B8%89%E4%BD%93%E7%BF%92%E5%AD%97%E3%83%BB%E6%A5%B7_-_%E6%88%91.jpg).