Image denoising

Overview

Implemented the methods from Chapter 14, "Image Noise Removal," of a sparse modeling textbook, and compared their performance.

Notebooks: ch14-01.ipynb, ch14-02.ipynb, ch14-03.ipynb, ch14-04.ipynb, ch14-05.ipynb

Results

ch14-denoise.png The numbers are [peak signal-to-noise ratio (PSNR)](https://ja.wikipedia.org/wiki/peak signal to noise ratio) values in dB.

K-SVD takes considerable computation time, yet NL-means and BM3D still performed better. There may be a problem with my implementation...

Method

Test image

Gaussian noise with σ = 20 was added to Barbara to create the test image (barbara_sig20.png), and noise was then removed by each method.
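The test-image setup can be sketched as follows. This is a minimal numpy illustration, not the notebook code; `add_gaussian_noise` is a name made up here.

```python
import numpy as np

def add_gaussian_noise(image, sigma=20.0, seed=0):
    """Add i.i.d. Gaussian noise with standard deviation sigma to an image."""
    rng = np.random.default_rng(seed)
    return image.astype(np.float64) + rng.normal(0.0, sigma, image.shape)

# Barbara itself is not bundled here, so exercise it on a flat image:
clean = np.full((64, 64), 128.0)
noisy = add_gaussian_noise(clean, sigma=20.0)  # sample std close to 20
```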

Wavelet shrinkage

A wavelet transform was applied to the image and the coefficients were hard-thresholded. Performance varied with the threshold. wavelet_shrinkage_threshold.png wavelet_shrinkage.png
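As a concrete illustration, here is a single-level orthonormal Haar version of this idea in plain numpy. The source does not say which wavelet or how many levels were used, so treat this as a sketch under those assumptions:

```python
import numpy as np

def haar2d(x):
    """One level of the orthonormal 2D Haar transform (even-sized input)."""
    lo = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2)   # row lowpass
    hi = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2)   # row highpass
    ll = (lo[0::2] + lo[1::2]) / np.sqrt(2)       # then filter columns
    lh = (lo[0::2] - lo[1::2]) / np.sqrt(2)
    hl = (hi[0::2] + hi[1::2]) / np.sqrt(2)
    hh = (hi[0::2] - hi[1::2]) / np.sqrt(2)
    return ll, lh, hl, hh

def ihaar2d(ll, lh, hl, hh):
    """Inverse of haar2d."""
    lo = np.empty((ll.shape[0] * 2, ll.shape[1]))
    hi = np.empty_like(lo)
    lo[0::2], lo[1::2] = (ll + lh) / np.sqrt(2), (ll - lh) / np.sqrt(2)
    hi[0::2], hi[1::2] = (hl + hh) / np.sqrt(2), (hl - hh) / np.sqrt(2)
    x = np.empty((lo.shape[0], lo.shape[1] * 2))
    x[:, 0::2], x[:, 1::2] = (lo + hi) / np.sqrt(2), (lo - hi) / np.sqrt(2)
    return x

def wavelet_hard_threshold(noisy, thresh):
    """Zero detail coefficients below thresh; keep the lowpass band."""
    ll, lh, hl, hh = haar2d(noisy)
    lh, hl, hh = (np.where(np.abs(b) > thresh, b, 0.0) for b in (lh, hl, hh))
    return ihaar2d(ll, lh, hl, hh)
```

Because the transform is orthonormal, a threshold of 0 reproduces the input exactly, and thresholding can only reduce the signal energy.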

Overlapping patch-based DCT shrinkage

8×8 patches were extracted from the image, each patch was DCT-transformed and hard-thresholded, and the overlapping patch reconstructions were averaged. Performance varied with the threshold.

dct_shrinkage_threshold.png dct_shrinkage.png
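The extract–transform–threshold–average loop above can be sketched in numpy with an explicit orthonormal DCT-II matrix. This is an illustrative re-implementation, not the notebook's code; the default threshold value is arbitrary:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (rows are basis vectors)."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] /= np.sqrt(2)
    return C * np.sqrt(2.0 / n)

def dct_shrinkage(noisy, patch=8, thresh=60.0):
    """Hard-threshold the 2D DCT of every overlapping patch, then average."""
    C = dct_matrix(patch)
    h, w = noisy.shape
    acc = np.zeros((h, w))
    cnt = np.zeros((h, w))
    for i in range(h - patch + 1):
        for j in range(w - patch + 1):
            p = noisy[i:i + patch, j:j + patch]
            coef = C @ p @ C.T                    # 2D DCT of the patch
            coef[np.abs(coef) < thresh] = 0.0     # hard threshold
            acc[i:i + patch, j:j + patch] += C.T @ coef @ C  # inverse DCT
            cnt[i:i + patch, j:j + patch] += 1    # overlap count for averaging
    return acc / cnt
```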

Shrinkage curve learning

Threshold processing can be regarded as a curve relating input values to output values. The optimal shrinkage curve was learned from pairs of patches with and without noise by polynomial fitting, minimizing $F_{\mathrm{local}}(S) = \sum_{k=1}^{M} \left\| p_{k}^{0} - A\,S\{A^{T}p_{k}\} \right\|_{2}^{2}$

Here S is the thresholding (shrinkage) operator, $A^{T}$ is the DCT transform, $p_{k}^{0}$ is the noise-free patch, and M is the total number of training pairs. The parameters of S (the polynomial coefficients) minimizing $F_{\mathrm{local}}$ were computed by least squares.

A shrinkage curve is learned for each DCT coefficient. The patch size was $6 \times 6$; since a non-redundant DCT is used, there are also $6 \times 6$ coefficients after the transform, giving 36 shrinkage curves.
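The per-coefficient least-squares fit can be sketched as below. The source only says "polynomial fitting"; parameterizing each curve as an odd polynomial $S(v) = \sum_j c_j v^{2j+1}$ (a common choice for shrinkage curves) is an assumption here, as is the function naming:

```python
import numpy as np

def fit_shrinkage_curves(noisy_coefs, clean_coefs, degree=3):
    """Fit, per DCT coefficient, an odd polynomial S(v) = sum_j c_j v^(2j+1)
    mapping noisy coefficients to clean ones by least squares.
    noisy_coefs, clean_coefs: shape (num_patches, num_coefs).
    Returns: polynomial coefficients, shape (num_coefs, degree)."""
    n_patches, n_coefs = noisy_coefs.shape
    curves = np.empty((n_coefs, degree))
    for m in range(n_coefs):
        v = noisy_coefs[:, m]
        # design matrix of odd powers: v, v^3, v^5, ...
        X = np.stack([v ** (2 * j + 1) for j in range(degree)], axis=1)
        curves[m], *_ = np.linalg.lstsq(X, clean_coefs[:, m], rcond=None)
    return curves

def apply_curve(values, curve):
    """Evaluate a learned shrinkage curve on an array of coefficient values."""
    powers = np.stack([values ** (2 * j + 1) for j in range(len(curve))], axis=-1)
    return powers @ curve
```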

Training data for shrinkage curve learning

Patches were extracted from a $200 \times 200$ region of Lena (lena_200_200.png) and used as training data. Pixel values were standardized by subtracting 127 and dividing by 128.

Results

The curve in each cell is the learned shrinkage curve for the corresponding DCT coefficient. c_local.png recon_dct_shrinkage_curve.png

Global shrinkage curve learning

$F_{\mathrm{global}}(S) = \left\| y_{0} - \frac{1}{n}\sum_{k=1}^{M} R_{k}^{T} A S\{A^{T}p_{k}\} \right\|_{2}^{2}$

The shrinkage-curve parameters minimizing $F_{\mathrm{global}}$ are found, where $R_{k}$ is the operator that extracts the k-th patch from the image. c_global.png recon_dct_global_shrinkage_curve.png The slope of the learned shrinkage curve is almost 0, but it seems usable for the time being... Since the DC component becomes 0, it is rescaled in post-processing. (The implementation may be wrong...)

OMP denoising with a redundant DCT dictionary

8×8 patches were extracted from the image and sparse-coded by OMP over the redundant DCT dictionary. The overlapping patch reconstructions were averaged, and the result was combined with the noisy image by a weighted average.

The number of non-zero elements in the sparse representation obtained by OMP was $k_{0} = 4$, with OMP tolerance $\epsilon = 8^{2} \times 20^{2} \times 1.15$. The weighted average used a weight of 0.5 for the noisy image and a weight of 1 for the reconstructed image.
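A minimal OMP for a single patch, written against the stopping rules above (at most $k_0$ atoms, or squared residual below the tolerance $\epsilon$), could look like this. It is a generic textbook OMP, not the notebook's implementation:

```python
import numpy as np

def omp(D, y, k0=4, eps=None):
    """Orthogonal Matching Pursuit over dictionary D (columns assumed
    l2-normalized). Stops after k0 atoms or when the squared residual
    drops below eps (the tolerance in the text is of this squared form)."""
    residual = y.astype(float).copy()
    support = []
    x = np.zeros(D.shape[1])
    for _ in range(k0):
        if eps is not None and residual @ residual <= eps:
            break
        idx = int(np.argmax(np.abs(D.T @ residual)))   # most correlated atom
        support.append(idx)
        # least-squares fit on the current support (the "orthogonal" step)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        x = np.zeros(D.shape[1])
        x[support] = coef
        residual = y - D[:, support] @ coef
    return x
```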

Redundant DCT dictionary: an $8 \times 8$ patch is represented by $16 \times 16$ atoms. A_DCT.png recon_dct_dictionary.png
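One common way to build such a dictionary (as in the K-SVD literature) is a separable overcomplete DCT: a $8 \times 16$ 1D overcomplete DCT, combined with itself via a Kronecker product into a $64 \times 256$ matrix. Whether the notebooks use exactly this construction is an assumption:

```python
import numpy as np

def overcomplete_dct_dictionary(patch=8, atoms_1d=16):
    """Separable redundant DCT dictionary: patch^2 x atoms_1d^2
    (64 x 256 for the defaults)."""
    n = np.arange(patch)[:, None]
    k = np.arange(atoms_1d)[None, :]
    D1 = np.cos(np.pi * n * k / atoms_1d)      # 1D overcomplete DCT atoms
    D1[:, 1:] -= D1[:, 1:].mean(axis=0)        # remove DC from non-constant atoms
    D1 /= np.linalg.norm(D1, axis=0)           # l2-normalize each atom
    return np.kron(D1, D1)                     # separable 2D dictionary
```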

OMP denoising with a K-SVD dictionary

Patches were extracted from the noisy image and a dictionary was learned by K-SVD. Using the learned dictionary, denoising proceeded in the same way as above.

K-SVD dictionary A_KSVD_sig20.png recon_ksvd_dictionary.png
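The core of K-SVD is the atom-by-atom dictionary update: each atom is replaced by the best rank-1 approximation (via SVD) of the residual restricted to the signals that actually use it. A generic sketch of one such sweep, not the notebook's code:

```python
import numpy as np

def ksvd_dictionary_update(D, X, Y):
    """One K-SVD sweep over all atoms.
    D: (n, K) dictionary, X: (K, M) sparse codes, Y: (n, M) training signals.
    Returns updated copies of D and X; the fit ||Y - DX|| never worsens."""
    D, X = D.copy(), X.copy()
    for j in range(D.shape[1]):
        users = np.nonzero(X[j])[0]        # signals whose code uses atom j
        if users.size == 0:
            continue
        # residual of those signals with atom j's contribution removed
        E = Y[:, users] - D @ X[:, users] + np.outer(D[:, j], X[j, users])
        U, s, Vt = np.linalg.svd(E, full_matrices=False)
        D[:, j] = U[:, 0]                  # best rank-1 factor: new atom
        X[j, users] = s[0] * Vt[0]         # matching updated coefficients
    return D, X
```

In the full algorithm this sweep alternates with a sparse-coding stage (e.g. OMP) that recomputes X.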

NL-means

The well-known NL-means of Buades et al. From the perspective of dictionary learning, NL-means can be seen as an extreme form of dictionary learning, with a different dictionary for every pixel.

Consider a search window centered on the pixel of interest. The set of patches centered on each pixel in the window can be regarded as a dictionary. The coefficient of each atom is computed from the squared error against the patch centered on the pixel of interest (the patch of interest), giving an approximation of that patch.

From this point of view, ideas can be carried over to improve both dictionary learning and NL-means. recon_nlm.png
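A single-pixel NL-means estimate, following the description above (patch squared errors turned into weights over the search window), might look like this. Parameter values and the exponential weight kernel are the standard Buades et al. choices, assumed rather than taken from the notebooks:

```python
import numpy as np

def nlmeans_pixel(img, i, j, patch=3, search=7, h=10.0):
    """Denoise pixel (i, j) as a weighted average over the search window,
    with weights from squared patch distances to the patch of interest."""
    r, s = patch // 2, search // 2
    pad = r + s
    p = np.pad(img.astype(float), pad, mode='reflect')
    ci, cj = i + pad, j + pad
    ref = p[ci - r:ci + r + 1, cj - r:cj + r + 1]   # patch of interest
    num = den = 0.0
    for di in range(-s, s + 1):
        for dj in range(-s, s + 1):
            q = p[ci + di - r:ci + di + r + 1, cj + dj - r:cj + dj + r + 1]
            w = np.exp(-np.sum((ref - q) ** 2) / (h * h))  # patch similarity
            num += w * p[ci + di, cj + dj]
            den += w
    return num / den
```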

BM3D

The well-known BM3D of Dabov et al., the strongest method against Gaussian noise. BM3D can also be viewed from the perspective of dictionary learning.

Block matching (BM) collects patches similar to the patch of interest within a search window and stacks them into a 3D group. The group is transformed (wavelet, DCT, etc.) and denoised by hard thresholding or Wiener shrinkage (collaborative filtering).

This leads to structured dictionary learning and to combinations of clustering with dictionary learning. recon_bm3d.png
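The block-matching step that forms the 3D group can be sketched as follows; window and group sizes are illustrative, not BM3D's actual defaults:

```python
import numpy as np

def block_matching(img, i, j, patch=8, search=16, n_similar=8):
    """Collect the n_similar patches most similar (in squared error) to the
    patch at (i, j) within a search window, stacked into a 3D group."""
    h, w = img.shape
    ref = img[i:i + patch, j:j + patch]
    candidates = []
    for di in range(max(0, i - search), min(h - patch, i + search) + 1):
        for dj in range(max(0, j - search), min(w - patch, j + search) + 1):
            p = img[di:di + patch, dj:dj + patch]
            candidates.append((np.sum((ref - p) ** 2), di, dj))
    candidates.sort(key=lambda t: t[0])      # most similar first
    return np.stack([img[di:di + patch, dj:dj + patch]
                     for _, di, dj in candidates[:n_similar]])
```

The reference patch itself always sorts first (distance 0), so the group is never empty; the full algorithm would then transform and jointly shrink this stack.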

Summary

References

text.jpg
