Image processing by matrix calculation alone, without relying on an image processing library: grayscale conversion. Also works with Pythonista
**See the Basics article for the fundamentals.**
Instead of relying on OpenCV or Pillow, I implement various image processing operations myself using numpy and matplotlib. This combination also works in the iOS app Pythonista.
import numpy as np
import matplotlib.pyplot as plt
In addition, the following function is convenient for displaying images. (See Basics for details.)
def img_show(img : np.ndarray, cmap = 'gray', vmin = 0, vmax = 255, interpolation = 'none') -> None:
    '''Display an image given as an np.ndarray.'''
    # Clip to [vmin, vmax] and set dtype to uint8
    img = np.clip(img, vmin, vmax).astype(np.uint8)
    # Display the image
    plt.imshow(img, cmap = cmap, vmin = vmin, vmax = vmax, interpolation = interpolation)
    plt.show()
    plt.close()
Grayscale conversion computes a single black-and-white value Y from the RGB values assigned to each pixel. Here I also try the various grayscale methods that were not covered in Basics. See the link for a detailed explanation of each method. They are treated here in the same order.
The image used is 'tiger.jpeg'.
img = plt.imread('tiger.jpeg')
R, G, B = img[...,0], img[...,1], img[...,2]
A function for showing the color image and the grayscale result side by side is defined here.
def align_show(img_gray):
    # Convert img_gray (N*M) to N*M*3 by copying it into three channels
    img_pseudogray = np.einsum('ij,k->ijk', img_gray, [1,1,1])
    # Display the color image and the grayscale image side by side
    img_show(np.concatenate((img, img_pseudogray), axis = 1))
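For readers unfamiliar with `np.einsum`, here is a minimal sketch of what the pseudo-gray conversion in `align_show` does. The 2x2 array `gray` is a made-up stand-in for a real grayscale image, and the `np.stack` comparison is only there to illustrate the result; it is not part of the original code.

```python
import numpy as np

# Hypothetical 2x2 grayscale image, used only for illustration
gray = np.array([[0.0, 64.0],
                 [128.0, 255.0]])

# Outer product with [1, 1, 1]: pseudo[i, j, k] = gray[i, j] * 1
pseudo = np.einsum('ij,k->ijk', gray, [1, 1, 1])

# The same N*M*3 array built by stacking three copies along a new last axis
stacked = np.stack([gray, gray, gray], axis=-1)

print(pseudo.shape)                     # (2, 2, 3)
print(np.array_equal(pseudo, stacked))  # True
```

Because all three channels hold the same value, the result can be concatenated with the color image and still looks gray.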
The mid-value method: $\mathrm{Y} = \frac{\max(R, G, B) + \min(R, G, B)}{2}$. In short, it is the average of the maximum and minimum values. In the actual calculation, each term is divided by 2 before the addition, rather than adding first as in the formula, to avoid overflow.
img_mid_v = np.max(img, axis = 2)/2 + np.min(img, axis = 2)/2
align_show(img_mid_v)
At first glance there is no problem, but the button (?) in the middle of the maze is hard to see.
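The overflow concern for this method can be sketched with a single made-up pixel. This assumes the image is stored as `uint8`, which is what `plt.imread` normally returns for a JPEG; the names `pixel`, `wrapped` and `safe` are introduced only for this illustration.

```python
import numpy as np

# Hypothetical single bright pixel stored as uint8
pixel = np.array([[[250, 200, 10]]], dtype=np.uint8)

hi = np.max(pixel, axis=2)   # 250, still uint8
lo = np.min(pixel, axis=2)   # 10, still uint8

# Adding first: 250 + 10 = 260 does not fit in uint8 and wraps around to 4
wrapped = (hi + lo) / 2

# Dividing first: each half becomes a float, so the sum stays correct
safe = hi / 2 + lo / 2

print(wrapped)  # [[2.]]
print(safe)     # [[130.]]
```

Dividing each term by 2 first converts the values to floats before they are added, so the sum no longer has to fit in 8 bits.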
The NTSC weighted average method: $\mathrm{Y} = 0.298912 R + 0.586611 G + 0.114478 B$. These coefficients account for how strongly each of R, G, and B affects the human eye (psychological weighting).
img_ntsc = (0.298912 * R + 0.586611 * G + 0.114478 * B)
align_show(img_ntsc)
Looks good.
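To get a feel for what the weighting does, the sketch below applies the same coefficients to three made-up pixels (pure red, pure green, pure blue). It writes the weighted sum as a matrix product, which is an alternative way of writing the formula and not the original code; `weights` and `primaries` are names chosen for this illustration.

```python
import numpy as np

# NTSC coefficients quoted in the text
weights = np.array([0.298912, 0.586611, 0.114478])

# Hypothetical 1x3 image: one pure red, one pure green, one pure blue pixel
primaries = np.array([[[255, 0, 0],
                       [0, 255, 0],
                       [0, 0, 255]]], dtype=np.uint8)

# Matrix product over the last axis: Y = 0.298912 R + 0.586611 G + 0.114478 B
y = primaries @ weights

# Roughly 76.2 for red, 149.6 for green, 29.2 for blue
print(y.round(1))
```

Green comes out brightest and blue darkest, which is the weighting the coefficients are meant to express.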
The HDTV method: $\mathrm{Y} = \left((0.222015 R)^X + (0.706655 G)^X + (0.071330 B)^X\right)^{1/X}$. This also incorporates psychological weighting.
X = 2.2
img_hdtv = ((0.222015*R)**X + (0.706655*G)**X + (0.071330*B)**X)**(1/X)
align_show(img_hdtv)
I can hardly tell the difference from the NTSC method.
The simple average method: $\mathrm{Y} = \frac{R + G + B}{3}$. Probably the most intuitive approach. It can be seen as the NTSC method without the weighting.
img_mean = np.mean(img, axis = 2)
align_show(img_mean)
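As a small sketch of the per-pixel average, the example below uses three made-up pixels (pure red, green and blue). It also shows how the `axis` argument of `np.mean` changes the result; the array `primaries` is hypothetical and only serves the illustration.

```python
import numpy as np

# Hypothetical 1x3 image: pure red, pure green, pure blue
primaries = np.array([[[255, 0, 0],
                       [0, 255, 0],
                       [0, 0, 255]]], dtype=np.uint8)

# Averaging over the channel axis gives one value per pixel
per_pixel = np.mean(primaries, axis=2)

# Without axis, np.mean collapses the whole image into a single number
overall = np.mean(primaries)

print(per_pixel)                          # [[85. 85. 85.]]
print(per_pixel.shape, np.ndim(overall))  # (1, 3) 0
```

All three primaries end up at the same gray level, which is the difference from a weighted method.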
The G channel method extracts only the G channel: $\mathrm{Y} = G$. It is probably the fastest.
img_g_channel = G
align_show(img_g_channel)
I wonder if the red part is a little too dark ...
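One plausible reason this method is cheap: basic indexing in NumPy returns a view, so taking the G channel involves neither arithmetic nor a copy of the pixel data. The sketch below checks this on a made-up random image; `rgb` and `g_view` are hypothetical names, and this is an illustration rather than a benchmark.

```python
import numpy as np

# Hypothetical 4x4 RGB image filled with random values
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)

# Basic indexing returns a view: no arithmetic and no copy of the pixel data
g_view = rgb[..., 1]

print(np.shares_memory(g_view, rgb))  # True
print(g_view.dtype, g_view.shape)     # uint8 (4, 4)
```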
The median method takes the median of the three channels: $\mathrm{Y} = \mathrm{median}(R, G, B)$. The idea feels similar to the mid-value method.
img_median = np.median(img, axis = 2)
align_show(img_median)
As with the mid-value method, the green is too dark.
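The relationship between the median and the max/min average used earlier (`img_mid_v`) can be sketched as follows. For exactly three values, the median is whatever remains after removing the maximum and the minimum from the sum. The random array `rgb` is made up for the illustration and is cast to float so the sum cannot overflow.

```python
import numpy as np

# Hypothetical random RGB image, cast to float to avoid uint8 overflow in the sum
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(8, 8, 3)).astype(np.float64)

median = np.median(rgb, axis=2)

# For exactly three values: median = total - max - min
via_identity = rgb.sum(axis=2) - rgb.max(axis=2) - rgb.min(axis=2)

# For comparison, the max/min average uses the two values the median discards
mid_value = rgb.max(axis=2) / 2 + rgb.min(axis=2) / 2

print(np.array_equal(median, via_identity))  # True
```

So the two methods use complementary information: one keeps only the extremes, the other keeps only the value in between.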
Mid-value | NTSC | HDTV |
---|---|---|
The original image | | |
Simple average | G channel | Median |