** DeNA 20 New graduate Advent Calendar 2019 article. ** **
I will be a web engineer at DeNA from April next year, but I usually do research on light at graduate school. When I briefly introduce my research, I change the shape of the laser and shoot it, or use light to measure objects (By the way, when I display my Qiita account image on something called SLM and hit the laser, The laser branches into 100).
In this article, since I am studying light, I would like to introduce a ** "single pixel camera" ** that can be easily implemented using Python (anything can be implemented if matrix calculation is possible). So you can easily implement it in MATLAB or Octave)
Before talking about the principle of a single pixel camera, let's start with the mechanism of a normal camera. To put it bluntly, a normal camera uses a CCD, which is lined with a lot of ** detectors (which can detect the intensity of light) **.
For example, if you have a camera that can shoot 100 x 100 pixel images, 100 x 100 detectors are lined up.
Then, the light reflected from the object is detected by each detector, and the intensity of the detected light is converted into a pixel value to create an image.
Then, what about a single-pixel camera? Only one detector can take an image. To say a little more ** "Camera (method) that can reconstruct an image by matrix calculation after measuring multiple times using one detector" ** (It's called a single pixel because it can be shot with one detector).
I think that most people don't get the actual image even if they listen to this much. It's okay. The mechanism of the one-pixel camera itself is really simple. ~~ The reason why I chose it as the material for the article seems to be easy to explain. .. .. ~~
The figure below shows the actual shooting with a single pixel camera. Here, Scene is the subject of photography. The lens is used to collect the light in the detector. As shown in the above figure, all the light coming from the object passes through the mask and is collected and detected by one detector.
The mask pattern that suddenly appeared here. This is important. Thanks to this guy, the one-pixel camera can shoot with only one detector.
** "Detect the light intensity after the light coming from the object passes through the mask pattern" (the order of the mask and the shooting target does not matter). ** **
By changing the mask pattern prepared in advance one after another and repeating the above work, shooting with a single pixel camera is performed.
Here, the state of the i-th light intensity detection Assuming that the mask pattern (encoding matrix) is Wi, the shooting target is X, and the detection value is Yi, Wi can be transformed into (1, N) and X can be transformed into (N, 1) as shown below.
It looks like. Here, N is the number of pixels of the mask, and this N is the number of pixels to be photographed as it is.
As shown in the above figure, the i-th measurement result Yi is the inner product of Wi and X.
One line of W and Y is the mask and measurement result used in each measurement time, respectively. Now X, W, Y
It is in the state.
And Y
This is the shooting principle of a single pixel camera.
One of the merits of using a single-pixel camera for this kind of work is that you can shoot even with extremely weak light **.
The reason why it is possible to shoot even with extremely weak light is that a single pixel camera ** collects and detects light in one detector **.
For example, if $ 10 $ of light is coming from the subject, a 10x10 pixel camera has a separate detector for each pixel, so the amount of light that can be detected by each detector is $ 10 / (10x10) $. However, since the one-pixel camera collects the light in one detector and detects it, the amount of light remains at 10 (although it actually passes through the mask, so it drops to 4 or 5).
Therefore, it is possible to shoot even with an amount of light that cannot be shot with a normal camera.
The disadvantage is that it takes multiple measurements to take a single image. Therefore, it is not suitable for capturing "a moment" or for shooting movies.
Now, let's finally implement the flow of a single pixel camera with Pyhon. Well, even if you say implementation, as mentioned in the principle, you are only doing matrix calculation, so you just let Python do the calculation.
First, import the library to be used. Use numpy
to handle matrices and ʻopenCV` to handle images.
import numpy as np
import cv2
After that, the image to be shot is read. Also, obtain the vertical and horizontal sizes of the image, and calculate the number of pixels N of the mask from it.
X = cv2.imread("img/target.jpg ",0)
h,w = X.shape
N = h*w
Also, prepare a process to display the image as a show function.
def show(img):
img = (img/np.max(img))*255
img = img.astype("uint8")
cv2.imshow("img", img)
cv2.waitKey(0)
Let's display the loaded image
show(X)
This time, we will use a 16x16 pixel cross image.
Next, prepare a mask. This time, it is assumed that M times are measured with a random black-and-white mask with N pixels (M = N).
M = N
W = np.random.rand(M, N)
W = np.where(W>0.5, 1, 0)
show(W)
print(W.shape)
Did the image appear and the print result 256,256
?
This means measuring 256 times with a 256-pixel mask.
Let's also check the mask used in the first measurement
show(W[0].reshape(h,w))
Y = []
for mask in W[:M]:
#i-th measurement
yi = np.dot(mask.reshape(1,N), X.reshape(N,1))
Y.append(yi)
"""
Same meaning
Y = np.dot(W,X.reshape(N,1))
"""
Y = np.array(Y).reshape(M,1)
print(Y.shape)
Is it displayed as (256, 1)? This means that there are 256 measurement results. with this
pinv
of numpy
to calculate the inverse matrix
InvW = np.linalg.pinv(W[:M])
print(InvW.shape)
Now, use these ʻInvW and
Y` to get and display X. Note that X is reconstructed in the state of (N, 1), so if you want to display it as an image, you need to reshape it using (h, w).
rec = np.dot(InvW, Y)
print(rec.shape)
show(rec.reshape(h,w))
If you can get the original image safely, you are successful. The above is the flow of a single pixel camera. You can see that it can be implemented with insanely simple code. In an actual experiment, the mask used for W and the detector detection value for Y can be used to reconstruct the image target X in the same way as above.
By the way, if you reduce the number of measurements M here, you will not be able to acquire images well. This is because the number of measurements usually needs to be M = N. However, using a technology called ** compressed sensing **, this number of measurements M can be significantly reduced. Usually, the number of measurements is reduced by combining this technology.
I will not touch on compressed sensing because it will be long if I explain it, but if you are interested, please implement it. This commentary is easy to understand. http://www-adsys.sys.i.kyoto-u.ac.jp/mohzeki/Presentation/lecturenote20160822.pdf
Thank you for your continued support of DeNA 20 New Graduate Advent Calendar 2019. Then: D
Recommended Posts