Introduction

**This is an article for the DeNA 20 New Graduate Advent Calendar 2019.**

I will join DeNA as a web engineer next April, but at graduate school I do research on light. Briefly, my research involves changing the shape of a laser beam before firing it, and using light to measure objects. (By the way, when I display my Qiita account image on a device called an SLM and shine a laser on it, the laser splits into 100 beams.)

Since I study light, in this article I would like to introduce the **"single pixel camera"**, which can be implemented easily in Python. (Anything that can do matrix calculations will work, so you can implement it just as easily in MATLAB or Octave.)

Single pixel camera

Ordinary camera

Before talking about the principle of a single pixel camera, let's start with how a normal camera works. Put simply, a normal camera uses a CCD, on which a large number of **detectors (elements that can detect the intensity of light)** are lined up.

For example, if you have a camera that can shoot 100 x 100 pixel images, 100 x 100 detectors are lined up. Then, the light reflected from the object is detected by each detector, and the intensity of the detected light is converted into a pixel value to create an image.

Principle of single pixel camera

Then, what about a single pixel camera? It can take an image with only one detector. To be a little more precise, it is a **"camera (method) that reconstructs an image by matrix calculation after measuring multiple times with one detector"** (it is called single pixel because it shoots with a single detector).

I suspect most people cannot picture how it actually works from this description alone. That's okay. The mechanism of the single pixel camera itself is really simple. ~~The real reason I chose it as the topic of this article is that it seemed easy to explain...~~

The figure below shows how shooting with a single pixel camera actually works. Here, Scene is the subject being photographed, and the lens is used to collect the light onto the detector. As the figure shows, all the light coming from the object passes through the mask and is then collected and detected by one detector.

Notice the mask pattern that suddenly appeared here. It is important: thanks to it, the single pixel camera can shoot with only one detector.

A single measurement is the following: **"detect the light intensity after the light coming from the object has passed through the mask pattern" (the order of the mask and the shooting target does not matter).**

Shooting with a single pixel camera consists of repeating this measurement while switching, one after another, through mask patterns prepared in advance.

Consider the i-th light intensity measurement. Let the mask pattern (encoding matrix) be $W_i$, the shooting target be $X$, and the detected value be $Y_i$. If $W_i$ is reshaped to (1, N) and $X$ to (N, 1), the measurement looks like this:

[Figure: 内積説明.jpg (inner product illustration)]

Here, N is the number of pixels in the mask, and this N is also the number of pixels in the captured image.

As the figure shows, the i-th measurement result $Y_i$ is the inner product of $W_i$ and $X$, which can be written as $W_i \cdot X = Y_i$. The result of performing this measurement M times can be written as a matrix product of shape $(M, N) \times (N, 1) = (M, 1)$, as in the following equation (usually, the target can be completely reconstructed when N = M).

[Figure: 行列.jpg (matrix equation)]
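As a small sanity check that the M individual inner products and the single matrix product are the same thing, here is a minimal sketch. The 2x2 target and the four masks are values I made up by hand for illustration; they are not from a real measurement.

```python
import numpy as np

# Hypothetical 2x2 target (N = 4 pixels) and M = 4 hand-made binary masks
X = np.array([[1, 2],
              [3, 4]])
N = X.size
W = np.array([[1, 0, 0, 0],
              [1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1]])

x = X.reshape(N, 1)  # flatten the target to shape (N, 1)

# One detector reading per mask: Y_i = W_i . X
Y_loop = np.array([W[i].reshape(1, N) @ x for i in range(len(W))]).reshape(-1, 1)

# All readings at once: (M, N) x (N, 1) = (M, 1)
Y = W @ x

print(Y.ravel())  # [1 3 5 7]
```

Each entry of Y is just the sum of the target pixels that the corresponding mask lets through.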

Each row of W and of Y is the mask used in one measurement and the result of that measurement, respectively. At this point, X, W, and Y are in the following state:

X: Observation target ... what we want to capture. We do not know what it is
W: Encoding matrix ... known, because we prepared it ourselves
Y: Detected values ... known, because they are what we measured

Y can be expressed as the matrix product of W and X: $WX = Y$. How can we find out what X looks like from here? It's easy. All you have to do is multiply both sides from the left by the inverse matrix $W^{-1}$ of W. Then $X = W^{-1}Y$, and we have obtained X.
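To make $X = W^{-1}Y$ concrete, here is a minimal sketch with a 4-pixel example. The masks and the detector readings are made-up values for illustration; `np.linalg.solve` is shown alongside as an alternative that avoids forming the inverse explicitly.

```python
import numpy as np

# Hypothetical masks (one per row) and hypothetical detector readings
W = np.array([[1, 0, 0, 0],
              [1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1]], dtype=float)
Y = np.array([[1.0], [3.0], [5.0], [7.0]])

X_rec = np.linalg.inv(W) @ Y      # X = W^{-1} Y
X_solve = np.linalg.solve(W, Y)   # same answer without forming the inverse

print(X_rec.reshape(2, 2))        # recovered 2x2 target
```

This only works because this particular W is square and invertible, which is the M = N case discussed above.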

This is the shooting principle of a single pixel camera.

Advantages and disadvantages

One advantage of going to the trouble of using a single pixel camera is that **you can shoot even in extremely weak light**.

The reason is that a single pixel camera **collects all the light onto one detector and detects it there**.

For example, suppose a light amount of $10$ is coming from the subject. A 10x10 pixel camera has a separate detector for each pixel, so the amount of light each detector receives is only $10 / (10 \times 10) = 0.1$. A single pixel camera, however, collects the light onto one detector, so the amount of light stays at 10 (although in practice the light passes through the mask, so it drops to about 4 or 5).
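The arithmetic can be written out as a tiny sketch. The value `open_fraction = 0.5` is my assumption (a mask that lets roughly half of the light through); real masks and optics will differ.

```python
total_light = 10.0   # light arriving from the subject (arbitrary units)
pixels = 10 * 10     # number of detectors in a 10x10 pixel camera

# Normal camera: the light is shared among all detectors
per_detector = total_light / pixels

# Single pixel camera: one detector gets everything the mask lets through
open_fraction = 0.5  # assumed fraction of the mask that is transparent
single_detector = total_light * open_fraction

print(per_detector, single_detector)  # 0.1 5.0
```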

Therefore, it is possible to shoot even with an amount of light that cannot be shot with a normal camera.

The disadvantage is that it takes multiple measurements to take a single image. Therefore, it is not suitable for capturing "a moment" or for shooting movies.

Simulation in Python

Now, let's finally implement the flow of a single pixel camera in Python. That said, as explained in the principle section, all we are doing is matrix calculation, so we simply let Python do the math.

First, import the libraries we will use: `numpy` to handle matrices and OpenCV (`cv2`) to handle images.

import numpy as np
import cv2

Next, read the image to be shot. Also get the height and width of the image, and compute the number of mask pixels N from them.

X = cv2.imread("img/target.jpg", 0)  # 0 = read as grayscale
h,w = X.shape
N = h*w

Also, define a `show` function for displaying images.

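def show(img):
    # Scale so that the brightest value becomes 255, then convert to 8-bit
    img = (img/np.max(img))*255
    img = img.astype("uint8")
    cv2.imshow("img", img)
    # Wait for a key press before continuing
    cv2.waitKey(0)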

Let's display the loaded image.

show(X)

This time, we will use a 16x16 pixel cross image.
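If you do not have a suitable image file at hand, a stand-in can be generated with numpy. This is only a sketch: the thickness and position of the bars are my assumptions and will not match the original image exactly.

```python
import numpy as np

# Stand-in for the target image: a white cross on a black 16x16 background
h, w = 16, 16
X = np.zeros((h, w), dtype=np.uint8)
X[6:10, :] = 255   # horizontal bar (thickness is an arbitrary choice)
X[:, 6:10] = 255   # vertical bar

N = h * w          # number of mask pixels
print(X.shape, N)  # (16, 16) 256
```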

Next, prepare the masks. This time, we assume M measurements with random black-and-white masks of N pixels each (M = N).

M = N
W = np.random.rand(M, N)
W = np.where(W>0.5, 1, 0)
show(W)
print(W.shape)

Did the image appear, and did the print output (256, 256)? This means measuring 256 times with 256-pixel masks.

Let's also check the mask used in the first measurement.
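Reconstruction by the inverse matrix only works if the random W actually is invertible. One way to check is to look at its rank, as in the sketch below. The fixed seed and the use of `np.random.default_rng` are my choices to make the run repeatable; they are not part of the code above.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (an assumption) for repeatability
N = 256
M = N

# Random black-and-white masks, one per row
W = np.where(rng.random((M, N)) > 0.5, 1, 0)

# Full rank (rank == N) means W is invertible,
# so M = N measurements determine X uniquely
rank = np.linalg.matrix_rank(W)
print(rank == N)
```

For a random 0/1 matrix of this size a rank deficiency is extremely unlikely, but if it did happen you could simply draw new masks.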

show(W[0].reshape(h,w))

Next, let's get the measurement results Y. Take the masks out of W one at a time, take the inner product with the shooting target, and append it to Y. Computing `Y = np.dot(W, X.reshape(N, 1))` would give everything in one shot, but this time we take the values one at a time, as in a real measurement. When doing so, don't forget to change the shape of X from (h, w) to (N, 1).

Y = []
for mask in W[:M]:
    # i-th measurement
    yi = np.dot(mask.reshape(1,N), X.reshape(N,1))
    Y.append(yi)
"""
Same meaning
Y = np.dot(W,X.reshape(N,1))
"""

Y = np.array(Y).reshape(M,1)
print(Y.shape)

Is (256, 1) displayed? This means that there are 256 measurement results. With this, $WX = Y$ is complete and the measurement has finished successfully. Next, let's reconstruct X using W and Y. As the equation $X = W^{-1}Y$ showed, obtaining X requires $W^{-1}$, the inverse matrix of W. So we use numpy's `pinv` to compute it. Strictly speaking, `pinv` returns the pseudo-inverse, which is the same as the inverse when W is square and invertible.

InvW = np.linalg.pinv(W[:M])
print(InvW.shape)
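As a side note on `pinv`, the following sketch checks on a small hand-made matrix (not the W above) that `np.linalg.pinv` agrees with `np.linalg.inv` for an invertible matrix, and shows that `pinv` also accepts a non-square matrix.

```python
import numpy as np

# Small invertible matrix, chosen by hand for illustration
A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])

# For a square, invertible matrix the pseudo-inverse is the inverse
same = np.allclose(np.linalg.pinv(A), np.linalg.inv(A))
print(same)     # True

# pinv also works when there are fewer rows than columns (M < N)
P = np.linalg.pinv(A[:2])
print(P.shape)  # (3, 2)
```

This is presumably why `pinv` is convenient here: the same line keeps working when `W[:M]` has fewer rows than columns, where `inv` would raise an error.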

Now, use `InvW` and `Y` to obtain and display X. Note that X is reconstructed with shape (N, 1), so to display it as an image you need to reshape it to (h, w).

rec = np.dot(InvW, Y)
print(rec.shape)
show(rec.reshape(h,w))

If you get the original image back, you have succeeded. That is the whole flow of a single pixel camera. You can see that it can be implemented with insanely simple code. In an actual experiment, you use the masks you displayed as W and the values read from the detector as Y, and reconstruct the target X in exactly the same way.

By the way, if you reduce the number of measurements M here, the image can no longer be recovered well. This is because the number of measurements usually needs to be M = N. However, with a technique called **compressed sensing**, the number of measurements M can be reduced significantly. In practice, single pixel cameras are usually combined with this technique to cut down the number of measurements.

I will not go into compressed sensing because explaining it would take too long, but if you are interested, please try implementing it. This explanation is easy to understand. http://www-adsys.sys.i.kyoto-u.ac.jp/mohzeki/Presentation/lecturenote20160822.pdf
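The effect of reducing M can be checked numerically with a self-contained sketch like the one below. The helper `reconstruction_error`, the generated cross image, and the fixed seed are all my assumptions for illustration; the reconstruction itself is the same `pinv` approach as above, with no compressed sensing.

```python
import numpy as np

def reconstruction_error(M, seed=0):
    """Max absolute pixel error when reconstructing from only M measurements."""
    rng = np.random.default_rng(seed)
    h, w = 16, 16
    X = np.zeros((h, w))
    X[6:10, :] = 255   # stand-in cross image (bar sizes are arbitrary)
    X[:, 6:10] = 255
    N = h * w

    W = np.where(rng.random((N, N)) > 0.5, 1, 0)[:M]  # keep only M masks
    Y = W @ X.reshape(N, 1)                           # simulated measurements
    rec = np.linalg.pinv(W) @ Y                       # minimum-norm solution
    return np.max(np.abs(rec.reshape(h, w) - X))

# Compare the full measurement (M = N = 256) with reduced ones
for M in (256, 192, 128, 64):
    print(M, reconstruction_error(M))
```

With M = N the error should be down at the level of floating-point rounding, while for M < N the system is underdetermined and `pinv` can only return the minimum-norm solution, which generally differs from the true image.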

This is the end of my article.

Please continue to follow the DeNA 20 New Graduate Advent Calendar 2019. See you :D