Introduction

First of all, here is the finished result.

Improved version of PTAM mentioned earlier
Qiita Rewrite everything https://t.co/5DbJhH49vj pic.twitter.com/1tkkSQDsVH
- Yodaka no Moyashi (@night_moyashi), January 19, 2021

As you can see, it recognizes 3D space from the image.

Algorithm

Simply put, it uses a **vector difference**. What does that mean?

First, find the feature points on the image and match the two images. Then, assuming that the image plane is in real space, find the distance between the center of the image plane and the camera.

Next, using each feature point's coordinates (with the center of the image as the origin) and the distance between that origin and the camera found above, derive for each of the two images the vector to the feature point with the camera as the origin.

This is the preparation.
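The preparation step can be sketched roughly as follows. This is only an illustration and not the `convert_3d` function from the original program, which is not shown in this article. The function names, the use of a horizontal field of view to get the camera-to-image-plane distance, and the choice to flip the y axis are all assumptions made for this sketch.

```python
import math


def focal_length_px(image_width, horizontal_fov_deg):
    """Distance from the camera to the center of the image plane, in pixels.

    Assumes a simple pinhole model and a known horizontal field of view.
    """
    return (image_width / 2) / math.tan(math.radians(horizontal_fov_deg) / 2)


def pixel_to_vector(x, y, image_width, image_height, horizontal_fov_deg):
    """Camera-origin vector through pixel (x, y) (hypothetical helper)."""
    f = focal_length_px(image_width, horizontal_fov_deg)
    # Move the origin to the image center; flip y so that "up" is positive.
    return (x - image_width / 2, image_height / 2 - y, f)


# The center pixel of a 640x480 image points straight along the optical axis.
center_vector = pixel_to_vector(320, 240, 640, 480, 90)
```

Applying such a function to the matched feature point in each image would give the two vectors used in the rest of this section.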

Of the vectors derived so far, let the vector from the first image be

\vec{a}=(a_1,a_2,a_3)

the vector from the second image be

\vec{b}=(b_1,b_2,b_3)

and the vector by which the camera moved between the two images be

\vec{c}=(c_1,c_2,c_3)

Then the intersection $O$ of the lines along $\vec{a}$ and $\vec{b}$ and the respective shooting points $A(\vec{a})$ and $B(\vec{b})$ lie in the same plane, so

-l\vec{a}=\vec{c}-k\vec{b}\\
k\vec{b}-l\vec{a}=\vec{c}

holds. Writing this out by component gives

\left\{
\begin{array}{l}
c_1=kb_1-la_1\\
c_2=kb_2-la_2\\
c_3=kb_3-la_3
\end{array}
\right.

These simultaneous equations hold, but if you try to solve them directly in matrix form,

\begin{pmatrix}
b_{1}&-a_{1}\\
b_{2}&-a_{2}\\
b_{3}&-a_{3}
\end{pmatrix}
\begin{pmatrix}
k\\
l
\end{pmatrix}
=
\begin{pmatrix}
c_{1}\\
c_{2}\\
c_{3}
\end{pmatrix}\\
\begin{pmatrix}
k\\
l
\end{pmatrix}
=
\begin{pmatrix}
b_{1}&-a_{1}\\
b_{2}&-a_{2}\\
b_{3}&-a_{3}
\end{pmatrix}^{-1}
\begin{pmatrix}
c_{1}\\
c_{2}\\
c_{3}
\end{pmatrix}

you find that

\begin{pmatrix}
b_{1}&-a_{1}\\
b_{2}&-a_{2}\\
b_{3}&-a_{3}
\end{pmatrix}

is not square and therefore has no inverse, so the equations cannot be solved this way. Therefore, the simultaneous equations are transformed by adding the first and second equations, and the second and third.

\left\{
\begin{array}{l}
(c_1+c_2)=k(b_1+b_2)-l(a_1+a_2)\\
(c_2+c_3)=k(b_2+b_3)-l(a_2+a_3)
\end{array}
\right.

As a result, $k$ and $l$ can be derived using a matrix.

$k\vec{b}$ is then the vector from $B$ to the point $O$, which gives its distance from $B$.
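As a small worked check of the derivation above, the sketch below builds toy vectors that satisfy $k\vec{b}-l\vec{a}=\vec{c}$ with $k=l=2$, shows that NumPy refuses to invert the 3x2 coefficient matrix, and then solves the reduced 2x2 system. The function name `solve_k_l` and the toy values are assumptions made for illustration; they are not taken from the original program.

```python
import numpy as np


def solve_k_l(a, b, c):
    """Solve k*b - l*a = c using the reduced 2x2 system (rows 1+2 and 2+3)."""
    a, b, c = (np.asarray(v, dtype=float) for v in (a, b, c))
    M = np.array([
        [b[0] + b[1], -(a[0] + a[1])],
        [b[1] + b[2], -(a[1] + a[2])],
    ])
    rhs = np.array([c[0] + c[1], c[1] + c[2]])
    k, l = np.linalg.solve(M, rhs)
    return k, l


# Toy data (assumed): chosen so that 2*b - 2*a = c exactly.
a = [0.15, 0.1, 1.0]
b = [0.45, 0.1, 1.0]
c = [0.6, 0.0, 0.0]

# The full 3x2 coefficient matrix is not square, so np.linalg.inv rejects it.
full = np.column_stack([b, np.negative(a)])
try:
    np.linalg.inv(full)
    inverse_exists = True
except np.linalg.LinAlgError:
    inverse_exists = False

k, l = solve_k_l(a, b, c)
point_from_B = k * np.asarray(b)  # k * b, the vector from B to the point
```

With this toy data the reduced system recovers $k=l=2$, so `point_from_B` comes out as roughly (0.9, 0.2, 2.0). Note that the reduced system can itself become singular for some vectors, in which case `np.linalg.solve` raises an error.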

Program

The program is available on GitHub. For now, here is the core part of the code.

`match.py`


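import numpy as np


def convert(data):
    # data holds the matched feature point for each image (first image, second image).
    # convert_3d is not shown in this excerpt; it turns a point into a camera-origin vector.
    data_converted = [convert_3d(loc) for loc in data]
    a, b = data_converted[0], data_converted[1]
    # Reduced 2x2 system with both sides multiplied by -1:
    #   -k(b1+b2) + l(a1+a2) = -(c1+c2)
    #   -k(b2+b3) + l(a2+a3) = -(c2+c3)
    A = np.array([
        [-sum(b[:2]), sum(a[:2])],
        [-sum(b[1:]), sum(a[1:])],
    ])
    # Hard-coded camera movement: c1+c2 = 0.6 and c2+c3 = 0, e.g. c = (0.6, 0, 0).
    Y = np.array([-0.6, 0.0])
    k, l = np.linalg.solve(A, Y)

    # k * b is the position of the feature point as seen from the second image.
    return b[0] * k, b[1] * k, b[2] * k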

Problem

As the person who made this, I have to point out its problems. As you can see in the tweet at the beginning, outliers are unavoidable. The accuracy of $\vec{c}$ seems to have a strong influence on these outliers.

This problem occurs because the method **assumes that they are in the same plane**. In other words, outliers seem to appear if $\vec{c}$ is off even slightly.
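To make the same-plane assumption concrete, the following sketch measures how far a given $\vec{c}$ lies from the plane spanned by $\vec{a}$ and $\vec{b}$. It swaps in a least-squares fit over all three equations (`np.linalg.lstsq`), which is not what the original program does, and it is meant only as a diagnostic, not as a fix. The function name and toy values are assumptions.

```python
import numpy as np


def coplanarity_residual(a, b, c):
    """Least-squares fit of k*b - l*a = c over all three equations.

    Returns (k, l, residual_norm). The residual is zero only when c lies
    in the plane spanned by a and b.
    """
    a, b, c = (np.asarray(v, dtype=float) for v in (a, b, c))
    full = np.column_stack([b, -a])  # 3x2 coefficient matrix
    (k, l), *_ = np.linalg.lstsq(full, c, rcond=None)
    residual = np.linalg.norm(full @ np.array([k, l]) - c)
    return k, l, residual


# Toy data (assumed): 2*b - 2*a equals the first c exactly.
a = [0.15, 0.1, 1.0]
b = [0.45, 0.1, 1.0]

_, _, r_exact = coplanarity_residual(a, b, [0.6, 0.0, 0.0])
# Shift c slightly out of the plane spanned by a and b.
_, _, r_shifted = coplanarity_residual(a, b, [0.6, 0.05, 0.0])
```

Here `r_exact` is numerically zero while `r_shifted` is not, so a large residual for a matched pair could serve as a hint that the same-plane assumption does not hold for it.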

I do not yet know how to fix this. I will update this article as soon as it is fixed.

Well, that is the whole story. See you next time.

A story about making 3D space recognition with Python
