I tried to move a 3D model with something like motion capture using just a laptop + webcam

Overview

After reading the original article, I thought that with a reasonably lightweight pose estimation model, motion capture should be possible with just a laptop and a webcam (the built-in camera), so I tried it. The flow is almost the same as in the original article.

Requirements

Procedure

Python side

1. Preparation of the pose estimation model

Clone the following repository: https://github.com/ildoonet/tf-pose-estimation/tree/master With it, you can estimate the pose of a person in a 2D image.

2. Restoration of 3D information

In order to do something like motion capture, 3D information is needed.

The model from step 1 can only produce two-dimensional information. Therefore, use the processing in the devel branch of the same repository to get the 3D information. (It looks like it was originally in master, but it has since been removed.)

Clone the following branch: https://github.com/ildoonet/tf-pose-estimation/tree/devel Then move the devel/src/lifting folder into master.
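If you prefer to script the folder move, here is a minimal sketch. The function name `copy_lifting` is my own, and the destination (the top level of the master checkout, so that `lifting` sits next to server.py) is an assumption based on the `from lifting.prob_model import ...` line and the `lifting/models/...` path used in server.py later in this article. It copies rather than moves, so the devel checkout stays intact.

```python
import shutil
from pathlib import Path


def copy_lifting(devel_root, master_root):
    """Copy <devel_root>/src/lifting to <master_root>/lifting and return the new path.

    Both arguments are paths to local checkouts; the layout of the
    destination is an assumption, not something the repository documents.
    """
    src = Path(devel_root) / "src" / "lifting"
    dst = Path(master_root) / "lifting"
    if not src.is_dir():
        raise FileNotFoundError(f"lifting folder not found: {src}")
    # dirs_exist_ok (Python 3.8+) lets the copy be re-run without failing.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

For example, `copy_lifting("tf-pose-estimation-devel", "tf-pose-estimation-master")`, where both directory names are hypothetical names for your two checkouts.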

3. Preparation of WebSocket server

In this system, the heavy processing such as estimating the person's pose is done on the Python side, and the Unity side only displays the 3D model. Communication between Python and Unity is implemented with WebSocket this time.

Run the following command:

pip install git+https://github.com/Pithikos/python-websocket-server

server.py


import logging

import cv2
import json
import numpy as np
import common

from tf_pose.estimator import TfPoseEstimator
from tf_pose.networks import get_graph_path, model_wh
from websocket_server import WebsocketServer
from lifting.prob_model import Prob3dPose

PORT = 5000
HOST = '127.0.0.1'

# logger_setup
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter(' %(module)s -  %(asctime)s - %(levelname)s - %(message)s'))
logger.addHandler(handler)


def create_json(pose3d):
    global old_data

    data = {'body_parts': []}

    """
    // 0 :Hip
    // 1 :RHip
    // 2 :RKnee
    // 3 :RFoot
    // 4 :LHip
    // 5 :LKnee
    // 6 :LFoot
    // 7 :Spine
    // 8 :Thorax
    // 9 :Neck/Nose
    // 10:Head
    // 11:LShoulder
    // 12:LElbow
    // 13:LWrist
    // 14:RShoulder
    // 15:RElbow
    // 16:RWrist
    """

    for i in range(17):
        data['body_parts'].append({'id': i, 'x': pose3d[0][0][i], 'y': pose3d[0][2][i], 'z': pose3d[0][1][i]})

    old_data = data
    return data


def new_client(client, server):
    logger.info('New client {}:{} has connected.'.format(client['address'][0], client['address'][1]))


def client_left(client, server):
    logger.info('Client {}:{} has left.'.format(client['address'][0], client['address'][1]))


def message_received(client, server, message):
    _, image = cam.read()

    humans = e.inference(image, resize_to_default=(w > 0 and h > 0), upsample_size=4.0)

    pose_2d_mpiis = []
    visibilities = []

    standard_w = 640
    standard_h = 480

    try:
        pose_2d_mpii, visibility = common.MPIIPart.from_coco(humans[0])
        pose_2d_mpiis.append([(int(x * standard_w + 0.5), int(y * standard_h + 0.5)) for x, y in pose_2d_mpii])
        visibilities.append(visibility)
        pose_2d_mpiis = np.array(pose_2d_mpiis)
        visibilities = np.array(visibilities)
        transformed_pose2d, weights = poseLifting.transform_joints(pose_2d_mpiis, visibilities)
        pose_3d = poseLifting.compute_3d(transformed_pose2d, weights)
        print(pose_3d)
        server.send_message(client, json.dumps(create_json(pose_3d)))

    except Exception:
        # No person detected or the 3D lifting failed: resend the last good pose.
        server.send_message(client, json.dumps(old_data))


if __name__ == '__main__':
    # main
    w, h = model_wh("432x368")
    e = TfPoseEstimator(get_graph_path("mobilenet_thin"), target_size=(432, 368), trt_bool=False)
    poseLifting = Prob3dPose('lifting/models/prob_model_params.mat')

    cam = cv2.VideoCapture(0)

    old_data = {}

    server = WebsocketServer(port=PORT, host=HOST)
    server.set_fn_new_client(new_client)
    server.set_fn_client_left(client_left)
    server.set_fn_message_received(message_received)
    server.run_forever()

Now the Python side is ready.
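To make the message format easier to see, here is a small stand-alone sketch of what server.py sends back for each request. It runs without TensorFlow or a camera. The names `build_payload` and `respond` are hypothetical, and the explicit `float()` conversion is my addition to keep the values JSON-serializable whatever numeric type the lifting step returns. The axis order (the second and third axes are swapped when building y and z) and the "resend the last good payload" fallback follow `create_json` and `message_received` above.

```python
import json

NUM_JOINTS = 17  # number of joints listed in create_json


def build_payload(pose3d):
    """Build the message body from data indexed as pose3d[person][axis][joint]."""
    first = pose3d[0]  # only the first detected person is sent
    parts = []
    for i in range(NUM_JOINTS):
        parts.append({
            "id": i,
            "x": float(first[0][i]),
            "y": float(first[2][i]),  # axis 2 of the lifted pose is sent as y
            "z": float(first[1][i]),  # axis 1 of the lifted pose is sent as z
        })
    return {"body_parts": parts}


def respond(pose3d, last_payload):
    """Return (json_message, payload).

    When pose3d is None (estimation failed), the last good payload is resent.
    """
    if pose3d is not None:
        last_payload = build_payload(pose3d)
    return json.dumps(last_payload), last_payload
```

On the Unity side, this JSON maps directly onto the `BodyParts` and `Position` classes: one `body_parts` array whose entries carry `id`, `x`, `y`, and `z`.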

Unity side

1. Preparation of 3D model

First, prepare the 3D model you want to move. This time, I used the "Unity-Chan!" model from the Asset Store.

2. Installation of required libraries

Clone SAFullBodyIK and move it to the Assets folder. Also, clone and build https://github.com/sta/websocket-sharp so that Unity can communicate over WebSocket. The following article explains how to build it in an easy-to-follow way: https://qiita.com/oishihiroaki/items/bb2977c72052f5dd5bd9

3. Unity side code

I borrowed the code from the article that this one is based on.

IKSetting.cs


using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;
using UnityEngine;
using WebSocketSharp;
using WebSocketSharp.Net;

public class IKSetting : MonoBehaviour {
    private BodyParts bodyParts;
    private string receivedJson;
    private WebSocket ws;

    [SerializeField, Range(10, 120)]
    float FrameRate;
    public List<Transform> BoneList = new List<Transform>();
    GameObject FullbodyIK;
    Vector3[] points = new Vector3[17];
    Vector3[] NormalizeBone = new Vector3[12];
    float[] BoneDistance = new float[12];
    float Timer;
    int[,] joints = new int[,] { { 0, 1 }, { 1, 2 }, { 2, 3 }, { 0, 4 }, { 4, 5 }, { 5, 6 }, { 0, 7 }, { 7, 8 }, { 8, 9 }, { 9, 10 }, { 8, 11 }, { 11, 12 }, { 12, 13 }, { 8, 14 }, { 14, 15 }, { 15, 16 } };
    int[,] BoneJoint = new int[,] { { 0, 2 }, { 2, 3 }, { 0, 5 }, { 5, 6 }, { 0, 9 }, { 9, 10 }, { 9, 11 }, { 11, 12 }, { 12, 13 }, { 9, 14 }, { 14, 15 }, { 15, 16 } };
    int[,] NormalizeJoint = new int[,] { { 0, 1 }, { 1, 2 }, { 0, 3 }, { 3, 4 }, { 0, 5 }, { 5, 6 }, { 5, 7 }, { 7, 8 }, { 8, 9 }, { 5, 10 }, { 10, 11 }, { 11, 12 } };
    int NowFrame = 0;

    float[] x = new float[17];
    float[] y = new float[17];
    float[] z = new float[17];

    bool isReceived = false;

    // Use this for initialization
    void Start () {

        ws = new WebSocket("ws://localhost:5000/");
        ws.OnOpen += (sender, e) =>
        {
            Debug.Log("WebSocket Open");
        };
        ws.OnMessage += (sender, e) =>
        {
            receivedJson = e.Data;
            Debug.Log("Data: " + e.Data);
            isReceived = true;
        };
        ws.OnError += (sender, e) =>
        {
            Debug.Log("WebSocket Error Message: " + e.Message);
        };
        ws.OnClose += (sender, e) =>
        {
            Debug.Log("WebSocket Close");
        };
        ws.Connect();

        ws.Send("");
    }

    // Update is called once per frame
    void Update () {

        Timer += Time.deltaTime;
        ws.Send("");

        if (Timer > (1 / FrameRate))
        {
            Timer = 0;
            PointUpdate();
        }
        if (!FullbodyIK)
        {
            IKFind();
        }
        else
        {
            IKSet();
        }
    }

    void OnDestroy()
    {
        ws.Close();
        ws = null;
    }

    void PointUpdate()
    {
        if (NowFrame < 600)
        {
            NowFrame++;
            if (isReceived)
            {
                bodyParts = JsonUtility.FromJson<BodyParts>(receivedJson);
                for (int i = 0; i < 17; i++)
                {
                    x[i] = bodyParts.body_parts[i].x;
                    y[i] = bodyParts.body_parts[i].y;
                    z[i] = bodyParts.body_parts[i].z;
                }

                isReceived = false;
            }

            for (int i = 0; i < 17; i++)
            {
                points[i] = new Vector3(x[i], y[i], -z[i]);
                Debug.Log(points[i]);
            }

            for (int i = 0; i < 12; i++)
            {
                NormalizeBone[i] = (points[BoneJoint[i, 1]] - points[BoneJoint[i, 0]]).normalized;
            }
        }
    }

    void IKFind()
    {
        FullbodyIK = GameObject.Find("FullBodyIK");
        if (FullbodyIK)
        {
            for (int i = 0; i < Enum.GetNames(typeof(OpenPoseRef)).Length; i++)
            {
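                // Check the GameObject itself before touching .transform,
                // otherwise a missing object throws a NullReferenceException.
                GameObject obj = GameObject.Find(Enum.GetName(typeof(OpenPoseRef), i));
                if (obj)
                {
                    BoneList.Add(obj.transform);
                }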
            }
            for (int i = 0; i < Enum.GetNames(typeof(NormalizeBoneRef)).Length; i++)
            {
                BoneDistance[i] = Vector3.Distance(BoneList[NormalizeJoint[i, 0]].position, BoneList[NormalizeJoint[i, 1]].position);
            }
        }
    }

    void IKSet()
    {
        if (Math.Abs(points[0].x) < 1000 && Math.Abs(points[0].y) < 1000 && Math.Abs(points[0].z) < 1000)
        {
            BoneList[0].position = points[0] * 0.001f + Vector3.up * 0.8f;
        }
        for (int i = 0; i < 12; i++)
        {
            BoneList[NormalizeJoint[i, 1]].position = Vector3.Lerp(
                BoneList[NormalizeJoint[i, 1]].position,
                BoneList[NormalizeJoint[i, 0]].position + BoneDistance[i] * NormalizeBone[i]
                , 0.05f
            );
            DrawLine(BoneList[NormalizeJoint[i, 0]].position, BoneList[NormalizeJoint[i, 1]].position, Color.red);
        }
        for (int i = 0; i < joints.Length / 2; i++)
        {
            DrawLine(points[joints[i, 0]] * 0.001f + new Vector3(-1, 0.8f, 0), points[joints[i, 1]] * 0.001f + new Vector3(-1, 0.8f, 0), Color.blue);
        }
    }

    void DrawLine(Vector3 s, Vector3 e, Color c)
    {
        Debug.DrawLine(s, e, c);
    }
}

enum OpenPoseRef
{
    Hips,
    LeftKnee, LeftFoot,
    RightKnee, RightFoot,
    Neck, Head,
    RightArm, RightElbow, RightWrist,
    LeftArm, LeftElbow, LeftWrist
};

enum NormalizeBoneRef
{
    Hip2LeftKnee, LeftKnee2LeftFoot,
    Hip2RightKnee, RightKnee2RightFoot,
    Hip2Neck, Neck2Head,
    Neck2RightArm, RightArm2RightElbow, RightElbow2RightWrist,
    Neck2LeftArm, LeftArm2LeftElbow, LeftElbow2LeftWrist
};

[System.Serializable]
public class BodyParts
{
    public Position[] body_parts;
}

[System.Serializable]
public class Position
{
    public int id;
    public float x;
    public float y;
    public float z;
}

That's all for preparation.
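The key step in IKSet is the retargeting: for each bone, only the direction comes from the estimated pose, while the length comes from the 3D model, and the joint is moved a small fraction of the way toward the resulting target on every frame. The sketch below restates that step in plain Python so it can be tried outside Unity. It is not Unity code, and the helper names (`normalized`, `lerp`, `retarget_child`) are hypothetical. I assume the usual behavior of the Unity calls it mirrors: a zero vector stays zero when normalized, and the interpolation factor is clamped to the range 0 to 1.

```python
import math


def normalized(v):
    """Return v scaled to unit length; a zero vector stays zero."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        return (0.0, 0.0, 0.0)
    return tuple(c / length for c in v)


def lerp(a, b, t):
    """Linear interpolation from a to b, with t clamped to [0, 1]."""
    t = max(0.0, min(1.0, t))
    return tuple(x + (y - x) * t for x, y in zip(a, b))


def retarget_child(parent_pos, child_pos, bone_length, src_parent, src_child, t=0.05):
    """Move the model's child joint toward the pose-driven target.

    parent_pos, child_pos: current joint positions on the 3D model
    bone_length: distance between those joints, measured once on the model
    src_parent, src_child: the same two joints in the estimated pose
    """
    # Direction from the estimated pose (scale of the estimate is discarded).
    direction = normalized(tuple(c - p for p, c in zip(src_parent, src_child)))
    # Target keeps the model's own bone length.
    target = tuple(p + bone_length * d for p, d in zip(parent_pos, direction))
    return lerp(child_pos, target, t)
```

With t = 0.05 the joint covers 5% of the remaining distance per call, which smooths out jitter in the estimate at the cost of some lag.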

Run

After executing server.py, press the Play button on the Unity side.

Summary

It ran faster than I expected, although there was some lag. Lighter models are available if you look for them, so there seems to be room for improvement.
