Image (left) quoted from: Sumire Uesaka Official Blog, "Nekomori Rally"
PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization
Roughly speaking, it is
**a machine learning model that generates a 3D model of a clothed person from a single image**.
We introduce Pixel-aligned Implicit Function (PIFu), a highly effective implicit representation that locally aligns pixels of 2D images with the global context of their corresponding 3D object. Using PIFu, we propose an end-to-end deep learning method for digitizing highly detailed clothed humans that can infer both 3D surface and texture from a single image, and optionally, multiple input images. Highly intricate shapes, such as hairstyles, clothing, as well as their variations and deformations can be digitized in a unified way.
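To make the idea concrete, here is a minimal, unofficial sketch of the pixel-aligned implicit function f(F(x), z) -> s: a 3D query point is projected onto the image, an image feature is sampled at that pixel, and an MLP predicts occupancy from the feature plus depth. The encoder and layer sizes here are stand-ins of my own (the paper uses a stacked hourglass network), not the repository's actual architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelAlignedImplicitFunction(nn.Module):
    # Sketch of f(F(x), z) -> occupancy; not the repository's architecture.
    def __init__(self, feat_dim=256):
        super().__init__()
        # Stand-in image encoder; the paper uses a stacked hourglass network.
        self.encoder = nn.Conv2d(3, feat_dim, kernel_size=7, padding=3)
        # MLP over [pixel-aligned feature, depth].
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 1, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, 1), nn.Sigmoid(),
        )

    def forward(self, image, points):
        # image: (B, 3, H, W); points: (B, N, 3), x/y in [-1, 1] image coords
        # (orthographic camera), z the depth along the camera axis.
        feat = self.encoder(image)                             # (B, C, H, W)
        xy = points[:, :, :2].unsqueeze(2)                     # (B, N, 1, 2)
        sampled = F.grid_sample(feat, xy, align_corners=True)  # (B, C, N, 1)
        sampled = sampled.squeeze(-1).transpose(1, 2)          # (B, N, C)
        z = points[:, :, 2:3]                                  # (B, N, 1)
        return self.mlp(torch.cat([sampled, z], dim=-1))       # occupancy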
Installation is simple.
$ git clone https://github.com/shunsukesaito/PIFu.git
$ cd PIFu
$ pip install -r requirements.txt
$ sh ./scripts/download_trained_model.sh
PIFu comes with sample data, so it is easy to get working.
$ sh ./scripts/test.sh
Doing so will output the file results/pifu_demo/result_ryota.obj.
MeshLab is recommended for viewing the 3D models. The model PIFu outputs has no texture image; instead it is colored with vertex colors, and few viewers can display vertex-colored models properly. MeshLab handles them well, which is why I recommend it.
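If you want to sanity-check the vertex colors from a script instead, here is a minimal sketch using the trimesh library (my choice of tool, not part of PIFu; install with pip install trimesh, plus pyglet for the viewer window):

import trimesh

# Load the demo result and confirm it carries per-vertex RGBA colors.
mesh = trimesh.load("results/pifu_demo/result_ryota.obj")
print(mesh.visual.vertex_colors.shape)  # (number of vertices, 4)
mesh.show()  # simple built-in viewer; MeshLab is still the better option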
To generate a 3D model with PIFu, you need to prepare two things: the input image and its mask image.
This time I use a free stock photo from Pakutaso (www.pakutaso.com): a full-body shot of a young man in glasses and a yukata with his hand tucked into his sleeve.
Since the original image is portrait-oriented, add bands to make it a square image. Let's call this kimono.png.
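For reference, a minimal Pillow sketch of the padding step (the source filename kimono_original.jpg is just a placeholder):

from PIL import Image

# Pad a portrait photo to a square by centering it on a white canvas.
img = Image.open("kimono_original.jpg")  # placeholder filename
side = max(img.size)
canvas = Image.new("RGB", (side, side), (255, 255, 255))
# Paste centered so equal bands are added on both sides.
canvas.paste(img, ((side - img.width) // 2, (side - img.height) // 2))
canvas.save("kimono.png")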
Then generate a mask image. Let's call this kimono_mask.png.
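How you make the mask is up to you (any photo editor will do). As one example, if you have already cut the subject out onto a transparent background, a Pillow sketch like this converts the alpha channel into a white-on-black binary mask (kimono_cutout.png is a placeholder):

from PIL import Image

# Turn a cut-out with a transparent background into a binary mask:
# subject -> white, background -> black.
cutout = Image.open("kimono_cutout.png").convert("RGBA")  # placeholder
alpha = cutout.split()[-1]
mask = alpha.point(lambda a: 255 if a > 128 else 0).convert("L")
mask.save("kimono_mask.png")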
**The name is important here: be sure to append _mask to the mask image's filename.**
Then create a kimono/ folder and copy the two files into it.
$ mkdir kimono/
$ cp kimono.png kimono/
$ cp kimono_mask.png kimono/
Create a file scripts/eval.sh with the following content.
scripts/eval.sh
#!/usr/bin/env bash
set -ex
# Training
GPU_ID=0
DISPLAY_ID=$((GPU_ID*10+10))
NAME='pifu_demo'
# Network configuration
BATCH_SIZE=1
MLP_DIM='257 1024 512 256 128 1'
MLP_DIM_COLOR='513 1024 512 256 128 3'
TEST_FOLDER_PATH=$1
shift
# Reconstruction resolution
# NOTE: one can change here to reconstruct mesh in a different resolution.
VOL_RES=$1
shift
CHECKPOINTS_NETG_PATH='./checkpoints/net_G'
CHECKPOINTS_NETC_PATH='./checkpoints/net_C'
# command
CUDA_VISIBLE_DEVICES=${GPU_ID} python ./apps/eval.py \
--name ${NAME} \
--batch_size ${BATCH_SIZE} \
--mlp_dim ${MLP_DIM} \
--mlp_dim_color ${MLP_DIM_COLOR} \
--num_stack 4 \
--num_hourglass 2 \
--resolution ${VOL_RES} \
--hg_down 'ave_pool' \
--norm 'group' \
--norm_color 'group' \
--test_folder_path ${TEST_FOLDER_PATH} \
--load_netG_checkpoint_path ${CHECKPOINTS_NETG_PATH} \
--load_netC_checkpoint_path ${CHECKPOINTS_NETC_PATH}
Finally, run
$ sh scripts/eval.sh kimono/ 256
and results/pifu_demo/result_kimono.obj will be generated.
There is also what I call "modified PIFu". This is a version of PIFu I hacked together to get high-quality textures. (I coined the name only to distinguish it from the original.) It relies on a workaround and produces some slightly odd results; there are various circumstances behind that, which I explain later.
Left: original image Middle: default PIFu Right: modified PIFu
It lives in a branch called 2_phase_generate in my fork of the PIFu repository.
https://github.com/kotauchisunsun/PIFu/tree/2_phase_generate
On this branch you can generate output with scripts/eval_two_phase.sh.
Usage is as follows:
./scripts/eval_two_phase.sh IMAGE_DIR/ VOXEL_RESOLUTION VOXEL_LOAD_SIZE TEX_LOAD_SIZE
IMAGE_DIR/ is the directory containing the images. For VOXEL_RESOLUTION, around 512 or 1024 is recommended; at 1024 it consumes roughly 20 GB of memory, so match it to your machine. VOXEL_LOAD_SIZE should stay fixed at 512. Set TEX_LOAD_SIZE to 1024 or 2048 according to the texture resolution you want. With these settings you should get a model with a high-quality texture.
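For the kimono/ example above, an invocation following those recommendations would look like this (the parameter values are just the ones suggested above):
$ ./scripts/eval_two_phase.sh kimono/ 512 512 1024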
So, which part of this is questionable? In short, it **relies on unofficial behavior**. See the pull request for details, but originally VOXEL_LOAD_SIZE and TEX_LOAD_SIZE are not supposed to be set to anything other than 512. The awkward part is that when I set TEX_LOAD_SIZE to 1024, **a beautiful model came out anyway**. At first I assumed that an invalid TEX_LOAD_SIZE would either crash the program or shatter the texture, so I only patched things roughly, yet the output came out clean. So I opened a pull request, but apparently this usage was never intended. And in fact, the texture on the back side is rather mangled.
Left: default PIFu Right: modified PIFu
The author's response was, in effect: if you want a high-quality texture, why not simply project the source image onto the mesh? PIFu does in fact have a texture-projection feature, but from reading the code it cannot output at high resolution, so I think some modification is unavoidable.
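For the curious, here is a rough sketch of what "just project it" means, using numpy and trimesh (my own assumed setup, not PIFu's actual projection code): each vertex is mapped back into the source photo under an orthographic camera, assuming the mesh's x/y coordinates span [-1, 1] over the image plane, and is colored by the pixel it lands on. This naive version ignores occlusion, so back-facing vertices get front colors.

import numpy as np
import trimesh
from PIL import Image

# Naive orthographic back-projection: color each vertex from the photo.
mesh = trimesh.load("results/pifu_demo/result_kimono.obj")
image = np.asarray(Image.open("kimono/kimono.png").convert("RGB"))
h, w = image.shape[:2]
# Map normalized [-1, 1] coords to pixel indices (y flipped for image space).
u = np.clip(((mesh.vertices[:, 0] + 1) * 0.5 * (w - 1)).astype(int), 0, w - 1)
v = np.clip(((1 - (mesh.vertices[:, 1] + 1) * 0.5) * (h - 1)).astype(int), 0, h - 1)
mesh.visual.vertex_colors = image[v, u]
mesh.export("result_kimono_projected.obj")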
I'm happy I was able to feature Uesaka-san. I have known about PIFu since last year and wondered when the code would be released, so I was surprised it came out this early. It was also relatively easy to get running, which let me try it right away. Still, I wonder whether I can get the results looking a little better. Sonic Boom, Sonic Boom, Uesaka kawaii.