I wrote code to convert illustration2vec to a Keras model

Overview

https://github.com/rezoo/illustration2vec illustration2vec is a model that estimates tags and extracts feature vectors from illustrations. Its structure is almost the same as VGG. There are some changes from the original VGG; for details, see the paper at the link above. The model is distributed for caffe and chainer. It is an interesting model, but chainer development seems to have ended, so I wanted to keep it usable and wrote code to convert it to Keras. I have never used torch, so I skip it here.

I ran everything on Google Colaboratory; the notebook is here: https://colab.research.google.com/drive/1UZN7pn4UzU5s501dwSIA2IHGmjnAmouY

If you can't open the link, copy the following code into Colab and it should work.

All code
!git clone https://github.com/rezoo/illustration2vec.git
!sh illustration2vec/get_models.sh
!pip install -r /content/illustration2vec/requirements.txt
!mv /content/illustration2vec/i2v /content/
import i2v
illust2vec_tag = i2v.make_i2v_with_chainer('/content/illustration2vec/illust2vec_tag_ver200.caffemodel', '/content/illustration2vec/tag_list.json')
%tensorflow_version 2.x
import tensorflow as tf
import numpy as np

# tag estimator model
model_tag = tf.keras.Sequential(name='illustration2vec_tag')
model_tag.add(tf.keras.layers.Input(shape=(224, 224, 3)))
pool_idx = [0, 1, 3, 5, 7]
for i, chainer_layer in enumerate(illust2vec_tag.net.children()):
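    kernel, bias = tuple(chainer_layer.params())
    # chainer kernel is (out_channel, in_channel, k_h, k_w); keras wants (k_h, k_w, in_channel, out_channel)
    k_kernel = np.transpose(kernel.data, axes=[2, 3, 1, 0])
    bias = bias.data
    if i == 0:
        # first layer: reverse in_channel so the model takes RGB instead of BGR
        k_kernel = k_kernel[:,:,::-1,:]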
    channel = bias.shape[0]
    keras_layer = tf.keras.layers.Conv2D(channel, 3, padding='SAME', activation='relu', kernel_initializer=tf.keras.initializers.constant(k_kernel), bias_initializer=tf.keras.initializers.constant(bias), name='Conv_%d'%i)
    model_tag.add(keras_layer)
    if i in pool_idx:
        model_tag.add(tf.keras.layers.MaxPooling2D())
    del kernel, bias
model_tag.add(tf.keras.layers.AveragePooling2D(pool_size=(7, 7)))
model_tag.add(tf.keras.layers.Lambda(lambda x : tf.nn.sigmoid(tf.squeeze(x, axis=[1, 2])), name='sigmoid'))
model_tag.save('illust2vec_tag_ver200.h5')
del model_tag, illust2vec_tag

# feature vector model
illust2vec = i2v.make_i2v_with_chainer('/content/illustration2vec/illust2vec_ver200.caffemodel')
model = tf.keras.Sequential(name='illustration2vec')
model.add(tf.keras.layers.Input(shape=(224, 224, 3)))
pool_idx = [0, 1, 3, 5, 7]
for i, chainer_layer in enumerate(illust2vec.net.children()):
    if i == 12:
        break
    kernel, bias = tuple(chainer_layer.params())
    if len(kernel.data.shape) == 4:
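        # chainer kernel is (out_channel, in_channel, k_h, k_w); keras wants (k_h, k_w, in_channel, out_channel)
        k_kernel = np.transpose(kernel.data, axes=[2, 3, 1, 0])
        bias = bias.data
        if i == 0:
            # first layer: reverse in_channel so the model takes RGB instead of BGR
            k_kernel = k_kernel[:,:,::-1,:]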
        channel = bias.shape[0]
        keras_layer = tf.keras.layers.Conv2D(channel, 3, padding='SAME', activation='relu', kernel_initializer=tf.keras.initializers.constant(k_kernel), bias_initializer=tf.keras.initializers.constant(bias), name='Conv_%d'%i)
        model.add(keras_layer)
        if i in pool_idx:
            model.add(tf.keras.layers.MaxPooling2D())
        elif i == 10:
            model.add(tf.keras.layers.Flatten())
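    elif len(kernel.data.shape) == 2:
        # chainer Linear weight is (out, in) with the input flattened as (C, H, W);
        # keras Dense expects (in, out) with the input flattened as (H, W, C).
        side = 224 // 2 ** len(pool_idx)  # spatial size after the five poolings (7)
        d_kernel = kernel.data.reshape(-1, channel, side, side)  # channel: output channels of the last conv
        d_kernel = np.transpose(d_kernel, axes=[2, 3, 1, 0]).reshape(side * side * channel, -1)
        model.add(tf.keras.layers.Dense(4096, kernel_initializer=tf.keras.initializers.constant(d_kernel), bias_initializer=tf.keras.initializers.constant(bias.data), name='encode1'))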
    del kernel, bias
model.save('illust2vec_ver200.h5')
del model, illust2vec

# resize each image to 224x224 and subtract the mean pixel value
def resize(imgs):
    mean = tf.constant(np.array([181.13838569, 167.47864617, 164.76139251]).reshape((1, 1, 3)), dtype=tf.float32)
    resized = []
    for img in imgs:
        img = tf.cast(img, tf.float32)
        im_max = tf.reduce_max(img, keepdims=True)
        im_min = tf.reduce_min(img, keepdims=True)
        im_std = (img - im_min) / (im_max - im_min + 1e-10)
        resized_std = tf.image.resize(im_std, (224, 224))
        resized_im = resized_std*(im_max - im_min) + im_min
        resized_im = resized_im - mean
        resized.append(tf.expand_dims(resized_im, 0))
    return tf.concat(resized, 0)
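
One detail of the feature model is worth checking separately: the fully connected layer encode1. A chainer Linear layer stores its weight as (out, in) and sees the convolution output flattened in (C, H, W) order, while a Keras Dense layer stores (in, out) and sees it flattened in (H, W, C) order. The following is only a minimal NumPy sketch with toy sizes (C, H, W and OUT are made up, not the real model's), assuming the fully connected layer directly follows a Flatten of the last convolution's output. It compares a reordered weight with a plain transpose.

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W, OUT = 2, 3, 3, 4  # toy sizes for illustration only

feature_chw = rng.standard_normal((C, H, W))      # one chainer-style feature map
linear_w = rng.standard_normal((OUT, C * H * W))  # chainer Linear weight: (out, in)

# chainer flattens (C, H, W) and computes W @ x.
reference = linear_w @ feature_chw.reshape(-1)

# Keras sees the same features as (H, W, C), flattens them, and computes x @ kernel.
feature_hwc = np.transpose(feature_chw, (1, 2, 0))
dense_kernel = linear_w.reshape(OUT, C, H, W)
dense_kernel = np.transpose(dense_kernel, (2, 3, 1, 0)).reshape(H * W * C, OUT)
converted = feature_hwc.reshape(-1) @ dense_kernel

# A plain transpose has the right shape but pairs weights with the wrong inputs.
naive = feature_hwc.reshape(-1) @ linear_w.T

print('reordered weight matches:', np.allclose(reference, converted))
print('plain transpose matches: ', np.allclose(reference, naive))
```

If the reordered weight matches and the plain transpose does not, the Dense weight needs the same (C, H, W) to (H, W, C) reordering as the input.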

Commentary

All I do is define a model with the same structure in Keras and initialize it with the weights of the chainer model. There are three changes:

- Weight transpose: the convolution kernel of the chainer model has shape (out_channel, in_channel, k_size, k_size), so it is transposed to (k_size, k_size, in_channel, out_channel) for the Keras model.
- Input changed from BGR to RGB: the original model takes BGR images, but the converted model takes RGB. To make this work, the first layer's convolution kernel is reversed along the in_channel axis.
- Input changed to channels_last: chainer convolutions take input of shape (N, C, H, W), whereas the Keras default is (N, H, W, C).
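
The three changes can be checked on a toy example without chainer or Keras. The following is a minimal NumPy sketch, not the conversion code itself: the kernel and the patch are random stand-ins, and patch_response is a hypothetical helper that computes what a convolution produces at a single position. Note that the permutation (2, 3, 1, 0) is the one that keeps the kernel's height and width axes in place, which matters whenever a kernel is not symmetric.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a chainer convolution weight: (out_channel, in_channel, k_h, k_w).
chainer_kernel = rng.standard_normal((4, 3, 3, 3))

# Keras Conv2D stores kernels as (k_h, k_w, in_channel, out_channel).
keras_kernel = np.transpose(chainer_kernel, axes=(2, 3, 1, 0))

# Reverse the in_channel axis so the first layer accepts RGB instead of BGR.
keras_kernel_rgb = keras_kernel[:, :, ::-1, :]


def patch_response(patch_hwc, kernel_hwio):
    """Response of every output channel to one k_h x k_w patch (channels last)."""
    return np.einsum('hwc,hwco->o', patch_hwc, kernel_hwio)


bgr_patch = rng.standard_normal((3, 3, 3))  # (H, W, C) with channels in BGR order
rgb_patch = bgr_patch[:, :, ::-1]           # same pixels with channels in RGB order

# Reference: the original kernel applied to the patch in chainer's (C, H, W) layout.
reference = np.einsum('chw,ochw->o', np.transpose(bgr_patch, (2, 0, 1)), chainer_kernel)

# Converted: RGB, channels-last patch with the transposed and flipped kernel.
converted = patch_response(rgb_patch, keras_kernel_rgb)

print(np.allclose(reference, converted))
```

The comparison holds because flipping both the image channels and the kernel's in_channel axis pairs each weight with the same pixel value as before.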

The input of the converted model has shape (batch, 224, 224, 3). If you pass a list of images as numpy.array to the resize function, either in the Colab notebook linked above or in the code above, it resizes them to 224x224 and subtracts the mean. There are two models, illust2vec_tag_ver200.h5 and illust2vec_ver200.h5. The first outputs the probability of each of the 1539 tags in tag_list.json at https://github.com/rezoo/illustration2vec. The second outputs a feature vector of the image.
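
To read the output of the tag model, each probability has to be paired with its tag name. The following is a minimal sketch, assuming tag_list.json is a flat JSON list of tag strings in the same order as the model's output indices. top_tags is a hypothetical helper, and the four tag names and probabilities are toy data standing in for the real 1539 entries.

```python
import json


def top_tags(probabilities, tag_names, k=5):
    """Return the k most probable (tag, probability) pairs for one image."""
    if len(probabilities) != len(tag_names):
        raise ValueError('expected one probability per tag')
    order = sorted(range(len(tag_names)),
                   key=lambda i: float(probabilities[i]),
                   reverse=True)
    return [(tag_names[i], float(probabilities[i])) for i in order[:k]]


# Toy data standing in for tag_list.json and one row of the tag model's output.
tag_names = json.loads('["1girl", "solo", "smile", "hat"]')
probabilities = [0.91, 0.40, 0.75, 0.05]

print(top_tags(probabilities, tag_names, k=2))
# prints [('1girl', 0.91), ('smile', 0.75)]
```

With the real model, probabilities would be one row of the tag model's prediction for a batch returned by resize, and tag_names would be loaded from tag_list.json with json.load.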

Please point out any mistakes you find.
