Those who study deep learning may be familiar with the paper [A Neural Algorithm of Artistic Style], which transfers the style of an image. A Japanese explanation is linked below; if you are wondering what style transfer is, I recommend reading it first: Preferred Research, "Algorithm for converting style".
When you perform style transfer with this method, the colors of the converted image come out similar to the style image. Recently, however, a paper that performs style transfer while preserving the colors, [Preserving Color in Neural Artistic Style Transfer](http://arxiv.org/pdf/1606.05897v1.pdf), was published. It is by the same authors as A Neural Algorithm of Artistic Style.
Prepare two images: a content image (for example, a photo of a cat) and a style image (for example, a painting by Van Gogh). Generating from these two images a new image whose content matches the content image and whose style resembles the style image (in this case, a cat image that looks as if Van Gogh had painted it) is called style transfer.

The paper proposes two methods.
The first method transforms the style image so that the mean and covariance of its RGB values match those of the content image. First, let us define the notation.
Symbol | Meaning | Dimension |
---|---|---|
$\bf{x}_{C}$ | Content image | 3 × (number of pixels in the content image); the rows are the R, G, and B values |
$\bf{x}_{S}$ | Style image | 3 × (number of pixels in the style image) |
$\bf{\mu}_{C}$ | Mean RGB value of the content image | 3-vector |
$\bf{\mu}_{S}$ | Mean RGB value of the style image | 3-vector |
$\bf{\Sigma}_{C}$ | RGB covariance matrix of the content image | 3 × 3 |
$\bf{\Sigma}_{S}$ | RGB covariance matrix of the style image | 3 × 3 |
Consider a transformation of the style image

\bf{x}_{S^{\prime}} \leftarrow \bf{A}\bf{x}_{S}+\bf{b}

To bring the color histogram of the transformed style image closer to the color histogram of the content image, we want $\bf{A}$ and $\bf{b}$ that satisfy

\bf{\mu}_{S^{\prime}} = \bf{\mu}_{C}

\bf{\Sigma}_{S^{\prime}} = \bf{\Sigma}_{C}

Since the transformation gives $\bf{\mu}_{S^{\prime}} = \bf{A}\bf{\mu}_{S}+\bf{b}$ and $\bf{\Sigma}_{S^{\prime}} = \bf{A}\bf{\Sigma}_{S}\bf{A}^{T}$, these conditions hold when

\bf{b} = \bf{\mu}_{C}-\bf{A}\bf{\mu}_{S}

\bf{A}\bf{\Sigma}_{S}\bf{A}^{T} = \bf{\Sigma}_{C}

The paper lists two choices of $\bf{A}$ that satisfy the second equation. The first uses the Cholesky decomposition:
\bf{A}_{\rm chol} = \bf{L}_{C}\bf{L}_{S}^{-1}
where $\bf{L}$ denotes the Cholesky factor of $\bf{\Sigma}$, that is, $\bf{\Sigma} = \bf{L}\bf{L}^{T}$.
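As a quick check, this choice indeed satisfies the condition above:

\bf{A}_{\rm chol}\bf{\Sigma}_{S}\bf{A}_{\rm chol}^{T} = \bf{L}_{C}\bf{L}_{S}^{-1}\left(\bf{L}_{S}\bf{L}_{S}^{T}\right)\bf{L}_{S}^{-T}\bf{L}_{C}^{T} = \bf{L}_{C}\bf{L}_{C}^{T} = \bf{\Sigma}_{C}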
The second choice uses the eigenvalue decomposition $\bf{\Sigma} = \bf{U}\bf{\Lambda}\bf{U}^{T}$, which gives $\bf{\Sigma}^{1/2} = \bf{U}\bf{\Lambda}^{1/2}\bf{U}^{T}$ and $\bf{\Sigma}^{-1/2} = \bf{U}\bf{\Lambda}^{-1/2}\bf{U}^{T}$. The transformation
\bf{A}_{\rm IA} = \bf{\Sigma_{C}^{1/2}}\bf{\Sigma_{S}^{-1/2}}
is then used.
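This also satisfies the condition, since $\bf{\Sigma}^{1/2}$ and $\bf{\Sigma}^{-1/2}$ are symmetric:

\bf{A}_{\rm IA}\bf{\Sigma}_{S}\bf{A}_{\rm IA}^{T} = \bf{\Sigma}_{C}^{1/2}\bf{\Sigma}_{S}^{-1/2}\bf{\Sigma}_{S}\bf{\Sigma}_{S}^{-1/2}\bf{\Sigma}_{C}^{1/2} = \bf{\Sigma}_{C}^{1/2}\bf{\Sigma}_{C}^{1/2} = \bf{\Sigma}_{C}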
The obtained $\bf{A}$ and $\bf{b}$ are used to transform the style image, after which normal style transfer is performed with the content image and the transformed style image. The paper reports that the eigenvalue decomposition gave better results than the Cholesky decomposition.
I transformed a style image with the eigenvalue decomposition variant. You can see that the color scheme becomes similar to the content image.
Content image | Style image | Style image after color histogram matching |
---|---|---|
The second method converts the color space of the content and style images to YIQ and uses only the luminance channel of each. The luminance of the style image is transformed as follows so that its mean and variance match those of the content image. Here $L_{C}$ is the luminance of the content image, $L_{S}$ is the luminance of the style image, $\mu_{C}$ and $\mu_{S}$ are the mean luminances of the content image and the style image respectively, and $\sigma_{C}$ and $\sigma_{S}$ are their standard deviations.
L_{S^{\prime}} = \frac{\sigma_{C}}{\sigma_{S}}(L_{S}-\mu_{S})+\mu_{C}
Style transfer is performed with these luminance-only images, and the final image is obtained by combining the result with the I and Q channels of the content image. The conversion between RGB and YIQ can be found on Wikipedia.
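As a minimal sketch of this luminance matching step (the helper name match_luminance and its standalone form are my own, not from the paper):

```python
import numpy as np

# L_S' = sigma_C / sigma_S * (L_S - mu_S) + mu_C
# l_content and l_style are numpy arrays holding the Y (luminance) channel of each image
def match_luminance(l_content, l_style):
    mu_c, mu_s = l_content.mean(), l_style.mean()
    sigma_c, sigma_s = l_content.std(), l_style.std()
    return sigma_c / sigma_s * (l_style - mu_s) + mu_c
```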
Color histogram matching
A Python implementation of color histogram matching using eigenvalue decomposition is shown below. The eigenvalue decomposition is computed with numpy.linalg.eig.
```python
import numpy as np
import six

# x is the style image
# y is the content image
# x and y are numpy arrays whose shape is (batch_size, 3, height, width)
def match_color_histogram(x, y):
    z = np.zeros_like(x)
    shape = x[0].shape
    for i in six.moves.range(len(x)):
        a = x[i].reshape((3, -1))
        a_mean = np.mean(a, axis=1, keepdims=True)
        a_var = np.cov(a)
        d, v = np.linalg.eig(a_var)
        # sigma_S^(-1/2) = U diag(lambda^(-1/2)) U^T
        a_sigma_inv = v.dot(np.diag(d ** (-0.5))).dot(v.T)
        b = y[i].reshape((3, -1))
        b_mean = np.mean(b, axis=1, keepdims=True)
        b_var = np.cov(b)
        d, v = np.linalg.eig(b_var)
        # sigma_C^(1/2) = U diag(lambda^(1/2)) U^T
        b_sigma = v.dot(np.diag(d ** 0.5)).dot(v.T)
        # A = sigma_C^(1/2) sigma_S^(-1/2); z = A (x - mu_S) + mu_C
        transform = b_sigma.dot(a_sigma_inv)
        z[i,:] = (transform.dot(a - a_mean) + b_mean).reshape(shape)
    return z
```
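As a quick sanity check of this function (random arrays stand in for real images here), the RGB covariance of the matched style image should agree with that of the content image:

```python
# random data standing in for a style image and a content image
style = np.random.uniform(0, 255, (1, 3, 64, 64)).astype(np.float32)
content = np.random.uniform(0, 255, (1, 3, 64, 64)).astype(np.float32)
matched = match_color_histogram(style, content)
# the covariance of the matched image should now equal that of the content image
print(np.allclose(np.cov(matched[0].reshape((3, -1))),
                  np.cov(content[0].reshape((3, -1))), rtol=1e-3))  # expect True
```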
The conversion between BGR and YIQ is implemented as follows. BGR is used instead of RGB because the VGG model used for style transfer takes BGR input.
```python
# x is a numpy array representing an image whose shape is (batch_size, 3, height, width)
def bgr_to_yiq(x):
    # the standard RGB-to-YIQ matrix with its columns reordered for BGR channel order
    transform = np.asarray([[0.114, 0.587, 0.299], [-0.322, -0.274, 0.596], [0.312, -0.523, 0.211]], dtype=np.float32)
    n, c, h, w = x.shape
    x = x.transpose((1, 0, 2, 3)).reshape((c, -1))
    x = transform.dot(x)
    return x.reshape((c, n, h, w)).transpose((1, 0, 2, 3))

def yiq_to_bgr(x):
    # inverse transform: the rows give B, G, R as functions of Y, I, Q
    transform = np.asarray([[1, -1.106, 1.703], [1, -0.272, -0.647], [1, 0.956, 0.621]], dtype=np.float32)
    n, c, h, w = x.shape
    x = x.transpose((1, 0, 2, 3)).reshape((c, -1))
    x = transform.dot(x)
    return x.reshape((c, n, h, w)).transpose((1, 0, 2, 3))
```
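To show how these helpers fit into luminance-only transfer, here is a rough sketch; random data stands in for a real image, the actual style transfer step is elided with a placeholder, and forming the brightness-only image by zeroing the I and Q channels is my own assumption rather than a detail from the paper:

```python
# random data standing in for a content image in BGR
content = np.random.uniform(0, 255, (1, 3, 64, 64)).astype(np.float32)
content_yiq = bgr_to_yiq(content)
# one way to form a brightness-only image: keep Y and zero out I and Q
content_y = np.concatenate(
    (content_yiq[:, 0:1], np.zeros_like(content_yiq[:, 1:])), axis=1)
# ... style transfer on the luminance-only images would produce stylized_y ...
stylized_y = content_y  # placeholder for the actual style transfer output
# recombine the stylized Y channel with the I and Q channels of the content image
result = yiq_to_bgr(
    np.concatenate((stylized_y[:, 0:1], content_yiq[:, 1:]), axis=1))
```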
I actually tried these methods. The source code is available at https://github.com/dsanno/chainer-neural-style
You can run it as follows:

```
$ python src/run.py -g 0 -c content.jpg -s style.jpg -w 384 -o out_dir_01 --iter 2000 --lr 10 --match_color_histogram --initial_image content
$ python src/run.py -g 0 -c content.jpg -s style.jpg -w 384 -o out_dir_02 --iter 2000 --lr 10 --luminance_only --initial_image content
```
The results are shown below. With color histogram matching, the color histogram does become similar to the content image's, but the wall comes out skin-colored: the right colors are not painted in the right places. The results in the paper have a more natural color scheme, so something may be wrong with my implementation. With luminance-only transfer, the output image is close to the style image.
Content image | Style image | Color histogram matching | Luminance-only transfer |
---|---|---|---|
Same as above | | | |
Same as above | | | |
Same as above | | | |
When optimizing the style transfer, it seems best to use the content image itself as the initial image. If the initial image is random, the light and dark areas deviate from the content image, as shown below. (The style image is Van Gogh's The Starry Night, and the method is luminance-only transfer.)
I implemented image style transfer that preserves color.

This method simply combines existing style transfer with a color transformation of the content and style images, so it is an easy method to apply. Small modifications like this can produce quite different results, so I felt it is important not only to reproduce new methods but also to think about what improvements you can devise yourself.