Those who study deep learning may be familiar with the paper [A Neural Algorithm of Artistic Style], which transfers the style of an image. A Japanese explanation is linked below; if you are wondering what style transfer is, I recommend reading it first: Preferred Research, "Algorithm for converting style".
When you perform style transfer with this method, the colors of the converted image come out similar to the style image. Recently, however, a paper that performs style transfer while preserving the colors, [Preserving Color in Neural Artistic Style Transfer](http://arxiv.org/pdf/1606.05897v1.pdf), was published. It is by the same authors as A Neural Algorithm of Artistic Style.
Prepare two images: a content image (for example, a photo of a cat) and a style image (for example, a painting by Van Gogh). Generating from these two images a new image whose content matches the content image and whose style resembles the style image (in this case, a cat image that looks as if Van Gogh had painted it) is called style transfer.

The paper proposes two methods.
The first method transforms the style image so that the mean and covariance of its RGB values match those of the content image. First, let us define the notation.
Symbol | Meaning | Dimension |
---|---|---|
$\bf{x}_{C}$ | Content image | 3 × (number of pixels in the content image); the rows are the R, G, and B values |
$\bf{x}_{S}$ | Style image | 3 × (number of pixels in the style image) |
$\bf{\mu}_{C}$ | Mean RGB value of the content image | 3-vector |
$\bf{\mu}_{S}$ | Mean RGB value of the style image | 3-vector |
$\bf{\Sigma}_{C}$ | RGB covariance matrix of the content image | 3 × 3 |
$\bf{\Sigma}_{S}$ | RGB covariance matrix of the style image | 3 × 3 |
Consider a transformation of the style image

\bf{x}_{S^{\prime}} \leftarrow \bf{A}\bf{x}_{S}+\bf{b}

To bring the color histogram of the transformed style image closer to the color histogram of the content image, we want $\bf{A}$ and $\bf{b}$ that satisfy

\bf{\mu}_{S^{\prime}} = \bf{\mu}_{C}

\bf{\Sigma}_{S^{\prime}} = \bf{\Sigma}_{C}

Since the transformation gives $\bf{\mu}_{S^{\prime}} = \bf{A}\bf{\mu}_{S}+\bf{b}$ and $\bf{\Sigma}_{S^{\prime}} = \bf{A}\bf{\Sigma}_{S}\bf{A}^{T}$, these conditions hold when

\bf{b} = \bf{\mu}_{C}-\bf{A}\bf{\mu}_{S}

\bf{A}\bf{\Sigma}_{S}\bf{A}^{T} = \bf{\Sigma}_{C}

The paper lists two choices of $\bf{A}$ that satisfy the second equation. The first uses the Cholesky decomposition:
\bf{A}_{\rm chol} = \bf{L}_{C}\bf{L}_{S}^{-1}
where $\bf{L}$ denotes the Cholesky factor of $\bf{\Sigma}$, that is, $\bf{\Sigma} = \bf{L}\bf{L}^{T}$.
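As a quick check, this choice indeed satisfies the condition above:

\bf{A}_{\rm chol}\bf{\Sigma}_{S}\bf{A}_{\rm chol}^{T} = \bf{L}_{C}\bf{L}_{S}^{-1}\left(\bf{L}_{S}\bf{L}_{S}^{T}\right)\bf{L}_{S}^{-T}\bf{L}_{C}^{T} = \bf{L}_{C}\bf{L}_{C}^{T} = \bf{\Sigma}_{C}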
The second choice uses the eigenvalue decomposition $\bf{\Sigma} = \bf{U}\bf{\Lambda}\bf{U}^{T}$, which gives $\bf{\Sigma}^{1/2} = \bf{U}\bf{\Lambda}^{1/2}\bf{U}^{T}$ and $\bf{\Sigma}^{-1/2} = \bf{U}\bf{\Lambda}^{-1/2}\bf{U}^{T}$. The transformation
\bf{A}_{\rm IA} = \bf{\Sigma_{C}^{1/2}}\bf{\Sigma_{S}^{-1/2}}
is then used.
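This also satisfies the condition, since $\bf{\Sigma}^{1/2}$ and $\bf{\Sigma}^{-1/2}$ are symmetric:

\bf{A}_{\rm IA}\bf{\Sigma}_{S}\bf{A}_{\rm IA}^{T} = \bf{\Sigma}_{C}^{1/2}\bf{\Sigma}_{S}^{-1/2}\bf{\Sigma}_{S}\bf{\Sigma}_{S}^{-1/2}\bf{\Sigma}_{C}^{1/2} = \bf{\Sigma}_{C}^{1/2}\bf{\Sigma}_{C}^{1/2} = \bf{\Sigma}_{C}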
The obtained $\bf{A}$ and $\bf{b}$ are used to transform the style image, after which normal style transfer is performed with the content image and the transformed style image. The paper reports that the eigenvalue decomposition gave better results than the Cholesky decomposition.
I transformed a style image with the eigenvalue decomposition variant. You can see that the color scheme becomes similar to the content image.
Content image | Style image | Style image after color histogram matching |
---|---|---|
The second method converts the color space of the content and style images to YIQ and uses only the luminance channel of each. The luminance of the style image is transformed as follows so that its mean and variance match those of the content image. Here $L_{C}$ is the luminance of the content image, $L_{S}$ is the luminance of the style image, $\mu_{C}$ and $\mu_{S}$ are the mean luminances of the content image and the style image respectively, and $\sigma_{C}$ and $\sigma_{S}$ are their standard deviations.
L_{S^{\prime}} = \frac{\sigma_{C}}{\sigma_{S}}(L_{S}-\mu_{S})+\mu_{C}
Style transfer is performed with these luminance-only images, and the final image is obtained by combining the result with the I and Q channels of the content image. The conversion between RGB and YIQ can be found on Wikipedia.
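As a minimal sketch of this luminance matching step (the helper name match_luminance and its standalone form are my own, not from the paper):

```python
import numpy as np

# L_S' = sigma_C / sigma_S * (L_S - mu_S) + mu_C
# l_content and l_style are numpy arrays holding the Y (luminance) channel of each image
def match_luminance(l_content, l_style):
    mu_c, mu_s = l_content.mean(), l_style.mean()
    sigma_c, sigma_s = l_content.std(), l_style.std()
    return sigma_c / sigma_s * (l_style - mu_s) + mu_c
```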
Color histogram matching
A Python implementation of color histogram matching using eigenvalue decomposition is shown below. The eigenvalue decomposition is computed with numpy.linalg.eig.
```python
import numpy as np
import six

# x is the style image
# y is the content image
# x and y are numpy arrays whose shape is (batch_size, 3, height, width)
def match_color_histogram(x, y):
    z = np.zeros_like(x)
    shape = x[0].shape
    for i in six.moves.range(len(x)):
        a = x[i].reshape((3, -1))
        a_mean = np.mean(a, axis=1, keepdims=True)
        a_var = np.cov(a)
        d, v = np.linalg.eig(a_var)
        # sigma_S^(-1/2) = U diag(lambda^(-1/2)) U^T
        a_sigma_inv = v.dot(np.diag(d ** (-0.5))).dot(v.T)
        b = y[i].reshape((3, -1))
        b_mean = np.mean(b, axis=1, keepdims=True)
        b_var = np.cov(b)
        d, v = np.linalg.eig(b_var)
        # sigma_C^(1/2) = U diag(lambda^(1/2)) U^T
        b_sigma = v.dot(np.diag(d ** 0.5)).dot(v.T)
        # A = sigma_C^(1/2) sigma_S^(-1/2); z = A (x - mu_S) + mu_C
        transform = b_sigma.dot(a_sigma_inv)
        z[i,:] = (transform.dot(a - a_mean) + b_mean).reshape(shape)
    return z
```
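As a quick sanity check of this function (random arrays stand in for real images here), the RGB covariance of the matched style image should agree with that of the content image:

```python
# random data standing in for a style image and a content image
style = np.random.uniform(0, 255, (1, 3, 64, 64)).astype(np.float32)
content = np.random.uniform(0, 255, (1, 3, 64, 64)).astype(np.float32)
matched = match_color_histogram(style, content)
# the covariance of the matched image should now equal that of the content image
print(np.allclose(np.cov(matched[0].reshape((3, -1))),
                  np.cov(content[0].reshape((3, -1))), rtol=1e-3))  # expect True
```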
The conversion between BGR and YIQ is implemented as follows. BGR is used instead of RGB because the VGG model used for style transfer takes BGR input.
```python
# x is a numpy array representing an image whose shape is (batch_size, 3, height, width)
def bgr_to_yiq(x):
    # the standard RGB-to-YIQ matrix with its columns reordered for BGR channel order
    transform = np.asarray([[0.114, 0.587, 0.299], [-0.322, -0.274, 0.596], [0.312, -0.523, 0.211]], dtype=np.float32)
    n, c, h, w = x.shape
    x = x.transpose((1, 0, 2, 3)).reshape((c, -1))
    x = transform.dot(x)
    return x.reshape((c, n, h, w)).transpose((1, 0, 2, 3))

def yiq_to_bgr(x):
    # inverse transform: the rows give B, G, R as functions of Y, I, Q
    transform = np.asarray([[1, -1.106, 1.703], [1, -0.272, -0.647], [1, 0.956, 0.621]], dtype=np.float32)
    n, c, h, w = x.shape
    x = x.transpose((1, 0, 2, 3)).reshape((c, -1))
    x = transform.dot(x)
    return x.reshape((c, n, h, w)).transpose((1, 0, 2, 3))
```
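To show how these helpers fit into luminance-only transfer, here is a rough sketch; random data stands in for a real image, the actual style transfer step is elided with a placeholder, and forming the brightness-only image by zeroing the I and Q channels is my own assumption rather than a detail from the paper:

```python
# random data standing in for a content image in BGR
content = np.random.uniform(0, 255, (1, 3, 64, 64)).astype(np.float32)
content_yiq = bgr_to_yiq(content)
# one way to form a brightness-only image: keep Y and zero out I and Q
content_y = np.concatenate(
    (content_yiq[:, 0:1], np.zeros_like(content_yiq[:, 1:])), axis=1)
# ... style transfer on the luminance-only images would produce stylized_y ...
stylized_y = content_y  # placeholder for the actual style transfer output
# recombine the stylized Y channel with the I and Q channels of the content image
result = yiq_to_bgr(
    np.concatenate((stylized_y[:, 0:1], content_yiq[:, 1:]), axis=1))
```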
I actually tried these methods. The source code is available at https://github.com/dsanno/chainer-neural-style
You can run it as follows:

```
$ python src/run.py -g 0 -c content.jpg -s style.jpg -w 384 -o out_dir_01 --iter 2000 --lr 10 --match_color_histogram --initial_image content
$ python src/run.py -g 0 -c content.jpg -s style.jpg -w 384 -o out_dir_02 --iter 2000 --lr 10 --luminance_only --initial_image content
```
The results are shown below. With color histogram matching, the color histogram does become similar to the content image's, but the wall comes out skin-colored: the right colors are not painted in the right places. The results in the paper have a more natural color scheme, so something may be wrong with my implementation. With luminance-only transfer, the output image is close to the style image.
Content image | Style image | Color histogram matching | Luminance-only transfer |
---|---|---|---|
Same as above | | | |
Same as above | | | |
Same as above | | | |
When optimizing the style transfer, it seems best to use the content image itself as the initial image. If the initial image is random, the light and dark areas deviate from the content image, as shown below. (The style image is Van Gogh's The Starry Night, and the method is luminance-only transfer.)
I implemented image style transfer that preserves color.

This method simply combines existing style transfer with a color transformation of the content and style images, so it is an easy method to apply. Small modifications like this can produce quite different results, so I felt it is important not only to reproduce new methods but also to think about what improvements you can devise yourself.