In image recognition, it is important to visualize what the classifier focused on when it made its decision. For CNNs, a method has been proposed that backpropagates the gradient of the loss function to the input image and visualizes the magnitude of its absolute values, but the resulting maps are noisy. SmoothGrad is a very simple way to get a clean visualization: just add Gaussian noise to the input image several times and average the resulting gradients.
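How the averaging works, in the paper's notation: if S_c is the class score and M_c(x) = ∂S_c(x)/∂x is the ordinary sensitivity map, SmoothGrad replaces it with an average over n noisy copies of the input:

\hat{M}_c(x) = \frac{1}{n} \sum_{i=1}^{n} M_c\left(x + \mathcal{N}(0, \sigma^2)\right)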
[Figures: how it is averaged; Smooth Grad vs. Vanilla Grad (conventional method)]
The paper and the reference TensorFlow code are available at https://tensorflow.github.io/saliency/
I implemented SmoothGrad myself as an exercise in Chainer v2, using a pretrained VGG16 model. The environment is Windows 7 64-bit with Python 3.5.2 (Anaconda 4.2.0, 64-bit) and Chainer 2.0.0. The code does not use the GPU.
import
smoothgrad.py
import chainer
import chainer.functions as F
from chainer.variable import Variable
from chainer.links import VGG16Layers
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
config
We run in test mode, but we still need backpropagation, so set chainer.config as follows.
smoothgrad.py
chainer.config.train = False
chainer.config.enable_backprop = True
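Alternatively (a sketch of my own, not from the original code), Chainer v2 offers a context manager that scopes the same settings to a block and reverts them automatically on exit:

# Scoped form of the same configuration; settings revert when the block exits.
with chainer.using_config('train', False), chainer.using_config('enable_backprop', True):
    pass  # the forward/backward passes would go here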
Load the VGG16 model. The weights are about 500 MB, so the first run will take some time if they have not been downloaded in advance.
smoothgrad.py
model = VGG16Layers()
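If I remember the Chainer v2 API correctly, the model exposes an available_layers property (treat this as an assumption), which is a quick way to confirm that the 'prob' layer requested later exists:

# Assumption: VGG16Layers exposes available_layers; it should list 'prob'.
print(model.available_layers)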
The input size of VGG16 is 224x224, so resize the image.
smoothgrad.py
image = Image.open("cheetah.png")
image = image.resize((224,224))
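One caveat not covered in the original code: if the PNG carries an alpha channel, np.asarray() later returns a 4-channel array and the BGR flip misbehaves. A defensive one-liner (my addition):

# Drop a possible alpha channel so the array below is strictly (H, W, 3).
image = image.convert("RGB")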
Set the number of samples and the noise level. Here, the number of samples is 100 and the noise level is 20%.
smoothgrad.py
sampleSize = 100
noiseLevel = 0.2 # 20%
sigma = noiseLevel*255.0
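The paper defines the noise level as σ/(x_max − x_min), i.e., relative to the dynamic range of the input. The code above assumes an 8-bit range of 0-255; here is a sketch that derives σ from the actual image instead:

# Sketch: compute sigma from the actual pixel range rather than assuming 0-255.
px = np.asarray(image, dtype=np.float32)
sigma = noiseLevel * (px.max() - px.min())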
To limit memory use, the samples are processed one at a time (a batched sketch follows the loop below). First, since VGG16 expects BGR channel order, convert the image and subtract the mean pixel values. Then, in each iteration, add Gaussian noise to the image, run forward propagation, compute the loss, and backpropagate to get the gradient with respect to the input. Append each gradient to a list.
smoothgrad.py
gradList = []
for _ in range(sampleSize):
    x = np.asarray(image, dtype=np.float32)
    # RGB to BGR
    x = x[:, :, ::-1]
    # Subtract the mean pixel values
    x -= np.array([103.939, 116.779, 123.68], dtype=np.float32)
    x = x.transpose((2, 0, 1))
    x = x[np.newaxis]
    # Add Gaussian noise
    x += sigma * np.random.randn(*x.shape).astype(np.float32)
    x = Variable(x)
    # Forward propagation; take the output of the final layer
    # (note: 'prob' is already softmax output, so softmax_cross_entropy below
    # applies softmax twice; the logits layer 'fc8' could be used instead)
    y = model(x, layers=['prob'])['prob']
    # Backpropagate from the highest-scoring label
    t = np.zeros((x.data.shape[0]), dtype=np.int32)
    t[:] = np.argmax(y.data)
    t = Variable(t)
    loss = F.softmax_cross_entropy(y, t)
    loss.backward()
    # Append the input gradient to the list
    gradList.append(np.copy(x.grad))
    # Clear the model gradients for the next iteration
    model.cleargrads()
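If memory permits, the per-sample loop can also be batched. Here is a minimal sketch (batchSize and the factored-out preprocessing are my assumptions, not from the original post). softmax_cross_entropy averages the loss over the batch, which only rescales the map by a constant:

# Batched variant (sketch): preprocess once, then replicate with fresh noise.
base = np.asarray(image, dtype=np.float32)[:, :, ::-1]          # RGB -> BGR
base -= np.array([103.939, 116.779, 123.68], dtype=np.float32)  # mean subtraction
base = base.transpose((2, 0, 1))[np.newaxis]                    # (1, 3, 224, 224)

batchSize = 10  # hypothetical; memory use grows linearly with this
grads = []
for _ in range(sampleSize // batchSize):
    noise = sigma * np.random.randn(batchSize, *base.shape[1:]).astype(np.float32)
    xb = Variable(np.repeat(base, batchSize, axis=0) + noise)
    y = model(xb, layers=['prob'])['prob']
    t = Variable(np.argmax(y.data, axis=1).astype(np.int32))
    # Loss is averaged over the batch, so each gradient is scaled by
    # 1/batchSize; this uniform factor does not change the visualization.
    F.softmax_cross_entropy(y, t).backward()
    grads.append(np.copy(xb.grad))
    model.cleargrads()
G = np.concatenate(grads, axis=0)               # (sampleSize, 3, 224, 224)
M = np.mean(np.max(np.abs(G), axis=1), axis=0)  # channel max, sample mean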
For each pixel, take the maximum absolute gradient value over the three channels, then average over all samples.
smoothgrad.py
G = np.array(gradList)                          # (sampleSize, 1, 3, 224, 224)
M = np.mean(np.max(np.abs(G), axis=2), axis=0)  # channel max, then sample mean
M = np.squeeze(M)                               # (224, 224)
plt.imshow(M, "gray")
plt.show()
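As a display refinement (my addition; if I recall correctly, the reference TensorFlow code clips at a high percentile in a similar way), clipping keeps a few extreme pixels from washing out the gray scale:

# Clip the top 1% of values before display (the 99th percentile is an assumption).
v = np.percentile(M, 99)
plt.imshow(np.clip(M, 0, v), "gray")
plt.show()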
I visualized the original image and the saliency map side by side while increasing the number of averaged samples.
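A minimal sketch of how such side-by-side panels can be drawn (plain matplotlib, not from the original post):

# Show the resized input next to the averaged saliency map.
fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.imshow(image)
ax1.set_title("input")
ax1.axis("off")
ax2.imshow(M, "gray")
ax2.set_title("SmoothGrad, 100 samples")
ax2.axis("off")
plt.show()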
[Figures: saliency maps averaged over 1, 2, 3, 10, 20, 30, 50, 75, and 100 samples]
For reference, visualizing a single sample without adding any noise (plain Vanilla Grad) also produces a very noisy map.
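That baseline corresponds to rerunning the loop above with the noise switched off:

# Vanilla Grad baseline: one sample, no noise.
sampleSize = 1
sigma = 0.0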