Before getting into the detailed description of the code, here is what it can do.
![Screenshot from 2016-11-23 15_44_07.png](https://qiita-image-store.s3.amazonaws.com/0/134368/b426cc2e-e963-0897-f2d0-f83a5ec7a3d0.png)
On the right is a handwritten 9 captured by the camera; on the left is the estimate from the trained CNN, which sees the 9 and correctly returns 9. In this post I describe the code that lets a CNN like this run on ROS. For details on how to run it and on the preparation, please refer to the previous Qiita article.
To review the outline of the process: the node receives an Image message from the camera, preprocesses it with OpenCV (grayscale, binarize and invert, resize to 28x28), feeds it to the trained CNN, and publishes the estimated digit as an Int16 message.
The whole code looks like this:
tensorflow_in_ros_mnist.py
import rospy
from sensor_msgs.msg import Image
from std_msgs.msg import Int16
from cv_bridge import CvBridge
import cv2
import numpy as np
import tensorflow as tf


def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)


def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)


def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1],
                        padding='SAME')


def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')


def makeCNN(x,keep_prob):
    # --- define CNN model
    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])
    h_conv1 = tf.nn.relu(conv2d(x, W_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)

    W_conv2 = weight_variable([3, 3, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)

    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])
    y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

    return y_conv


class RosTensorFlow():
    def __init__(self):
        self._cv_bridge = CvBridge()

        self._sub = rospy.Subscriber('image', Image, self.callback, queue_size=1)
        self._pub = rospy.Publisher('result', Int16, queue_size=1)

        self.x = tf.placeholder(tf.float32, [None,28,28,1], name="x")
        self.keep_prob = tf.placeholder("float")
        self.y_conv = makeCNN(self.x,self.keep_prob)

        self._saver = tf.train.Saver()
        self._session = tf.InteractiveSession()

        init_op = tf.initialize_all_variables()
        self._session.run(init_op)

        self._saver.restore(self._session, "model.ckpt")

    def callback(self, image_msg):
        cv_image = self._cv_bridge.imgmsg_to_cv2(image_msg, "bgr8")
        cv_image_gray = cv2.cvtColor(cv_image, cv2.COLOR_RGB2GRAY)
        ret,cv_image_binary = cv2.threshold(cv_image_gray,128,255,cv2.THRESH_BINARY_INV)
        cv_image_28 = cv2.resize(cv_image_binary,(28,28))
        np_image = np.reshape(cv_image_28,(1,28,28,1))
        predict_num = self._session.run(self.y_conv, feed_dict={self.x:np_image,self.keep_prob:1.0})
        answer = np.argmax(predict_num,1)
        rospy.loginfo('%d' % answer)
        self._pub.publish(answer)

    def main(self):
        rospy.spin()


if __name__ == '__main__':
    rospy.init_node('rostensorflow')
    tensor = RosTensorFlow()
    tensor.main()
**import part**
tensorflow_in_ros_mnist.py
import rospy
from sensor_msgs.msg import Image
from std_msgs.msg import Int16
from cv_bridge import CvBridge
import cv2
import numpy as np
import tensorflow as tf
Since I want to set up a ROS node in Python, I import rospy. I also import Image for reading images, Int16 for publishing the result, cv_bridge for converting ROS image messages to OpenCV images, and then OpenCV, NumPy, and TensorFlow for the machine learning part.
**CNN definition part**
tensorflow_in_ros_mnist.py
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)


def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)


def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1],
                        padding='SAME')


def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')


def makeCNN(x,keep_prob):
    # --- define CNN model
    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])
    h_conv1 = tf.nn.relu(conv2d(x, W_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)

    W_conv2 = weight_variable([3, 3, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)

    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])
    y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

    return y_conv
This function defines a CNN with the same configuration as the TensorFlow tutorial (Deep MNIST for Experts). The model performs convolution and pooling twice, then inserts a fully connected layer, and finally computes the probability for each digit.
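As a quick sanity check of this definition, the following minimal sketch (not from the original article, and assuming the same pre-1.0 TensorFlow API used above) runs a blank 28x28 image through makeCNN and confirms that the output has one probability per digit:

```python
# Minimal shape check for makeCNN (illustrative sketch, same old TF API as above).
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 28, 28, 1])
keep_prob = tf.placeholder("float")
y_conv = makeCNN(x, keep_prob)          # two conv/pool stages: 28x28 -> 14x14 -> 7x7

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    dummy = np.zeros((1, 28, 28, 1), dtype=np.float32)    # one blank image
    probs = sess.run(y_conv, feed_dict={x: dummy, keep_prob: 1.0})
    print(probs.shape)                  # (1, 10): probability for each digit 0-9
```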
**First half of the init part of the class**
tensorflow_in_ros_mnist.py
class RosTensorFlow():
    def __init__(self):
        self._cv_bridge = CvBridge()

        self._sub = rospy.Subscriber('image', Image, self.callback, queue_size=1)
        self._pub = rospy.Publisher('result', Int16, queue_size=1)
The first half is the ROS processing. Here a CvBridge instance is created, and the Subscriber and Publisher are defined. In this case the Subscriber receives Image messages and the Publisher delivers Int16 messages, basically as in the standard ROS examples. Because callback is passed as an argument when creating the Subscriber, the callback function is called every time an Image message is received.
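To try the node without a camera, a small test publisher like the hypothetical sketch below can send a single image to the 'image' topic (the file name nine.png is only an assumption for illustration):

```python
# Hypothetical test publisher: sends one image to the 'image' topic.
import cv2
import rospy
from cv_bridge import CvBridge
from sensor_msgs.msg import Image

rospy.init_node('test_image_publisher')
pub = rospy.Publisher('image', Image, queue_size=1)
bridge = CvBridge()

img = cv2.imread('nine.png')                    # assumed photo of a handwritten digit
rospy.sleep(1.0)                                # give the subscriber time to connect
pub.publish(bridge.cv2_to_imgmsg(img, "bgr8"))  # encoding matches what the node expects
```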
**Second half of the init part of the class**
tensorflow_in_ros_mnist.py
        self.x = tf.placeholder(tf.float32, [None,28,28,1], name="x")
        self.keep_prob = tf.placeholder("float")
        self.y_conv = makeCNN(self.x,self.keep_prob)

        self._saver = tf.train.Saver()
        self._session = tf.InteractiveSession()

        init_op = tf.initialize_all_variables()
        self._session.run(init_op)

        self._saver.restore(self._session, "model.ckpt")
The second half is the TensorFlow processing. First we define x, the placeholder that receives the image, and keep_prob, the placeholder that receives the dropout keep probability. A placeholder is like an entrance for data: at runtime, data keeps flowing in through it.
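If the placeholder idea is unfamiliar, here is a tiny illustration (my own sketch, same era of the TensorFlow API): the graph is defined once, and data is fed in through the placeholder each time the session runs.

```python
import tensorflow as tf

a = tf.placeholder(tf.float32)   # the "entrance" for data
b = a * 2.0                      # the graph is defined once, without data

with tf.Session() as sess:
    print(sess.run(b, feed_dict={a: 3.0}))   # data enters at runtime -> 6.0
```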
Next we define the CNN used here as y_conv. Conceptually, this defines the path the data takes: it enters through x and keep_prob, flows through the CNN, and comes out at the exit named y_conv.
After defining the path, we prepare to actually run data through it by creating a session. The CNN weights W and biases b are initialized once with tf.initialize_all_variables.
Then the trained parameters are loaded. To load them, create a tf.train.Saver and call saver.restore.
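For reference, model.ckpt is produced by the training script from the previous article; a rough sketch of the save side (the training loop itself is omitted and the names here are assumptions) looks like this:

```python
# Sketch of the training side that produces model.ckpt (training loop omitted).
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    # ... train the same makeCNN graph on MNIST batches here ...
    saver.save(sess, "model.ckpt")   # the checkpoint later loaded by saver.restore
```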
**callback part**
tensorflow_in_ros_mnist.py
    def callback(self, image_msg):
        cv_image = self._cv_bridge.imgmsg_to_cv2(image_msg, "bgr8")
        cv_image_gray = cv2.cvtColor(cv_image, cv2.COLOR_RGB2GRAY)
        ret,cv_image_binary = cv2.threshold(cv_image_gray,128,255,cv2.THRESH_BINARY_INV)
        cv_image_28 = cv2.resize(cv_image_binary,(28,28))
        np_image = np.reshape(cv_image_28,(1,28,28,1))
        predict_num = self._session.run(self.y_conv, feed_dict={self.x:np_image,self.keep_prob:1.0})
        answer = np.argmax(predict_num,1)
        rospy.loginfo('%d' % answer)
        self._pub.publish(answer)
This is called every time an image message arrives. The message is converted to an OpenCV image with cv_bridge, then grayscaled, binarized with black and white inverted, resized to 28x28, and fed into the CNN. The digit with the highest probability in the returned estimate predict_num is taken as answer and published.
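The preprocessing can also be checked on its own outside ROS; this sketch assumes a local image file nine.png (hypothetical) and only verifies that the result has the (1, 28, 28, 1) shape the CNN expects:

```python
# Standalone check of the callback's preprocessing steps (illustrative sketch).
import cv2
import numpy as np

cv_image = cv2.imread('nine.png')                    # hypothetical input image
gray = cv2.cvtColor(cv_image, cv2.COLOR_RGB2GRAY)    # same conversion as the callback
_, binary = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY_INV)
small = cv2.resize(binary, (28, 28))
np_image = np.reshape(small, (1, 28, 28, 1))
print(np_image.shape)                                # (1, 28, 28, 1)
```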
I will skip the explanation of the main function.
Now, as long as you have a trained TensorFlow model, you can handle it from ROS in the same way. With something like Faster R-CNN it should be possible to do object recognition and face recognition, so I would like to try that next.