This is the day 20 article of the LeapMind Advent Calendar 2019.
It describes how to shorten the time needed to get a result from a Deep Learning model: a C program performs the initialization in advance and then waits for requests, using sockets and signals for interprocess communication.
I'm usually a **Python-based Deep Learning engineer**, but this time I will use **C**. LeapMind develops embedded DL, so many of its engineers use C/C++, but I rarely develop in C/C++ myself. **I'm not very proficient**, so if you find any mistakes, please let me know!
The title says interprocess communication for embedded DL, but it is really just plain interprocess communication! I originally wanted to run MNIST in C/C++, but that would have been too much material, so the inference part is a dummy (described later).
On embedded devices, for example in manufacturing, you sometimes need processing to finish within **several tens of msec**. If the input is small, even a DL model may be able to run in tens of msec, but if you simply load a DL model created with TensorFlow each time, startup alone may take **several seconds**.
In such a case, you can eliminate the waiting time by launching the model as a service in advance and sending requests to it.
Python, which is often used for DL, naturally has a socket module, and there are many richer libraries built on top of it. In a near-embedded environment, however, you may not want to (or may not be able to) install Python, so you may be required to develop in **C/C++ or similar**.
The environment used this time is as follows.
Ubuntu==16.04
gcc==5.4.0
It should also work on other distributions and on macOS, and you can do the same on Windows with winsock2.
There are several ways to communicate between processes. This time we will use sockets. As a bonus, I will also describe a simple method that uses signals.
The communication between the client and the server is as shown in the figure below. The (dummy) inference takes a 3x3 matrix and sums each row. The client sends the 3x3 matrix as 9 floats, and the server sends back the result as 3 floats.
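Because TCP is a byte stream, a single recv() or send() call is not guaranteed to transfer the whole 9-float message at once. The following is a minimal sketch of one common way to handle that, assuming a blocking stream socket. `recv_all` and `send_all` are hypothetical helper names introduced here for illustration; they are not part of the listings in this article.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

// Hypothetical helper: call recv() repeatedly until exactly len bytes arrived.
// Returns 0 on success, -1 on error or if the peer closed the connection early.
int recv_all(int fd, void *buf, size_t len)
{
    assert(buf != NULL);
    char *p = buf;
    size_t got = 0;
    while (got < len)
    {
        ssize_t n = recv(fd, p + got, len - got, 0);
        if (n == 0)
        {
            return -1; // peer closed before the full message arrived
        }
        if (n < 0)
        {
            if (errno == EINTR)
            {
                continue; // interrupted by a signal, retry
            }
            return -1;
        }
        got += (size_t)n;
    }
    return 0;
}

// Hypothetical helper: call send() repeatedly until all len bytes are written.
// Returns 0 on success, -1 on error.
int send_all(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    const char *p = buf;
    size_t sent = 0;
    while (sent < len)
    {
        ssize_t n = send(fd, p + sent, len - sent, 0);
        if (n < 0)
        {
            if (errno == EINTR)
            {
                continue; // interrupted by a signal, retry
            }
            return -1;
        }
        sent += (size_t)n;
    }
    return 0;
}
```

With these helpers, the server side would call something like `recv_all(connected_socks, input, sizeof(input))` and the client `send_all(sock, input, sizeof(input))`. Note that the raw floats are sent as-is, so both ends must use the same float representation and byte order; on a single machine that holds trivially.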
There aren't many books written about sockets, so read technical blogs on the Internet to get an overview. **For accurate information,** I think the usual approach is to read and understand the [**Linux Programmer's Manual (translation)**](https://linuxjm.osdn.jp/html/LDP_man-pages/man7/socket.7.html).
Reference articles / materials
server.c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#define PORT 52579
#define SIZE 3
void inference(float* input, float* output)
{
for (size_t i=0; i<SIZE; ++i)
{
for (size_t j=0; j<SIZE; ++j)
{
output[i] += input[SIZE*i + j];
}
}
}
int main(void)
{
// create socket
int sock;
sock = socket(AF_INET, SOCK_STREAM, 0);
if (sock < 0)
{
perror("socket");
return 1;
}
// address to bind to (zero-initialized so unused fields are defined)
struct sockaddr_in addr = {0};
addr.sin_family = AF_INET;
addr.sin_port = htons(PORT);
addr.sin_addr.s_addr = htonl(INADDR_ANY);
// bind
if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) != 0)
{
perror("bind");
return 1;
}
// listen
int backlog = 1;
if (listen(sock, backlog) != 0)
{
perror("listen");
return 1;
} else
{
printf("listen\n");
}
while(1)
{
int status;
// accept socket
struct sockaddr_in client;
socklen_t len = sizeof(client);
int connected_socks = accept(sock, (struct sockaddr *)&client, &len);
if (connected_socks == -1)
{
perror("accept");
continue; // no connection to serve, wait for the next one
} else
{
printf("accepted\n");
}
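// receive (MSG_WAITALL: block until the whole matrix has arrived)
float input[SIZE*SIZE] = {0};
status = recv(connected_socks, input, sizeof(input), MSG_WAITALL);
if (status == -1)
{
perror("recv");
close(connected_socks);
continue; // skip this request and wait for the next connection
} else
{
printf("received\n");
}
// print received data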
printf("[ ");
for (size_t i=0; i<SIZE; ++i)
{
for (size_t j=0; j<SIZE; ++j)
{
printf("%f ", input[SIZE*i + j]);
}
}
printf("]\n");
// inference
float output[SIZE] = {};
inference(input, output);
// send
status = send(connected_socks, output, sizeof(output), 0);
if (status == -1)
{
perror("send");
} else
{
printf("send\n");
}
close(connected_socks);
}
close(sock);
return 0;
}
client.c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#include <arpa/inet.h>
#define PORT 52579
#define SIZE 3
int main(void)
{
int status;
// create socket
int sock = socket(AF_INET, SOCK_STREAM, 0);
if (sock < 0)
{
perror("socket");
return 1;
}
// address of the server to connect to
struct sockaddr_in server = {0};
server.sin_family = AF_INET;
server.sin_port = htons(PORT);
server.sin_addr.s_addr = inet_addr("127.0.0.1");
// connect server
status = connect(sock, (struct sockaddr *)&server, sizeof(server));
if (status == -1)
{
perror("connect");
close(sock);
return 1; // nothing can be sent without a connection
} else {
printf("connected\n");
}
// input
float input[SIZE*SIZE] = {
0.f, 1.f, 2.f,
3.f, 4.f, 5.f,
6.f, 7.f, 8.f
};
// send
status = send(sock, input, sizeof(input), 0);
if (status == -1)
{
perror("send");
} else {
printf("send\n");
}
// receive (MSG_WAITALL: block until all SIZE floats have arrived)
float output[SIZE] = {0};
status = recv(sock, output, sizeof(output), MSG_WAITALL);
if (status == -1)
{
perror("recv");
} else {
printf("received\n");
}
// print received data
printf("[ ");
for(size_t i=0; i<SIZE; ++i)
{
printf("%f ", output[i]);
}
printf("]\n");
// close socket
close(sock);
return 0;
}
gcc -o server server.c
gcc -o client client.c
$ ./server
listen
accepted
received
[ 0.000000 1.000000 2.000000 3.000000 4.000000 5.000000 6.000000 7.000000 8.000000 ]
send
$ ./client
connected
send
received
[ 3.000000 12.000000 21.000000 ]
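One practical detail when trying this out: if you stop the server and start it again right away, bind() may fail with "Address already in use" while old connections are still in the TIME_WAIT state. A common remedy is to set SO_REUSEADDR before bind(). The following is a minimal sketch under that assumption; `make_listener` is a hypothetical helper name, not something from the listings above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

// Hypothetical helper: create a listening TCP socket with SO_REUSEADDR set.
// Returns the socket descriptor, or -1 on failure.
int make_listener(uint16_t port, int backlog)
{
    assert(backlog > 0);

    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0)
    {
        return -1;
    }

    // Allow bind() to succeed even if old connections linger in TIME_WAIT
    int yes = 1;
    if (setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes)) != 0)
    {
        close(sock);
        return -1;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) != 0 ||
        listen(sock, backlog) != 0)
    {
        close(sock);
        return -1;
    }
    return sock;
}
```

In the server, a call such as `make_listener(PORT, 1)` would stand in for the separate socket, bind, and listen steps.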
TensorFlow has C and C++ APIs. They are a bit quirky, but they should be relatively easy to use after reading the documentation (although building them is a bit hard...). Covering them here would take considerable space on its own; there are many articles and reference source codes, so please refer to those.
In TensorFlow, memory is allocated during the first execution, so the first run is slow. By running one dummy inference at startup, subsequent requests can be executed without that waiting time.
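The warm-up idea can be illustrated without TensorFlow. The following is a minimal sketch in which a stand-in model allocates its working memory lazily on the first run, and a warm-up call pays that cost before any real request arrives. `Model`, `model_run`, and `model_warm_up` are hypothetical names for illustration only and do not correspond to any TensorFlow API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SIZE 3

// Hypothetical stand-in for a DL runtime that allocates lazily
typedef struct
{
    float *workspace; // allocated on the first run
    bool ready;       // true once the one-time setup is done
} Model;

// Sums each row of a SIZE x SIZE input; the first call also does the setup.
// Returns 0 on success, -1 if the allocation fails.
int model_run(Model *m, const float *input, float *output)
{
    assert(m != NULL && input != NULL && output != NULL);
    if (!m->ready)
    {
        // One-time setup: this is the part that makes the first call slow
        m->workspace = malloc(SIZE * SIZE * sizeof(float));
        if (m->workspace == NULL)
        {
            return -1;
        }
        m->ready = true;
    }
    memcpy(m->workspace, input, SIZE * SIZE * sizeof(float));
    for (size_t i = 0; i < SIZE; ++i)
    {
        output[i] = 0.f;
        for (size_t j = 0; j < SIZE; ++j)
        {
            output[i] += m->workspace[SIZE * i + j];
        }
    }
    return 0;
}

// Run once with dummy data so that later calls skip the setup cost
int model_warm_up(Model *m)
{
    const float dummy_in[SIZE * SIZE] = {0};
    float dummy_out[SIZE];
    return model_run(m, dummy_in, dummy_out);
}
```

In a server like the one above, the warm-up call would go before listen(), so that the first client request already hits the fast path.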
When you want to end a program, you probably press `Ctrl+C`; this uses an OS feature called signals. You can also stop (suspend) a program with `Ctrl+Z`.
In this method, once the program has started and finished its initialization, such as loading the model, it stops itself by sending itself a stop signal (`SIGTSTP`). When an external program or command later sends it `SIGCONT`, the program resumes and runs without any waiting time. It is a very primitive method, but its simplicity sometimes makes it a good fit, for example for demos.
A signal cannot carry data, but the resumed program can, for example, read its input directly from a camera or simply read a file.
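As a rough illustration of the file-based variant, the following sketch reads the 3x3 input from a text file each time the program wakes up. `read_input` and the file format (nine whitespace-separated numbers) are assumptions made for this example only.

```c
#include <assert.h>
#include <stdio.h>

#define SIZE 3

// Hypothetical helper: read SIZE*SIZE whitespace-separated floats from a
// text file. Returns 0 on success, -1 if the file is missing or too short.
int read_input(const char *path, float *input)
{
    assert(path != NULL && input != NULL);

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
    {
        return -1;
    }

    int result = 0;
    for (size_t i = 0; i < SIZE * SIZE; ++i)
    {
        if (fscanf(fp, "%f", &input[i]) != 1)
        {
            result = -1; // fewer values than expected, or not a number
            break;
        }
    }
    fclose(fp);
    return result;
}
```

In a program that stops itself with raise(SIGTSTP), the call would go right after raise() returns, so that each SIGCONT triggers one read followed by one inference.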
run_by_signal.c
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>
int main(void)
{
printf("Initialize...\n");
// run some initialization here
sleep(1);
pid_t c_pid = getpid();
printf("Send SIGCONT to PID: %d to run\n", c_pid);
while(1)
{
raise(SIGTSTP);
printf("run \n");
}
return 0;
}
send_signal.c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <signal.h>
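int main(int argc, char *argv[]) {
// the target PID is required; argv[1] is NULL without it
if (argc < 2) {
fprintf(stderr, "usage: %s PID\n", argv[0]);
return 1;
}
printf("send continue signal to PID:%s\n", argv[1]);
kill(atoi(argv[1]), SIGCONT);
return 0;
}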
gcc -o run_by_signal run_by_signal.c
gcc -o send_signal send_signal.c
When `run_by_signal` is executed, it initializes and then stops immediately.
After that, each time `send_signal` is executed, `run_by_signal` resumes, performs its processing (here it only prints `run`), and stops again immediately.
When executing `send_signal`, pass the process ID of `run_by_signal` as its argument.
$ ./run_by_signal
Initialize...
Send SIGCONT to PID: 17088 to run
[1]+ Stopped ./run_by_signal
$ ./send_signal 17088
send continue signal to PID:17088
run
[1]+ Stopped ./run_by_signal
You can also use the `kill` command to do the same as `send_signal`, for example `kill -CONT 17088`.
As with sockets, for accurate information on signals I think the usual approach is to read and understand the Linux Programmer's Manual (translation) and similar references.
I thought, "I work in Python all the time, so for once I want to use C," and wrote about how to shorten the time needed to get a result from a Deep Learning model by using sockets and signals for interprocess communication and performing the initialization in advance.
I have used TensorFlow through its C++ API but not through its C API, so I'd like to try that. PyTorch also seems to have a C++ API, so I'd like to try it soon. I heard from someone who is used to it that its interface is similar to the Python one and that it is convenient to use.