This is the day 22 article of FUN Advent Calendar 2020.
Yesterday's article was Tomoka-san's "Story of running FunLocks, a beginner-friendly, campus-only hackathon". This year I'm writing one article each for part1 and part2, and both ended up right next to Tomoka-san's. I wonder why. Strange.
The environment is a machine running on Sakura's cloud.
This time I implement namespace, chroot, and cgroup. The cgroup part only controls the CPU.
To create a container, we need to isolate namespaces. The system call for this is unshare. A system call with similar functionality is clone. Since we're here, let's briefly summarize the differences between unshare and clone before deciding which one to use.
unshare: Controls namespace sharing for the current process.
Return value: 0 on success, -1 on failure.
clone: Controls namespace sharing at the same time as it creates a child process. Think of it as fork and unshare combined.
Arguments: there are a lot of them. Unlike unshare, it also creates a child process, so it needs the memory the child process will use, among other things. (I haven't checked them all because it's a hassle, so please check for yourself.)
Return value: the thread ID of the child process on success. On failure, -1 is returned and no child process is created.
That got a bit long, but this time I'd like to use unshare. The reasons are as follows.
- The arguments and return value are easy to understand.
- The description of clone says that it is often used for threads.
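For comparison, here is a minimal sketch of what the clone route might look like. The helper name run_child_with_clone and the 1 MiB stack size are my own assumptions for illustration, not part of the code in this article. With ns_flags set to 0 it behaves roughly like fork; passing a flag such as CLONE_NEWUTS would also put the child into a new namespace, which normally requires root.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024) /* assumed stack size for the child */

/* Entry point of the child: its return value becomes the exit status. */
static int child_main(void *arg)
{
    return *(int *)arg;
}

/*
 * Hypothetical helper: create a child with clone(), optionally in new
 * namespaces (ns_flags such as CLONE_NEWUTS), and wait for it.
 * Returns the child's exit status, or -1 on failure.
 */
int run_child_with_clone(int ns_flags, int exit_code)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows downward on most architectures, so pass its top. */
    pid_t pid = clone(child_main, stack + STACK_SIZE,
                      ns_flags | SIGCHLD, &exit_code);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);
    if (waited < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Even in this small form, the caller has to allocate a stack, pass an entry function, and wait for the child, which is the extra bookkeeping that unshare avoids.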
#define _GNU_SOURCE
#include<sched.h>
#include<unistd.h>
#include<stdio.h>
#include<errno.h>
int main(){
    // Namespaces (and the file descriptor table) to stop sharing
    const int UNSHARE_FLAGS = ( CLONE_FILES | CLONE_NEWIPC | CLONE_NEWUTS | CLONE_NEWPID);
    if (unshare(UNSHARE_FLAGS) < 0){
        perror("unshare");
        return 1; // do not print "ok" after a failure
    }
    printf("ok\n");
    return 0;
}
With this, I think the namespaces are separated. The code only runs unshare and checks that there was no error, so at this stage I can't tell whether it really worked.
I'll skip the explanation of the flags. They are described in the UNSHARE man page, so please take a look.
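One possible way to confirm the separation, as a sketch: each namespace of a process appears as a symbolic link under /proc/self/ns, and the link target changes when the process moves into a new namespace. The helper names read_ns_id and ns_changed_by_unshare are assumptions made up for this example. Note that unshare with CLONE_NEWPID does not move the calling process itself (only its future children), so this check is better suited to flags such as CLONE_NEWUTS or CLONE_NEWIPC.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: read the identity of one of our namespaces.
 * For name = "uts" the result looks like "uts:[<inode number>]".
 * Returns 0 on success, -1 on failure.
 */
int read_ns_id(const char *name, char *buf, size_t len)
{
    char path[64];

    if (len == 0)
        return -1;
    snprintf(path, sizeof(path), "/proc/self/ns/%s", name);

    ssize_t n = readlink(path, buf, len - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0'; /* readlink does not terminate the string */
    return 0;
}

/*
 * Returns 1 if unshare(flag) moved this process into a new namespace,
 * 0 if the identity stayed the same, -1 if any call failed.
 */
int ns_changed_by_unshare(int flag, const char *name)
{
    char before[64], after[64];

    if (read_ns_id(name, before, sizeof(before)) < 0)
        return -1;
    if (unshare(flag) < 0)
        return -1; /* typically EPERM when not running as root */
    if (read_ns_id(name, after, sizeof(after)) < 0)
        return -1;
    return strcmp(before, after) != 0;
}
```

Calling ns_changed_by_unshare(CLONE_NEWUTS, "uts") as root should return 1 if the separation worked; without privileges it is expected to return -1.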
After separating the namespaces, I change the root directory, using the chroot system call.
chroot: Changes the root directory. The path given as the argument is treated as '/' from then on.
Return value: 0 on success, -1 on error.
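#include<unistd.h>
#include<errno.h>
#include<stdio.h>
int main(){
    char *argv[3];
    argv[0] = "ls"; // argv[0] is the program name by convention
    argv[1] = "/";  // list the new root directory
    argv[2] = NULL;
    if( chroot("./test") < 0 ){
        perror("chroot");
        return 1;
    }
    // chroot does not change the working directory, so move into the new root
    if ( chdir("/") < 0 ){
        perror("chdir");
        return 1;
    }
    // execve only returns if it failed
    if ( execve("/bin/ls", argv, NULL) < 0){
        perror("execve");
        return 1;
    }
    printf("ok\n");
    return 0;
}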
I create a directory called test and chroot into it. Then the code executes /bin/ls.
Since test is an empty directory, of course this fails, because there is no /bin/ls inside it.
This shows that chroot is working. (Not a smart way to confirm it.)
By the way, when I prepared a Debian root directory, separate from the host machine's, and chrooted into it, it worked fine.
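A slightly more convenient way to try different root directories might be to do the chroot in a child process, so that the calling program survives and can look at the result. The following is only a sketch under that assumption; run_in_chroot is a hypothetical helper, and 127 as the "could not run" status is borrowed from shell convention.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] inside new_root in a child process.
 * Returns the child's exit status (127 if setup or exec failed),
 * or -1 if fork or wait itself failed.
 */
int run_in_chroot(const char *new_root, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Move into the directory first, then make it the root, so the
         * working directory cannot point outside the new root. */
        if (chdir(new_root) < 0 || chroot(".") < 0) {
            perror("chroot setup");
            _exit(127);
        }
        execv(argv[0], argv);
        perror("execv"); /* only reached if exec failed */
        _exit(127);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With the empty test directory this should come back with 127, and with a populated root such as the Debian one it should return whatever the executed program returns. chroot itself still needs root privileges.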
To be a container, resources also have to be limited: memory usage, CPU usage, and so on. There is no dedicated system call for resource limits; instead, Linux provides a feature called cgroups. cgroups come in v1 and v2, and this time we will use v2. A detailed explanation of the cgroup v2 specification would be long, so I will not give one here. If you are interested, please see the cgroup v2 documentation.
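To be able to use cgroups, you need to take the following steps: mount the cgroup file system, create a cgroup, register the process in it, enable the controller (subsystem), and set the limit.
In my environment the cgroup file system was already mounted on /sys/fs/cgroup, so I skip the mount step.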
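Whether the mount step can really be skipped could be checked from C as well. This is a sketch, assuming the check is done with statfs and the cgroup v2 file system magic number; is_cgroup2_mount is a name I made up, and the constant is defined by hand in case the kernel header is not available.

```c
#include <sys/vfs.h>

#ifndef CGROUP2_SUPER_MAGIC
#define CGROUP2_SUPER_MAGIC 0x63677270 /* same value as in <linux/magic.h> */
#endif

/*
 * Hypothetical helper: returns 1 if path is on a cgroup v2 file system,
 * 0 if it is on something else, -1 if statfs() failed.
 */
int is_cgroup2_mount(const char *path)
{
    struct statfs fs;

    if (statfs(path, &fs) < 0)
        return -1;
    return (unsigned long)fs.f_type == (unsigned long)CGROUP2_SUPER_MAGIC;
}
```

If is_cgroup2_mount("/sys/fs/cgroup") returns 0 or -1, the file system would first have to be mounted, for example with mount("none", "/sys/fs/cgroup", "cgroup2", 0, NULL) as root.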
#include<sys/mount.h>
#include<sys/stat.h>
#include<fcntl.h>
#include<unistd.h>
#include<errno.h>
#include<stdio.h>
#include<string.h>
int main(){
    //make cgroup
    if( access("/sys/fs/cgroup/container", F_OK) < 0){
        if( mkdir("/sys/fs/cgroup/container", 0755) < 0){
            perror("mkdir");
            return -1;
        }
    }
    //set pid
    int fd;
    fd = open("/sys/fs/cgroup/container/cgroup.procs", O_WRONLY);
    if( fd < 0 ){
        perror("cgroup open");
        return -1;
    }
    char buff[32]; //large enough for any pid
    snprintf(buff, sizeof(buff), "%d", (int)getpid());
    write(fd, buff, strlen(buff)); //write only the digits, not the whole buffer
    close(fd);
    //set subsystem
    //(the controllers of a cgroup are enabled in its parent's cgroup.subtree_control)
    fd = open("/sys/fs/cgroup/cgroup.subtree_control", O_WRONLY);
    if( fd < 0 ){
        perror("subsystem open");
        return -1;
    }
    write(fd, "+cpu", 4);
    close(fd);
    //set cpu max
    fd = open("/sys/fs/cgroup/container/cpu.max", O_WRONLY);
    if( fd < 0 ){
        perror("cpu open");
        return -1;
    }
    write(fd, "10000", 5);
    close(fd);
    //keep the cpu busy: write to /dev/null forever
    while(1){
        fd = open("/dev/null", O_WRONLY);
        write(fd,"hello world\n", 12);
        close(fd);
    }
}
This limits the current process to 10% CPU and writes "hello world" to /dev/null in an infinite loop. (Like the yes command?) Checking the CPU usage of the process while it runs, you can see that it stays at about 10%. It seems to be working fine.
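The listing does not check the return value of write, so if the kernel rejects a value, nothing is reported. A small wrapper along the following lines might make such failures visible. This is a sketch; write_value is a hypothetical helper, not something the code above uses.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write the string value to the file at path.
 * Returns 0 if the whole string was written, -1 otherwise.
 */
int write_value(const char *path, const char *value)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    size_t len = strlen(value); /* the trailing '\0' is not written */
    ssize_t written = write(fd, value, len);
    int ok = (written == (ssize_t)len);

    if (close(fd) < 0)
        return -1;
    return ok ? 0 : -1;
}
```

With it, a step could be written as write_value("/sys/fs/cgroup/container/cpu.max", "10000"), followed by perror when the result is -1.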
//set cpu max
fd = open("/sys/fs/cgroup/container/cpu.max", O_WRONLY);
if( fd < 0 ){
    perror("cpu open");
    return -1;
}
write(fd, "10000", 5);
close(fd);
The "10000" passed to write is the CPU time the cgroup may use per period, in microseconds. The period is 100000 microseconds here (the default), so "10000" is 1/10 of it, which means 10%. This is how the CPU usage limit is set.
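The same arithmetic can be written down as a small function. As a sketch: cpu.max accepts a "$MAX $PERIOD" pair, so a percentage can be turned into that string as below. format_cpu_max and its parameters are names assumed for this example.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build the "$MAX $PERIOD" string for cpu.max from a
 * percentage of one CPU. Both values are in microseconds.
 * Returns 0 on success, -1 on bad input or if buf is too small.
 */
int format_cpu_max(char *buf, size_t len, int percent, long period_us)
{
    if (percent <= 0 || period_us <= 0)
        return -1;

    /* quota = period * percent / 100, e.g. 100000 * 10 / 100 = 10000 */
    long quota_us = period_us * percent / 100;
    int n = snprintf(buf, len, "%ld %ld", quota_us, period_us);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

For 10 percent and a period of 100000 this produces "10000 100000", which sets the same limit as writing "10000" alone while the period is at its default.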
This time I implemented namespace, chroot, and cgroup. If you have any questions about the code, or if the behavior differs from what I described or you find mistakes, please contact me on Twitter. I don't know when the next part will be, but I plan to implement capabilities.
- Container-like #1 made with C
- Introduction to Containers Learned with LXC - Technology for Realizing Lightweight Virtualization Environment
- Man page of UNSHARE
- Man page of CLONE
- Man page of CHROOT