(2020/09/25) This article is a revised and expanded version of the troubleshooting section extracted from the following article: Create a Docker container that can be connected to CentOS 8 with the minimum configuration by SSH
After playing for a while with a Docker container that accepts SSH connections, I ran into a problem: I could no longer connect by SSH to the VM hosting the container. This article summarizes the investigation and the fix.
--Windows10 Home (1909) (host)
In this article, each command line is shown with its prompt to make clear where the command is executed. Lines starting with [centos@dockertest ~]$ are executed on the guest VM, and lines starting with test:~$ are executed in the container.
After playing for a while with the script I had set up for SSH access to the container, I suddenly could no longer make an SSH connection from the host to the guest VM.
Specifically, when I try to connect with TeraTerm, the screen stays black and there is no response.
Also, when I try to connect with ssh from another guest VM to the guest VM that hosts the container, the following message appears and the connection is rejected.
[centos@othervm ~]$ ssh -p 2222 [email protected]
shell request failed on channel 0
When I googled the message above (shell request failed on channel 0), the following article came up.
Reference: How to deal with "shell request failed on channel 0" that cannot log in to ssh
The environment there differs from mine, but processes looked like the likely culprit, so I checked which processes were running.
[centos@dockertest ~]$ ps aux | grep ssh
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
(~ Omitted ~)
22 7762 0.0 0.0 0 0 ? Z 02:27 0:00 [sshd] <defunct>
22 7764 0.0 0.0 0 0 ? Z 02:27 0:00 [sshd] <defunct>
22 7766 0.0 0.0 0 0 ? Z 02:27 0:00 [sshd] <defunct>
22 7768 0.0 0.0 0 0 ? Z 02:27 0:00 [sshd] <defunct>
22 7770 0.0 0.0 0 0 ? Z 02:27 0:00 [sshd] <defunct>
22 7772 0.0 0.0 0 0 ? Z 02:27 0:00 [sshd] <defunct>
22 7774 0.0 0.0 0 0 ? Z 02:27 0:00 [sshd] <defunct>
22 7776 0.0 0.0 0 0 ? Z 02:27 0:00 [sshd] <defunct>
22 7778 0.0 0.0 0 0 ? Z 02:27 0:00 [sshd] <defunct>
22 7780 0.0 0.0 0 0 ? Z 02:27 0:00 [sshd] <defunct>
22 7782 0.0 0.0 0 0 ? Z 02:27 0:00 [sshd] <defunct>
22 7784 0.0 0.0 0 0 ? Z 02:27 0:00 [sshd] <defunct>
22 7786 0.0 0.0 0 0 ? Z 02:27 0:00 [sshd] <defunct>
22 7788 0.0 0.0 0 0 ? Z 02:27 0:00 [sshd] <defunct>
(~ Omitted ~)
Oh, oh ...
From the "Z" state and the "defunct" marker, it seems that a large number of sshd processes have become zombies and are using up the available process IDs.
Restarting the guest VM or the container lets you reconnect for a while, but the zombies increase with every SSH connection, so a restart is not a permanent fix; it only resets a time bomb.
[centos@dockertest ~]$ ps aux | grep Z | wc -l
2
(During this time, SSH connection from the host to the container with TeraTerm ⇒ Execute disconnection)
[centos@dockertest ~]$ ps aux | grep Z | wc -l
3
This count includes the header row of the ps output (its VSZ column contains a "Z") and the grep Z command itself, so a result of 3 or higher means that a zombie process exists.

Upon examination, it seems that on Linux the init process is generally the ancestor of all processes. It adopts orphaned processes and reaps them when they exit, which normally keeps zombie processes from accumulating.
However, the Docker container created this time has no init process, so it gets none of that process management. As a result, it seems that the sshd process left over from each finished connection is never reaped and stays behind as a zombie; these pile up until a new sshd process can no longer be started.
By the way, init is normally assigned process ID 1 (PID 1), but in this container PID 1 is the tail command specified in the CMD instruction of the Dockerfile.
test:~$ ps
PID USER TIME COMMAND
1 root 0:00 tail -f /dev/null
...
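To check PID 1 without reading through the whole ps listing, its name can also be read directly. This is only a sketch, assuming Linux with /proc available inside the container; pid1_name is a hypothetical helper name, not something from the original setup.

```shell
#!/bin/sh
# Sketch: print the short name of the program running as PID 1.
# Assumes Linux with /proc mounted; pid1_name is a hypothetical helper.
pid1_name() {
    cat /proc/1/comm
}

pid1_name
```

The name printed should correspond to the COMMAND that ps shows for PID 1, so running it before and after a configuration change is a quick way to see whether PID 1 has changed.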
For details on how zombie processes arise, the following article was helpful.
Reference: Unix process and Docker trap
There seem to be various ways to fix this, but since I learned about init during the investigation, I would like to solve it by making use of init.
If the Compose file format is version 3.7 or later, adding init: true to the service definition in docker-compose.yml runs an init process as PID 1 in the container.
Reference: About the --init flag of Docker
docker-compose.yml(Revised)
version: '3.8'
services:
  test:
    build: .
    container_name: test
    hostname: test
    ports:
      - "2222:22"
    tty: true
    init: true # Add this line
Edit the file as shown above, then rebuild and restart the container.
[centos@dockertest ~]$ docker-compose down
[centos@dockertest ~]$ docker-compose build
[centos@dockertest ~]$ docker-compose up -d
When the container starts up, SSH from the host to the container with TeraTerm again and check the process.
test:~$ ps
PID USER TIME COMMAND
1 root 0:00 /sbin/docker-init -- /bin/sh -c /etc/init.d/sshd start && tail -f /dev/null
...
Oh, the COMMAND part has changed properly.
Just in case, connect with SSH and make sure that the zombie process does not increase.
[centos@dockertest ~]$ ps aux | grep Z | wc -l
2
(During this time, SSH connection from the host to the container with TeraTerm ⇒ Execute disconnection)
[centos@dockertest ~]$ ps aux | grep Z | wc -l
2
Looks good.
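The count used in this article relies on knowing that two of the matching lines (the header row and the grep command itself) are not zombies. As a stricter alternative, the following sketch counts only processes whose state column starts with Z. It assumes a ps that accepts -eo stat=, as the procps version on the CentOS guest VM does; the function names count_zombie_states and count_zombies are hypothetical.

```shell
#!/bin/sh
# Sketch: count zombies from the process state column only.
# Reads one state value per line (as printed by "ps -eo stat=") on stdin.
count_zombie_states() {
    awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}

# "stat=" suppresses the header row, and grep is not involved,
# so there are no false matches to subtract from the result.
count_zombies() {
    ps -eo stat= | count_zombie_states
}

count_zombies
```

With this version, a result of 0 means no zombies, and any increase after an SSH connect and disconnect points to an unreaped process.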
--Run init when creating a Docker container that accepts SSH connections
--Watch out for zombie processes piling up when running Docker containers
I had never paid much attention to process management, so I learned a lot. There is also the more basic question of whether a container you can SSH into fits the Docker way of thinking in the first place, but I would like to study that area step by step.