I could no longer SSH into a VM hosting an SSH-accessible Docker container

(2020/09/25) This article extracts the troubleshooting part of the following article, with additions and corrections: "Create a Docker container that can be connected to CentOS 8 with the minimum configuration by SSH"

Overview

After experimenting for a while with a Docker container that accepts SSH connections, I ran into a problem: I could no longer connect over SSH to the VM hosting the container. This article summarizes the investigation and the solution.

Environment

- Windows 10 Home (1909) (host)

Problems and solutions

In this article, each command line is prefixed with a prompt to show where the command is run. Lines starting with [centos@dockertest ~]$ are executed on the guest VM, and lines starting with test:~$ are executed in the container.

The problem that occurred

After experimenting for a while with the script I had set up for SSH access to the container, I suddenly could no longer make an SSH connection from the host to the guest VM. Specifically, when I tried to connect with TeraTerm, the screen stayed black and there was no response. When I tried to reach the guest VM hosting the container from another guest VM with ssh, the following message appeared and the connection was rejected.

[centos@othervm ~]$ ssh -p 2222 [email protected]
shell request failed on channel 0

Investigation

Searching for the message above (shell request failed on channel 0) turned up the following article.

Reference: How to deal with "shell request failed on channel 0" that cannot log in to ssh

The environment there differs from mine, but the article suggested that something around the processes was the likely culprit, so I checked which processes were running.

[centos@dockertest ~]$ ps aux | grep ssh
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
(~ Omitted ~)
22        7762  0.0  0.0      0     0 ?        Z    02:27   0:00 [sshd] <defunct>
22        7764  0.0  0.0      0     0 ?        Z    02:27   0:00 [sshd] <defunct>
22        7766  0.0  0.0      0     0 ?        Z    02:27   0:00 [sshd] <defunct>
22        7768  0.0  0.0      0     0 ?        Z    02:27   0:00 [sshd] <defunct>
22        7770  0.0  0.0      0     0 ?        Z    02:27   0:00 [sshd] <defunct>
22        7772  0.0  0.0      0     0 ?        Z    02:27   0:00 [sshd] <defunct>
22        7774  0.0  0.0      0     0 ?        Z    02:27   0:00 [sshd] <defunct>
22        7776  0.0  0.0      0     0 ?        Z    02:27   0:00 [sshd] <defunct>
22        7778  0.0  0.0      0     0 ?        Z    02:27   0:00 [sshd] <defunct>
22        7780  0.0  0.0      0     0 ?        Z    02:27   0:00 [sshd] <defunct>
22        7782  0.0  0.0      0     0 ?        Z    02:27   0:00 [sshd] <defunct>
22        7784  0.0  0.0      0     0 ?        Z    02:27   0:00 [sshd] <defunct>
22        7786  0.0  0.0      0     0 ?        Z    02:27   0:00 [sshd] <defunct>
22        7788  0.0  0.0      0     0 ?        Z    02:27   0:00 [sshd] <defunct>
(~ Omitted ~)

Uh-oh...

Judging from the "Z" in the STAT column and the "<defunct>" label, a large number of sshd processes have become zombies and are eating up the available process IDs. Restarting the guest VM or the container makes it possible to connect again for a while, but the number of zombie processes grows with every SSH connection, so a restart is not a permanent fix. It only resets a time bomb.

[centos@dockertest ~]$ ps aux | grep Z | wc -l
2
(Meanwhile, connect to the container from the host over SSH with TeraTerm, then disconnect)
[centos@dockertest ~]$ ps aux | grep Z | wc -l
3
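
Note that ps aux | grep Z | wc -l most likely also counts the ps header line (the column name VSZ contains a Z) and the grep process itself, so a result of 2 probably means no zombies at all. A minimal sketch of a stricter count, assuming the procps version of ps found on the CentOS guest VM; the function name count_zombies is made up for this example:

```shell
# Count zombie processes from a list of STAT values (one per line on stdin).
# A state beginning with "Z" (for example "Z", "Zs", "Z+") is a zombie.
count_zombies() {
    awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}

# "stat=" prints only the STAT column and suppresses the header line.
ps -eo stat= | count_zombies
```

Because this looks only at the STAT column, neither the header nor the counting pipeline itself is included in the result.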

Cause

From what I found, on Linux the init process is generally the ancestor of all processes. It adopts orphaned processes and reaps them when they exit, which normally keeps zombie processes from accumulating. The Docker container created this time has no init process, so it gets none of that process management. As a result, each sshd process apparently has nowhere to go once its connection ends and becomes a zombie; these pile up until a new sshd process can no longer be started.

Incidentally, init is normally assigned process ID 1 (PID 1), but in this container PID 1 is the tail command specified by the CMD instruction in the Dockerfile.

test:~$ ps
PID   USER     TIME  COMMAND
    1 root      0:00 tail -f /dev/null
  ...

For the detailed mechanics of zombie processes, the following article was helpful.

Reference: Unix process and Docker trap
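
The mechanism described in this section can be reproduced on any Linux machine without Docker. The following is only a sketch under my own assumptions (the sleep durations and variable names are arbitrary, and it relies on /proc being available): a child exits while its parent never waits for it, so the child stays behind as a zombie.

```shell
# Hypothetical demo: a parent that never waits for its child.
tmp=$(mktemp)

# The subshell starts "sleep 1" in the background, records its PID, and then
# replaces itself with "sleep 10", which never calls wait() on its children.
( sleep 1 & echo "$!" > "$tmp"; exec sleep 10 ) >/dev/null 2>&1 &
parent=$!

sleep 2                      # by now "sleep 1" has exited but was not reaped
child=$(cat "$tmp")
rm -f "$tmp"

# The kernel still keeps an entry for the child; its state should be Z (zombie).
state=$(grep '^State' "/proc/$child/status")
echo "$state"

# Ending the parent hands the zombie over to PID 1 (or a subreaper), which is
# expected to reap it. A PID 1 that never reaps leaves such zombies in place.
kill "$parent"
```

In the container, the sshd processes appear to end up in the same situation, except that the process expected to clean up after them is PID 1 itself, and tail does not do that.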

Solution

There seem to be various approaches, but since I learned about init during the investigation, I decided to solve the problem by making use of it. If the Compose file format is version 3.7 or later, adding init: true to docker-compose.yml runs an init process inside the container.

Reference: About the --init flag of Docker
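
As an aside, for a container started without Compose, the --init flag of docker run named in the reference above should have the same effect. A rough sketch, assuming the image has already been built; the helper name run_with_init and the image name test-image are placeholders of mine, and only the container name and port mapping are carried over from this article's setup:

```shell
# Hypothetical helper: start a container with Docker's built-in init as PID 1.
# Arguments: container name, host port mapped to container port 22, image name.
run_with_init() {
    docker run -d --init --name "$1" -p "$2:22" "$3"
}

# Example call (the image name "test-image" is a placeholder):
# run_with_init test 2222 test-image
```

The rest of this article uses the Compose setting instead.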

docker-compose.yml (revised)


version: '3.8'

services:
  test:
    build: .
    container_name: test
    hostname: test
    ports:
      - "2222:22"
    tty: true
    init: true  # Add this line

Edit the file as shown above, then rebuild and restart the container.

[centos@dockertest ~]$ docker-compose down
[centos@dockertest ~]$ docker-compose build
[centos@dockertest ~]$ docker-compose up -d

Once the container is up, connect to it from the host over SSH with TeraTerm again and check the processes.

test:~$ ps
PID   USER     TIME  COMMAND
    1 root      0:00 /sbin/docker-init -- /bin/sh -c /etc/init.d/sshd start && tail -f /dev/null
  ...

The COMMAND column now shows /sbin/docker-init as PID 1. Just to be sure, connect over SSH again and confirm that zombie processes no longer accumulate.

[centos@dockertest ~]$ ps aux | grep Z | wc -l
2
(Meanwhile, connect to the container from the host over SSH with TeraTerm, then disconnect)
[centos@dockertest ~]$ ps aux | grep Z | wc -l
2

Looks good.
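
The PID 1 check done with ps above can also be scripted, which may be handy when the ps output is truncated. This is only a sketch meant to be run inside the container: the function name uses_docker_init is hypothetical, and the /sbin/docker-init path is taken from the ps output shown earlier.

```shell
# Return success if the given PID 1 command line starts with docker-init,
# the init binary that Docker places in the container when init is enabled.
uses_docker_init() {
    case "$1" in
        /sbin/docker-init*|docker-init*) return 0 ;;
        *) return 1 ;;
    esac
}

# /proc/1/cmdline separates arguments with NUL bytes; turn them into spaces.
cmdline=$(tr '\0' ' ' < /proc/1/cmdline)

if uses_docker_init "$cmdline"; then
    echo "PID 1 is docker-init"
else
    echo "PID 1 is not docker-init: $cmdline"
fi
```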

Summary

- Run init when creating a Docker container that accepts SSH connections
- Watch out for zombie process proliferation when running Docker containers

I had never paid much attention to process management, so I learned a lot. There is also the more basic question of whether an SSH-accessible container fits Docker's design philosophy in the first place, but I would like to study these topics one at a time.
