Common misconceptions about Java application resource limitations when using Kubernetes

In the first article in this series, we'll look at some of the common misconceptions about limiting Java application resources when using **Kubernetes**.

In this series of articles, we will explore some of the common issues that enterprise customers encounter when using Kubernetes (https://www.alibabacloud.com/en/product/kubernetes).

As container technology (https://www.alibabacloud.com/en/product/container-service) matures, more and more enterprise customers are choosing Docker and Kubernetes as the foundation of their application platforms, but in practice they run into many problems. This series shares insights and best practices drawn from the experience of Alibaba Cloud's container service team, which has helped many customers through this process.

In containerized deployments of Java applications, users report that an active Java application container is mysteriously killed by the OOM killer, even though container resource limits have been set.

This issue is the result of a very common mistake: failing to set the container resource limit and the corresponding JVM heap size correctly.

Let's take a Tomcat application as an example. Its instance code and Kubernetes deployment files are available on GitHub (https://github.com/denverdino/system-info).

git clone https://github.com/denverdino/system-info
cd system-info

It uses the following Kubernetes pod definition.

  1. The "app" container in the pod is an init container, responsible for copying a JSP application into the "webapps" directory of the Tomcat container. Note: in the image, the JSP application index.jsp is used to display JVM and system resource information.
  2. The Tomcat container runs continuously, with its maximum memory usage limited to 256 MB.

apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  initContainers:
  - image: registry.cn-hangzhou.aliyuncs.com/denverdino/system-info
    name: app
    imagePullPolicy: IfNotPresent
    command:
      - "cp"
      - "-r"
      - "/system-info"
      - "/app"
    volumeMounts:
    - mountPath: /app
      name: app-volume
  containers:
  - image: tomcat:9-jre8
    name: tomcat
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - mountPath: /usr/local/tomcat/webapps
      name: app-volume
    ports:
    - containerPort: 8080
    resources:
      requests:
        memory: "256Mi"
        cpu: "500m"
      limits:
        memory: "256Mi"
        cpu: "500m"
  volumes:
  - name: app-volume
    emptyDir: {}

Run the following command to deploy and test your application.

$ kubectl create -f test.yaml
pod "test" created
$ kubectl get pods test
NAME      READY     STATUS    RESTARTS   AGE
test      1/1       Running   0          28s
$ kubectl exec test curl http://localhost:8080/system-info/
...

Information such as the system CPU and memory is now displayed in HTML format. You can use the html2text command to convert the information to text format.

Note: We are testing the application on a node with 2 CPU cores and 4 GB of memory. Testing in different environments may give slightly different results.

$ kubectl exec test curl http://localhost:8080/system-info/ | html2text

Java version     Oracle Corporation 1.8.0_162
Operating system Linux 4.9.64
Server           Apache Tomcat/9.0.6
Memory           Used 29 of 57 MB, Max 878 MB
Physica Memory   3951 MB
CPU Cores        2
                                          **** Memory MXBean ****
Heap Memory Usage     init = 65011712(63488K) used = 19873704(19407K) committed
                      = 65536000(64000K) max = 921174016(899584K)
Non-Heap Memory Usage init = 2555904(2496K) used = 32944912(32172K) committed =
                      33882112(33088K) max = -1(-1K)

As you can see, the system memory seen inside the container is 3,951 MB, but the maximum JVM heap size is 878 MB. Why is this happening? Didn't we set the container's resource capacity to 256 MB? In this situation, once the application's memory usage exceeds 256 MB, the JVM does not trigger garbage collection (GC); instead, the JVM process is simply killed by the system's OOM killer.

Root cause of the problem:

  1. If you do not set the JVM heap size, the maximum heap size will be set by default based on the memory size of the host environment.
  2. The Docker container uses cgroups to limit the resources used by the process. Therefore, if the JVM in the container is still using the default settings based on the memory and CPU cores of the host environment, this will result in incorrect JVM heap calculations.

Similarly, the JVM's default GC and JIT compiler thread counts are determined by the number of host CPU cores. If you run multiple Java applications on a single node, their GC threads can frequently preempt one another even when CPU limits are set, which can impact application performance.
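To see what defaults the JVM is actually working with, you can query the runtime directly. The short class below is a minimal sketch (not part of the sample app): on a JVM without container support, both values reflect the host, not the container's cgroup limits.

```java
public class JvmDefaults {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // On JVMs without container support, both values come from the
        // host environment, not from the container's cgroup limits.
        System.out.println("Available processors: " + rt.availableProcessors());
        System.out.println("Max heap (MB): " + rt.maxMemory() / (1024 * 1024));
    }
}
```

Running this inside the 256 MB container from the example would show the host's core count and a heap limit derived from the host's 4 GB of memory.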

Now that we know the root cause of this problem, it's easy to solve.

Solution

Enable cgroup resource awareness in the JVM

The Java community is also aware of this issue, and automatic detection of container resource limits is now supported in Java SE 8u131+ and JDK 9 (https://blogs.oracle.com/java-platform-group/java-se-support-for-docker-cpu-and-memory-limits).

To use this method, add the following parameters:

java -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap ...

Continuing from the previous Tomcat container example, add the environment variable "JAVA_OPTS".

apiVersion: v1
kind: Pod
metadata:
  name: cgrouptest
spec:
  initContainers:
  - image: registry.cn-hangzhou.aliyuncs.com/denverdino/system-info
    name: app
    imagePullPolicy: IfNotPresent
    command:
      - "cp"
      - "-r"
      - "/system-info"
      - "/app"
    volumeMounts:
    - mountPath: /app
      name: app-volume
  containers:
  - image: tomcat:9-jre8
    name: tomcat
    imagePullPolicy: IfNotPresent
    env:
    - name: JAVA_OPTS
      value: "-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap"
    volumeMounts:
    - mountPath: /usr/local/tomcat/webapps
      name: app-volume
    ports:
    - containerPort: 8080
    resources:
      requests:
        memory: "256Mi"
        cpu: "500m"
      limits:
        memory: "256Mi"
        cpu: "500m"
  volumes:
  - name: app-volume
    emptyDir: {}

Now deploy the new pod and repeat the test.

$ kubectl create -f cgroup_test.yaml
pod "cgrouptest" created

$ kubectl exec cgrouptest curl http://localhost:8080/system-info/ | html2text
Java version     Oracle Corporation 1.8.0_162
Operating system Linux 4.9.64
Server           Apache Tomcat/9.0.6
Memory           Used 23 of 44 MB, Max 112 MB
Physica Memory   3951 MB
CPU Cores        2
                                          **** Memory MXBean ****
Heap Memory Usage     init = 8388608(8192K) used = 25280928(24688K) committed =
                      46661632(45568K) max = 117440512(114688K)
Non-Heap Memory Usage init = 2555904(2496K) used = 31970840(31221K) committed =
                      32768000(32000K) max = -1(-1K)

As you can see, the maximum JVM heap size has changed to 112 MB, so the application is no longer killed by the OOM killer. But this raises another question: why is the maximum JVM heap only 112 MB when the container memory limit is set to 256 MB?

The answer lies in the details of JVM memory management. Memory consumption in the JVM includes both heap and non-heap memory. The memory required for class metadata, JIT-compiled code, thread stacks, GC, and so on is allocated from non-heap memory. Therefore, based on the cgroup resource limit, the JVM reserves part of the memory for non-heap use to ensure system stability. (In the previous example, you can see that after Tomcat starts, non-heap memory occupies nearly 32 MB.)
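The heap and non-heap figures shown in the output above come from the platform's MemoryMXBean. A minimal sketch of how such a report can be produced, using only the standard java.lang.management API:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class MemoryReport {
    public static void main(String[] args) {
        MemoryMXBean mxBean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = mxBean.getHeapMemoryUsage();
        MemoryUsage nonHeap = mxBean.getNonHeapMemoryUsage();
        // max can be -1 when it is undefined, which is common for non-heap.
        System.out.printf("Heap:     used=%dK max=%dK%n",
                heap.getUsed() / 1024, heap.getMax() / 1024);
        System.out.printf("Non-heap: used=%dK max=%dK%n",
                nonHeap.getUsed() / 1024, nonHeap.getMax() / 1024);
    }
}
```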

JDK 10 has been further optimized and enhanced for running the JVM inside containers.
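For reference, JDK 10 (with backports to later JDK 8 updates, starting at 8u191) replaces the experimental flag above with built-in container support, tunable as a percentage of the container limit rather than a fixed size. The flags below are a sketch of that newer configuration; verify their availability against your exact JDK version:

```shell
# JDK 10+ (backported to 8u191+): container detection is on by default
java -XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0 ...
```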

Awareness of cgroup resource limits in containers

If you can't use the new features of JDK 8u131+ and JDK 9 (for example, if you are still running an application on an older JDK 6), you can use a script inside the container to obtain the container's cgroup resource limits and set the JVM heap size accordingly.

Starting with Docker 1.7, the container's cgroup information is mounted inside the container, allowing applications to read settings such as memory and CPU limits from files like /sys/fs/cgroup/memory/memory.limit_in_bytes. Therefore, the application launch command in the container can set the correct resource options, such as -Xmx and -XX:ParallelGCThreads, based on the cgroup settings.
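As an illustration, an entrypoint script along these lines can derive the JVM options from the cgroup files. This is a sketch, with several assumptions: it uses the cgroup v1 paths mentioned above, the choice of giving the heap half of the limit is an arbitrary example ratio, and app.jar is a placeholder name.

```shell
#!/bin/sh
# Read the container memory limit (cgroup v1 path); fall back to 256 MiB
# if the file is not readable (e.g. when run outside a container).
limit_bytes=$(cat /sys/fs/cgroup/memory/memory.limit_in_bytes 2>/dev/null \
    || echo $((256 * 1024 * 1024)))

# Give the heap half of the limit (an example ratio), leaving the rest
# for metaspace, thread stacks, the JIT code cache, and other non-heap use.
heap_mb=$(( limit_bytes / 1024 / 1024 / 2 ))

# Derive the effective CPU count from the CFS quota; -1 means "no limit".
quota=$(cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us 2>/dev/null || echo -1)
period=$(cat /sys/fs/cgroup/cpu/cpu.cfs_period_us 2>/dev/null || echo 100000)
if [ "$quota" -gt 0 ]; then
    cores=$(( (quota + period - 1) / period ))
else
    cores=$(nproc 2>/dev/null || echo 1)
fi

echo "Computed options: -Xmx${heap_mb}m -XX:ParallelGCThreads=${cores}"
# exec java -Xmx${heap_mb}m -XX:ParallelGCThreads=${cores} -jar app.jar
```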

Conclusion

In this article, we looked at a common heap configuration issue that occurs when running Java applications in containers. Unlike virtual machines, containers implement resource limits using cgroups. If a process inside the container is unaware of the cgroup limits, memory and CPU allocation can cause resource contention and other problems.

This issue can be resolved quite easily by using new JVM features or a custom script to set resource limits correctly. These solutions address most resource limitation issues.

However, these solutions leave one resource limitation issue affecting container applications unsolved. Some older monitoring tools and system commands, such as "free" and "top", still see the host's CPU and memory settings when run inside a container, so such tools cannot report resource consumption accurately. A common solution proposed by the community is to use LXCFS to keep the container's view of its resources consistent with that of a virtual machine. Subsequent articles will discuss using this method on Kubernetes.

Alibaba Cloud Kubernetes Service (https://www.alibabacloud.com/en/product/kubernetes) is the first Kubernetes-certified service. It simplifies lifecycle management for Kubernetes clusters and provides embedded integration into Alibaba Cloud products. In addition, the service further optimizes the Kubernetes developer experience, freeing users to focus on the value of cloud applications and further innovation.
