Even today, many web applications still run on Tomcat. (That is not a criticism: it is used in plenty of services, keeps receiving version upgrades, and has a wealth of information available, so I think it is an excellent application server.)
When a development vendor builds the environment, deploys a Java-based system to the server, and the system runs for a while, performance problems inevitably crop up sooner or later.
For example: logs are written out, but because nobody thought about rotation or a deletion cycle, they eat up storage until one day the server suddenly goes down; or heap memory usage creeps up over time until Tomcat dies with an OutOfMemoryError, or, even if it does not die, response times degrade and trouble follows; and so on.
If storage turns out to be under pressure, the usual pattern is to delete files judged unnecessary as a **provisional fix**, and the **permanent fix** gets put off until later.
If system response is poor, a quick look suggests memory is the cause, and the business impact of not resolving it immediately is large, Tomcat gets restarted as a **provisional fix**, while the root-cause investigation that should lead to a **permanent fix** is deferred with the development vendor's magic words, "If it reproduces again, we will investigate then", and the **permanent fix** never comes. (This is personal experience and certainly does not apply everywhere.)
I have grumbled on at some length, but I do understand that memory usage often cannot be analyzed until a problem actually occurs, and that reproduction is difficult because conditions differ between the production environment and the development environment the vendor has on hand. The reality is also that maintenance contracts tend to be pared down on the user company's side, so it is hardly a situation where time-consuming investigation work can be demanded.
In that case, the user company should be able to investigate on its own. With that in mind, I am writing this article to summarize my own methods of investigation and analysis.
In this article I investigate using various commands, and some of them require knowing Tomcat's PID in advance. There are several ways to find the PID; you can check it with the commands below.
If Tomcat is the only Java process, you can find it with the pidof command.
# pidof java
3344
If you are not sure whether Tomcat is the only Java process, use jcmd to list the Java processes and identify the Tomcat PID.
# jcmd -l
3344 org.apache.catalina.startup.Bootstrap start ← This is Tomcat PID
4894 jdk.jcmd/sun.tools.jcmd.JCmd -l
Pass Tomcat's PID to the top command to check its memory and other resource usage.
# top -p 3344
top - 11:22:30 up 23 min, 1 user, load average: 0.00, 0.00, 0.00
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 3939.0 total, 3295.9 free, 273.2 used, 369.9 buff/cache
MiB Swap: 2048.0 total, 2048.0 free, 0.0 used. 3445.0 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3344 root 20 0 4204172 190864 28536 S 0.0 4.7 0:03.04 java
Tomcat's startup options (JVM startup options) may include -Xms, -Xmx, and so on, which set the memory sizes Tomcat uses.
They can be set in several ways:
- export the CATALINA_OPTS environment variable in the tomcat user's ~/.bash_profile
- specify them in the tomcat.service startup script
- specify them in {tomcat-root}/bin/setenv.sh (a minimal sketch follows this list)
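As a minimal sketch of the setenv.sh approach (the heap sizes are example values only, not a recommendation; catalina.sh reads this file at startup if it exists):
{tomcat-root}/bin/setenv.sh
#!/bin/sh
# Example heap settings picked up by catalina.sh when Tomcat starts
CATALINA_OPTS="-Xms512M -Xmx1024M"
Later in this article I set JAVA_OPTS in the same file; for heap settings either variable works, the difference being that CATALINA_OPTS is only applied to the start/run commands.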
If there are no environment-construction procedures or materials for Tomcat, no Ansible scripts or the like, it can be a hassle to figure out which options Tomcat is actually started with. (It is especially troublesome when, as a user company, you have left environment construction, development, releases, and server care to the development vendor.)
In such a case, you can at least use jcmd to check which JVM options the currently running Tomcat is using.
You will need the Tomcat PID beforehand.
# jcmd -l
899 org.apache.catalina.startup.Bootstrap start
4894 jdk.jcmd/sun.tools.jcmd.JCmd -l
# jcmd 899 VM.flags
899:
-XX:CICompilerCount=2 -XX:InitialHeapSize=62914560 -XX:MaxHeapSize=991952896 -XX:MaxNewSize=330301440 -XX:MinHeapDeltaBytes=524288 -XX:NewSize=20971520 -XX:OldSize=41943040 -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseFastUnorderedTimeStamps -XX:+UseParallelGC
You can check how parameters are set, such as -XX:MaxHeapSize, which corresponds to -Xmx.
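Incidentally, jcmd can also show the command line the JVM was actually launched with, which should expose any -Xms/-Xmx passed explicitly at startup; a quick check might look like this (same PID as above):
# jcmd 899 VM.command_line
The output should list the jvm_args and the java_command, so you can see the raw startup options rather than the resolved -XX values.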
A heap dump is something like a snapshot of Java's memory usage. By taking one and analyzing it, you can get a picture of how Java is using memory. Here, the heap dump is taken with jcmd, which is available from Java 7 onward.
# jcmd -l
899 org.apache.catalina.startup.Bootstrap start
4894 jdk.jcmd/sun.tools.jcmd.JCmd -l
# jcmd 899 GC.heap_dump /var/log/heapdump/heapdmp.hprof
899:
Heap dump file created
# ls
heapdmp.hprof
The heap dump taken with jcmd is inspected with a tool called Java VisualVM. The analysis is done on a local PC (an environment with a GUI), and a JRE must be installed there in advance.
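The dump first has to be brought over from the server to that local PC; any file transfer method will do. As a sketch (the user name and destination path are placeholders):
$ scp root@192.168.33.10:/var/log/heapdump/heapdmp.hprof ./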
In my environment, Java 1.8 was set up and the path was in place.
$ java -version
java version "1.8.0_241"
Java(TM) SE Runtime Environment (build 1.8.0_241-b07)
Java HotSpot(TM) 64-Bit Server VM (build 25.241-b07, mixed mode)
In this environment, run the following command to start Java VisualVM.
$ jvisualvm
From Open File, specify and open the acquired heap dump file.
The heap dump does seem to let you check whether unnecessary objects are being created. What I had hoped for, though, was to see memory usage over time and how memory is released by GC on a graph. So I felt it would be difficult, from this heap dump alone, to grasp the memory usage situation, pin down the cause, and use that as the trigger for a permanent fix.
A tool called JConsole apparently lets you analyze memory usage and more in graphical form. https://docs.oracle.com/javase/jp/8/docs/technotes/guides/management/jconsole.html
Here, the JVM of Tomcat built on a CentOS server is monitored from the local PC with JConsole via the JMX agent.
There are various ways to specify Tomcat's JVM startup options, but here I add the settings via setenv.sh.
{tomcat-root}/bin/setenv.sh
#!/bin/sh
JAVA_OPTS="-server -Xms512M -Xmx2G -Dspring.profiles.active=production -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=8888 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Djava.rmi.server.hostname=192.168.33.10"
-Dcom.sun.management.jmxremote.port specifies the port used by the JMX agent.
Since Tomcat itself uses port 8080, I picked a different port.
Also, to access the Tomcat server remotely from the local PC and monitor the JVM, the -Djava.rmi.server.hostname parameter is specified.
Without this parameter, when I tried to connect to 192.168.33.10:8888 from JConsole I got a message along the lines of "Secure connection failed. Do you want to retry insecurely?", and retrying did not help; I simply could not connect.
So be careful if you need remote access.
After making the above settings and restarting Tomcat, check that the specified port 8888 is actually being listened on.
# systemctl restart tomcat
# ss -antup | grep 8888
tcp LISTEN 0 50 *:8888 *:* users:(("java",pid=4211,fd=17))
# ps -ef | grep jmx
Start the jconsole tool with the following command and check the memory usage. The command is executed from the local PC. Note that jconsole cannot be run unless the JDK is installed in advance and its bin directory is on the PATH.
jconsole 192.168.33.10:8888
The Overview tab shows graphs of how memory, threads, classes, CPU usage, and so on change over time. The Overview tab alone seems to provide enough information for analysis, but let's also look at what the other tabs contain.
GCViewer https://github.com/chewiebug/GCViewer/wiki/Changelog
Analysis with JConsole could confirm how many times GC had occurred, but not when it occurred. So I collect the GC log and use GCViewer to check how often GC occurs and how much memory each GC frees, alongside the memory usage.
To get the GC log, set the Tomcat JVM option as follows:
{tomcat-root}/bin/setenv.sh
JAVA_OPTS="-server -Xms512M -Xmx2G -Dspring.profiles.active=production -verbose:gc -Xlog:gc*=info:file=/var/log/gclog/gc.log:time,uptime,level,tags:filecount=1,filesize=10m"
After completing the above settings, restart the tomcat service and let it run for a while. (Let the web app on Tomcat do some work during that time.)
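GCViewer itself is distributed as an executable jar, so the collected log can be opened from the local PC roughly like this (the jar file name depends on the version you downloaded; 1.36 here is just an example):
$ java -jar gcviewer-1.36.jar gc.log
You can also start it without arguments and open the log from the GUI.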
When I display the acquired log with gcviewer, it looks like the following.
Quite a lot is displayed, and at first it is not obvious which color means what. You can show/hide each element from the View menu.
What each one means is briefly explained below.
- Full GC Lines: (black)
  - Lines representing Full GC
- Inc GC Lines: (cyan)
  - Lines representing incremental GC
- GC Times Line: (green)
  - A line showing the time taken by each GC
- Total Heap
  - The total heap size set with -Xms and -Xmx; you can also display and check only this Total Heap.
- Tenured Generation: (pink)
  - The heap is broken down into a Young Generation and a Tenured Generation.
  - Also called the Young area and the Old area; this represents the memory size corresponding to the Old area.
  - Used when analyzing memory with the Young Generation / Tenured Generation breakdown instead of Total Heap.
  - Not available unless the JVM parameter -XX:+PrintGCDetails is specified; on Java 9 or later, -Xlog:gc* must be specified instead.
- Young Generation: (orange)
  - The heap is broken down into a Young Generation and a Tenured Generation.
  - Also called the Young area and the Old area; this represents the memory size corresponding to the Young area.
  - Used when analyzing memory with the Young Generation / Tenured Generation breakdown instead of Total Heap.
  - Not available unless the JVM parameter -XX:+PrintGCDetails is specified; on Java 9 or later, -Xlog:gc* must be specified instead.
- Used Heap: (blue)
  - The blue line represents the used heap size.
- Used Tenured Size: (dark pink)
  - Represents the used size of the Old area. (It is really hard to see.)
- Used Young Size: (gray)
  - Represents the used size of the Young area. (It is really hard to see.)

If the heap keeps hitting the size specified with -Xmx, you can see Full GC occurring frequently. You can also see that, as heap usage comes down after a Full GC, the frequency of Full GC tends to decrease.
However, there were also periods when the heap stayed pinned at the upper limit without Full GC occurring frequently.
Instead of Total Heap, I display its breakdown, the Young area (Young Generation: orange) and the Old area (Tenured Generation: pink), and analyze memory usage in combination with Full GC.
When I googled for the JVM options to collect a GC log, the following options came up, so I tried them.
{tomcat-root}/bin/setenv.sh
JAVA_OPTS="-server -Xms512M -Xmx2G -Dspring.profiles.active=production -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/gclog/gc.log"
After applying these settings and restarting the tomcat service, the web application running on Tomcat stopped working. While investigating why, I found the following error.
{tomcat-root}/logs/catalina.out
Unrecognized VM option 'PrintGCDateStamps'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
NOTE: Picked up JDK_JAVA_OPTIONS: --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.rmi/sun.rmi.transport=ALL-UNNAM
Apparently some JVM parameters are no longer available in Java 9 or later (and the JVM then refuses to start). So any parameter that is no longer available has to be replaced with another.
https://docs.oracle.com/javase/9/tools/java.htm#JSWOR624
Information on the replacement parameters can be found in the "Convert GC Logging Flags to Xlog" section of that page.
However, not every parameter maps one-to-one, which is a bit confusing.
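For reference, the rough correspondence between what I first tried and what works on Java 9 or later looks like this; the time,uptime,level,tags decorators are my own approximation of the old date stamps and detail level, not an exact equivalent, and other application options are omitted:
# Java 8 and earlier (refuses to start on Java 9+)
JAVA_OPTS="-server -Xms512M -Xmx2G -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/gclog/gc.log"
# Java 9 and later: unified logging via -Xlog
JAVA_OPTS="-server -Xms512M -Xmx2G -verbose:gc -Xlog:gc*=info:file=/var/log/gclog/gc.log:time,uptime,level,tags:filecount=1,filesize=10m"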
Looking at the graph below, memory usage sits at around 1.4 GB for a while without dropping.
Looking at the thread state around 20:41, when memory usage suddenly jumped, the number of threads peaked at about the same time. However, since the thread count fell back to its floor within a few minutes, my guess is that a temporary burst of concurrent access occurred.
The fact that heap usage does not improve at all even after the thread count drops suggests there may be objects that are not released even when GC runs.
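One quick way to see which classes such lingering objects belong to (a possible next step, not something I did above) is jcmd's class histogram, run against the same Tomcat PID:
# jcmd 923 GC.class_histogram | head -n 20
The histogram lists instance counts and total bytes per class; taking it twice with some time in between and comparing the two can hint at which objects keep accumulating instead of being collected.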
Checking the server's memory and other usage at this point with the top command looks like the following.
# top -p 923
top - 11:54:02 up 1:01, 1 user, load average: 0.00, 0.03, 0.05
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.0 sy, 0.0 ni, 99.3 id, 0.0 wa, 0.3 hi, 0.3 si, 0.0 st
MiB Mem : 3939.0 total, 1342.6 free, 1657.3 used, 939.1 buff/cache
MiB Swap: 2048.0 total, 2048.0 free, 0.0 used. 2060.3 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
923 root 20 0 4278196 1.5g 31744 S 0.0 39.7 1:09.33 java
- VIRT: virtual image. All virtual memory reserved by the process, including memory that has been swapped out. (Unit: KiB)
  - Since this is about 4 GB, you can see that Tomcat, which was started with -Xmx set to 2 GB, went through processing that reserved up to roughly 4 GB of memory, swap included.
- RES: the amount of physical memory in use, excluding what has been swapped out.
  - About 1.5 GB of physical memory was used without actually swapping.
If this is just a temporary spike, there is no need to revisit the -Xmx parameter, but if it happens frequently, you may need to raise the maximum heap size.
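If an increase does turn out to be necessary, the change itself is just the -Xmx value in whichever startup configuration is in use. A sketch against the setenv.sh used earlier (3G is only a placeholder value, and the server's physical memory has to allow for it):
{tomcat-root}/bin/setenv.sh
JAVA_OPTS="-server -Xms512M -Xmx3G -Dspring.profiles.active=production -verbose:gc -Xlog:gc*=info:file=/var/log/gclog/gc.log:time,uptime,level,tags:filecount=1,filesize=10m"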
Here is a graph showing memory usage improving all at once just before 21:00.
Result of top command in this state
# top -p 923
top - 12:04:03 up 1:11, 1 user, load average: 0.09, 0.04, 0.01
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 3939.0 total, 1340.3 free, 1659.6 used, 939.1 buff/cache
MiB Swap: 2048.0 total, 2048.0 free, 0.0 used. 2058.0 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
923 root 20 0 4278196 1.5g 31744 S 0.3 39.8 1:12.06 java
Notably, the values of VIRT, RES, SHR, and so on did not change at all. From this you can see that the top command only gives you a snapshot in which the memory stays at the maximum it reached during the temporary spike.