This article is a post for the Microsoft Azure Tech Advent Calendar 2020.
Troubleshooting is an inseparable part of operating a website, so it pays to build a site that is easy to troubleshoot, with operations in mind. In Web Apps, launching Kudu from the "Advanced Tools" blade lets you run commands from the web console and open Process Explorer. Making good use of these lets you operate a site that is easy to troubleshoot.
In this article, I introduce some useful troubleshooting preparations and methods for operating Java Web Apps. The target is the Windows environment of Java Web Apps, but I hope the methods are still useful to you, since they can also be applied to the Linux environment.
First, let's create a site by following the Java app creation procedure in the quickstart of the official documentation below.
Quick Start: Create a Java App in Azure App Service https://docs.microsoft.com/ja-jp/azure/app-service/quickstart-java?tabs=javase&pivots=platform-windows
Clone the repository with the following commands, then select Windows, Java 11, and Java SE in the config goal of azure-webapp-maven-plugin.
git clone https://github.com/spring-guides/gs-spring-boot
cd gs-spring-boot/complete
.\mvnw com.microsoft.azure:azure-webapp-maven-plugin:1.12.0:config
Build and deploy to Web Apps with the following commands.
az login
.\mvnw package azure-webapp:deploy
Go to the website.
The site is ready with just this much effort.
If the stack trace appears in the application log, it is relatively easy to analyze because you can narrow down the cause of the error.
On the other hand, when performance temporarily degrades or an Out of Memory (OoM) error occurs, it is extremely difficult to identify the cause without information from the time the event occurred. Other programming languages probably offer something similar, but Java can keep profiling data in memory and write it to a file manually or at exit (JFR: Java Flight Recorder), and it can dump the heap when an Out of Memory error occurs. With these, you can get information from the time the event occurred. Note that profiling data stored in memory is lost once the specified memory size limit is exceeded.
You can add Java startup arguments through JAVA_OPTS in the Web Apps application settings. The following JAVA_OPTS value enables JFR and takes a heap dump when an OoM occurs.
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=C:\home\LogFiles\catalina.hprof -XX:StartFlightRecording=dumponexit=true
Start Kudu from the "Advanced Tools" blade and run jcmd in the Debug Console, as shown in the screenshot below, to check that the startup arguments above are actually set.
You can use jcmd to get various information about the Java process, which makes it handy for troubleshooting.
You can also find the PID of the Java process from Process Explorer.
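The same check can also be made from inside the application. The following is a minimal sketch and not part of the original sample: the class name JvmFlagCheck and the helper hasFlag are hypothetical, and it assumes the JAVA_OPTS values reach the JVM as ordinary startup arguments, which is what the standard RuntimeMXBean.getInputArguments() call reports.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: checks whether the JVM was started with the expected flags.
final class JvmFlagCheck {

    /** Returns true if any JVM argument starts with the given prefix. */
    static boolean hasFlag(List<String> jvmArgs, String prefix) {
        return jvmArgs.stream().anyMatch(arg -> arg.startsWith(prefix));
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not to main), such as -XX options.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();

        System.out.println("Heap dump on OoM enabled: "
                + hasFlag(jvmArgs, "-XX:+HeapDumpOnOutOfMemoryError"));
        System.out.println("Heap dump path set: "
                + hasFlag(jvmArgs, "-XX:HeapDumpPath="));
        System.out.println("Flight recording requested: "
                + hasFlag(jvmArgs, "-XX:StartFlightRecording"));
    }
}
```

Logging these three values once at startup would make a missing setting visible in the application log without opening Kudu.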
Now that the settings are in place, let's deliberately cause trouble and analyze it.
Add the following class to the sample app and deploy it again.
If you keep accessing /oom, an OoM occurs as intended.
package com.example.springboot;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class OoMController {

    // Entries are never removed, so every request leaks memory on purpose.
    private static final List<String> POOL = new ArrayList<>();

    @RequestMapping("/oom")
    public String feed() {
        // A char[] with 200 * 1024 * 1024 elements takes about 400 MB.
        char[] tmp = new char[200 * 1024 * 1024];
        Arrays.fill(tmp, 'a');
        POOL.add(new String(tmp));
        return "Feed Me!";
    }
}
Thanks to the settings above, the heap dump taken at the time of the OoM exists at C:\home\LogFiles\catalina.hprof, as shown below.
Press the download button for catalina.hprof to get the heap dump file.
Let's analyze the heap dump using a tool such as Zulu Mission Control or VisualVM. I like VisualVM, so I will analyze it with VisualVM.
If you open catalina.hprof and look at Dominators by Retained Size, you'll see a class that is obviously consuming a lot of heap. Right-click it and choose Open in New Tab.
As the image below shows, you can see the fields of the ArrayList instance that is using a lot of heap and the class (OoMController) that references it.
Now that you've identified the problem, it's time to modify the source to reduce memory usage.
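What the fix looks like depends on the application. As one illustration only, here is a minimal sketch of replacing an unbounded list with a bounded pool that evicts its oldest entry. The class name BoundedPool and its capacity parameter are hypothetical and do not appear in the sample app.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical replacement for an ever-growing list: keeps at most `capacity` entries.
final class BoundedPool {

    private final int capacity;
    private final Deque<String> entries = new ArrayDeque<>();

    BoundedPool(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Adds a value, dropping the oldest entry first when the pool is full. */
    synchronized void add(String value) {
        if (entries.size() == capacity) {
            entries.removeFirst();
        }
        entries.addLast(value);
    }

    synchronized int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        BoundedPool pool = new BoundedPool(3);
        for (int i = 0; i < 10; i++) {
            pool.add("payload-" + i);
        }
        System.out.println(pool.size()); // prints 3
    }
}
```

Because evicted entries are no longer referenced, the garbage collector can reclaim them, so the retained size stays proportional to the capacity rather than to the number of requests.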
Even when an OoM occurs like this, a heap dump lets you troubleshoot heap problems smoothly.
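The automatic dump is written only when an OoM actually occurs. If you want to look at the heap earlier, one option is to trigger a dump on demand. This is a sketch under assumptions that go beyond the article: it assumes a HotSpot-based JVM that exposes com.sun.management.HotSpotDiagnosticMXBean, and the class name HeapDumpSketch and the "manual-" file name prefix are hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: writes a heap dump of the current JVM on demand.
final class HeapDumpSketch {

    /** Dumps live objects into a new .hprof file under the given directory. */
    static Path dumpHeap(Path directory) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // dumpHeap refuses to overwrite an existing file, so use a fresh name.
        Path target = directory.resolve("manual-" + System.currentTimeMillis() + ".hprof");
        try {
            // true = dump only objects that are still reachable.
            bean.dumpHeap(target.toString(), true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }

    public static void main(String[] args) {
        Path dir = Paths.get(System.getProperty("java.io.tmpdir"));
        System.out.println("Heap dump written to " + dumpHeap(dir));
    }
}
```

On Web Apps you would presumably pass a directory such as the LogFiles folder used above so that the file can be downloaded from Kudu in the same way.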
If performance falls short of what you expect, you can use JFR to find the bottleneck.
Add the following class to the sample app and force the problem, as you did with OoM.
Accessing /sleep is unpleasantly slow to respond.
package com.example.springboot;

import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class SleepController {

    @RequestMapping("/sleep")
    public String sleep() {
        try {
            // sleep is a static method, so call it on Thread directly.
            Thread.sleep(10 * 1000);
        } catch (InterruptedException ex) {
            // Restore the interrupt flag so the container can still observe it.
            Thread.currentThread().interrupt();
        }
        return "I'm not lazy..";
    }
}
Dump the in-memory JFR data from Kudu with the following command, and download the JFR file as you did for the heap dump.
Let's analyze the profiling data with VisualVM.
After opening the JFR file, display the Sampler tab and press the CPU button.
Use the thread icon to filter for the Tomcat threads, whose names look like http-nio-
In this way, JFR data can be used to find the processing that is the bottleneck.
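JFR data can also be captured from inside the application with the jdk.jfr API that ships with JDK 11. The following is a minimal sketch, not the procedure used in this article: the class name JfrDumpSketch and the method recordToTempFile are hypothetical, and it assumes the JDK's built-in "default" recording configuration is available.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.text.ParseException;
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

// Hypothetical helper: records a workload with JFR and dumps the data to a file.
final class JfrDumpSketch {

    /** Runs the workload under a JFR recording and returns the dumped .jfr file. */
    static Path recordToTempFile(Runnable workload) {
        try {
            // "default" is one of the predefined configurations bundled with the JDK.
            Configuration config = Configuration.getConfiguration("default");
            Path target = Files.createTempFile("sketch-", ".jfr");
            try (Recording recording = new Recording(config)) {
                recording.start();
                workload.run();
                recording.stop();
                recording.dump(target);
            }
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ParseException e) {
            throw new IllegalStateException("Could not parse the JFR configuration", e);
        }
    }

    public static void main(String[] args) {
        Path file = recordToTempFile(() -> {
            try {
                Thread.sleep(1000); // stand-in for the slow request
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("JFR data written to " + file);
    }
}
```

The resulting file can be opened in VisualVM like any other JFR file. Recording only around a suspicious code path keeps the file small, at the cost of missing anything that happens outside it.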
Flight Recorder data and heap dumps come in handy when your application is slow or suddenly goes down. This was only a simple sample, but I'd be happy if the troubleshooting preparations and methods introduced here help you in a serious situation.