[Java] Try Health Check on Azure App Service.

Introduction

Hello.

Are you using App Service?

App Service is a typical PaaS: you only need to deploy a web application to run it. Once you operate it for real, however, recoverability becomes a concern. In a freshly deployed App Service, health checking of the application is not enabled, so even if the application stops responding normally, the internal load balancer keeps routing requests to the unhealthy instance. When that happens, it can cause trouble such as problems in service health monitoring and a particular user’s requests being routed to the unhealthy instance over and over.

So this time I will try out “Health Check”. It is still a preview feature, but you can try it right away in any Azure environment you can deploy to, so I will verify how it behaves.

Documentation here: https://github.com/projectkudu/kudu/wiki/Health-Check-(Preview)

Preparing for Health Check

Health Check is a feature that checks HTTP health for each instance of the application and takes an instance out of rotation if it does not respond correctly. The detection threshold cannot currently be configured: if the HTTP ping fails 5 times, the instance is excluded from request distribution and the unhealthy instance is restarted automatically. A ping counts as successful when a GET request to a specific URI receives a response with a 2xx status code within 2 minutes.
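The rule described above can be sketched as a small model. This is only an illustration, not App Service’s actual implementation: the class name `PingTracker` and its methods are hypothetical, and the sketch assumes that the 5 failures must be consecutive and that a success resets the count, which the documentation quoted here does not state.

```java
/** Hypothetical model of the ping rule described in the text; not App Service code. */
final class PingTracker {
    // Values taken from the article: 5 failed pings, 2-minute response window.
    static final int FAILURE_LIMIT = 5;
    static final long TIMEOUT_MILLIS = 2 * 60 * 1000L;

    private int consecutiveFailures = 0;

    /** A ping succeeds when a 2xx status arrives within the timeout. */
    static boolean isSuccess(int statusCode, long elapsedMillis) {
        return statusCode >= 200 && statusCode < 300 && elapsedMillis <= TIMEOUT_MILLIS;
    }

    /** Records one ping result and reports whether the instance should now be removed. */
    boolean record(int statusCode, long elapsedMillis) {
        if (isSuccess(statusCode, elapsedMillis)) {
            consecutiveFailures = 0; // assumption: a success resets the count
        } else {
            consecutiveFailures++;
        }
        return consecutiveFailures >= FAILURE_LIMIT;
    }
}
```

Under these assumptions, an instance that keeps returning 500 is flagged on the fifth ping, while a single 200 in between starts the count again.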

To use this feature, you need to edit the target App Service resource directly in Azure Resource Explorer (https://resources.azure.com/). (For now, using a preview tool to turn on a preview feature is part of the fun.)

Open Resource Explorer and expand the tree in this order: “subscriptions” -> target subscription -> “resourceGroups” -> target resource group -> “providers” -> “Microsoft.Web” -> “sites” -> target App Service -> “config”. Switch the config resource to Edit mode and rewrite “healthCheckPath”. It is null by default; defining any path here enables the Health Check feature. This time, I used /status.
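For orientation, the edited part of the config resource would look roughly like the fragment below. Only the “healthCheckPath” name and the /status value come from this article; the enclosing “properties” object is how Resource Explorer usually presents resource settings, and all other properties are omitted here, so treat the exact shape as an assumption and compare it with what your own resource shows.

```json
{
  "properties": {
    "healthCheckPath": "/status"
  }
}
```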

Next, implement the endpoint for /status on the application side. It is a REST API that checks whether the application is working normally and returns a 200 status code if there is no problem. The details differ for each app, but typically you verify connectivity to the external services the app uses (Storage Blob, SQL Database, etc.) and return OK if they are healthy. Since this is only a verification, I implemented a Spring Boot app that lets you set, via PUT, the status code that /status returns. The sample code is as follows.
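For a real application, the “check the external services, then return OK” idea might be sketched as follows. This is a framework-free illustration under stated assumptions: `DependencyHealth`, `register`, and `statusCode` are hypothetical names, each check is a caller-supplied function standing in for a real probe of Storage Blob or SQL Database, and returning 503 on failure is one common choice rather than something Health Check requires.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

/** Hypothetical aggregator: healthy only if every registered dependency check passes. */
final class DependencyHealth {
    private final Map<String, BooleanSupplier> checks = new LinkedHashMap<>();

    /** Registers a named check, e.g. a lightweight query against a database. */
    DependencyHealth register(String name, BooleanSupplier check) {
        checks.put(name, check);
        return this;
    }

    /** Returns 200 if every check passes, otherwise 503 (an assumed choice). */
    int statusCode() {
        for (BooleanSupplier check : checks.values()) {
            if (!passes(check)) {
                return 503;
            }
        }
        return 200;
    }

    /** Treats an exception thrown by a check as a failed check. */
    private static boolean passes(BooleanSupplier check) {
        try {
            return check.getAsBoolean();
        } catch (RuntimeException e) {
            return false;
        }
    }
}
```

A /status handler could then return `statusCode()` as its HTTP status, so a broken dependency turns into a non-2xx response that Health Check can detect.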

StatusController.java


package com.example.springboot;

import java.net.InetAddress;

import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PutMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping(value = "/status")
public class StatusController {
    // Status that GET /status returns; shared by every request to this instance.
    private static volatile HttpStatus status = HttpStatus.OK;

    // Returns "<instance IP>/<reason phrase>" with the currently configured status code.
    @GetMapping
    public ResponseEntity<String> get() {
        return new ResponseEntity<String>(this.getLocalhost() + "/" + status.getReasonPhrase(), status);
    }

    // Switches the status code returned by /status, e.g. PUT /status?code=500.
    @PutMapping
    public ResponseEntity<String> put(@RequestParam(name = "code", defaultValue = "200") Integer code) {
        switch (code.intValue()) {
            case 200:
                status = HttpStatus.OK;
                break;
            case 500:
                status = HttpStatus.INTERNAL_SERVER_ERROR;
                break;
            case 400:
                status = HttpStatus.BAD_REQUEST;
                break;
            case 502:
                status = HttpStatus.BAD_GATEWAY;
                break;
            default:
                // Any unsupported code falls back to 200.
                status = HttpStatus.OK;
                break;
        }
        return new ResponseEntity<String>(this.getLocalhost() + "/" + status.getReasonPhrase(), status);
    }

    // Internal IP address of this instance, or "0.0.0.0" if it cannot be resolved.
    private String getLocalhost() {
        try {
            return InetAddress.getLocalHost().getHostAddress();
        } catch (Exception e) {
            e.printStackTrace();
            return "0.0.0.0";
        }
    }
}

Operation check

After deploying the app to App Service, access it with Chrome. It is a bare-bones application, but it displays an IP address (172.16.1.5). This is the internal IP of the App Service instance, so it identifies which instance this session is connected to.

Next I accessed it from Edge. Only the address differs from the one shown in Chrome. App Service’s internal load balancer is sticky, so even with multiple backend instances a client basically connects to the same instance each time. (Incidentally, the load balancer’s distribution weighting seems to be related to the amount of traffic: after about 10 reloads, Chrome connected to a different instance.)

Now use Postman to access /status. I confirmed that it returns a 200 status code. This request is connected to the 172.16.1.5 instance. Next, change the response from /status so that it returns a 500 status code. Because this verification app lets me set the HTTP status code returned from /status dynamically, I expect App Service to detect the application as unhealthy, remove the instance from rotation, and restart it.
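The Postman steps above could also be scripted. The following is a minimal sketch using the JDK’s `java.net.http` API (Java 11 or later); the class `StatusRequests` and the `baseUrl` parameter are hypothetical, and the sketch only builds the requests rather than sending them.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper that builds the two requests used in this verification. */
final class StatusRequests {
    /** Builds GET {baseUrl}/status to read the current health response. */
    static HttpRequest check(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/status")).GET().build();
    }

    /** Builds PUT {baseUrl}/status?code={code} to change what /status returns. */
    static HttpRequest setCode(String baseUrl, int code) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/status?code=" + code))
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
    }
}
```

To send a request, pass it to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. Because the load balancer is sticky per client, a scripted client may land on a different instance than the browser, so it is worth checking the IP address in the response body.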

Wait about 10 minutes and reload Chrome. The IP has changed to 172.16.1.2, so it looks like the instance was restarted. Just to be sure, I checked with Postman as well. The destination has likewise changed to 172.16.1.2. This confirms that Health Check kicked in: the instance judged unhealthy was removed from rotation and restarted.

I also checked on the Azure portal side. The orange line is the instance whose response to the HTTP ping I deliberately made abnormal. The metrics show that this instance was not responding from 14:59 to 15:14. After that, service was restored because the instance was restarted automatically.

Health Check requires an implementation on the application side, but it is an effective way to improve the resilience of a service. This time I only confirmed the basic behavior, so I would like to take some time to examine the detailed behavior later.

That’s all for trying Health Check with App Service.