A summary of Chapter 2 of the O'Reilly Japan book *Java Performance*.
- Chapter 1: Introduction - Qiita (previous article)
- Chapter 2: Performance Testing Approach - Qiita (this article)
- Chapter 3: Java Performance Toolbox - Qiita (next article)
- Chapter 4: How the JIT Compiler Works - Qiita
- Chapter 5: Basics of Garbage Collection - Qiita
There are three types of performance tests: microbenchmarks, mesobenchmarks, and macrobenchmarks.
A microbenchmark, as the name suggests, tests a small unit of code and is used to compare small implementation differences. Care must be taken when writing the test code: unless the code actually uses the result of the computation, the compiler's optimizer will eliminate the computation as dead code.
Whether the microbenchmark is single-threaded or multithreaded, it is important to declare the result variable volatile, so that the value is not cached and the computation is not optimized away.
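The two pitfalls above can be sketched in a minimal hand-rolled microbenchmark (the class and method names here are illustrative, not from the book): the result of each iteration is stored into a volatile field, so the JIT cannot prove the value is unused and eliminate the loop body as dead code.

```java
// Minimal microbenchmark sketch. Writing every result to a volatile field
// prevents the JIT compiler from removing the computation as dead code.
public class FibBenchmark {
    // volatile sink: the compiler must assume another thread may read it
    public static volatile long sink;

    // Iterative Fibonacci: the workload being measured
    public static long fib(int n) {
        long a = 0, b = 1;
        for (int i = 0; i < n; i++) {
            long t = a + b;
            a = b;
            b = t;
        }
        return a;
    }

    public static void main(String[] args) {
        int loops = 1_000_000;
        long start = System.nanoTime();
        for (int i = 0; i < loops; i++) {
            sink = fib(50); // store the result so the call cannot be elided
        }
        long elapsed = System.nanoTime() - start;
        System.out.println("Elapsed ns: " + elapsed);
    }
}
```

Without the `sink` store, a sufficiently aggressive compiler could reduce the timed loop to nothing and the benchmark would report an impossibly fast time.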
Microbenchmarks often have a synchronization bottleneck that would rarely be a problem in real operation, and resolving it takes time.
Reference material on volatile: Declaring a Variable Volatile (Writing Device Drivers)
When a benchmark calls a synchronized method from multiple threads, synchronization becomes the bottleneck; note that what you are then measuring is not the benchmarked code but the time the JVM spends resolving lock contention.
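This effect can be demonstrated with a small sketch (class and method names are illustrative, not from the book): the same synchronized method is timed with one thread and then with many, and the multithreaded run is dominated by monitor contention rather than by the trivial work inside the method.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// With many threads, the elapsed time mostly reflects the JVM resolving
// lock contention on the shared monitor, not the increment itself.
public class ContentionBenchmark {
    private long counter;

    // Every caller serializes on the same monitor.
    public synchronized void increment() {
        counter++;
    }

    public synchronized long get() {
        return counter;
    }

    public static long run(int threads, int callsPerThread) {
        ContentionBenchmark bench = new ContentionBenchmark();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    bench.increment();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        long elapsed = System.nanoTime() - start;
        System.out.println(threads + " thread(s): " + elapsed + " ns");
        return bench.get();
    }

    public static void main(String[] args) {
        run(1, 100_000); // little contention: measures the increment itself
        run(8, 100_000); // heavy contention: mostly measures the monitor
    }
}
```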
Prepare input values in advance and pass them in, so that no extra processing is added to the code being benchmarked.
The test should use realistic input values. For example, do not measure with extremely large numbers that could never occur in real operation.
As explained in Chapter 4, Java code gets faster as it is executed repeatedly, so the benchmark should be warmed up before measuring.
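The three points above can be combined in one sketch (all names are illustrative, not from the book): inputs are generated before the timed region, the values have a realistic magnitude, and a few warm-up passes let the JIT compile the hot code before the measured run.

```java
import java.util.Random;

// Sketch: pre-computed realistic inputs plus explicit warm-up iterations.
public class WarmupBenchmark {
    public static volatile double sink;

    // The workload being measured
    public static double work(double x) {
        return Math.sqrt(x) * Math.log(x + 1);
    }

    static long timeLoop(double[] inputs) {
        long start = System.nanoTime();
        double acc = 0;
        for (double x : inputs) {
            acc += work(x);
        }
        sink = acc; // keep the result alive so the loop is not eliminated
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // 1. Prepare inputs in advance, outside the timed region.
        Random r = new Random(42);
        double[] inputs = new double[1_000_000];
        for (int i = 0; i < inputs.length; i++) {
            inputs[i] = r.nextDouble() * 1000; // realistic magnitude, not extreme
        }
        // 2. Warm up: run the loop several times so the JIT compiles work().
        for (int i = 0; i < 5; i++) {
            timeLoop(inputs);
        }
        // 3. Measure only after warm-up.
        System.out.println("Measured ns: " + timeLoop(inputs));
    }
}
```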
The best way to test an application's performance is to test it together with the external resources it uses. Such a test is called a macrobenchmark.
A mesobenchmark exercises real operations without using the complete application; it is coarser-grained than a microbenchmark but finer-grained than a macrobenchmark. An example is measuring how fast a web server responds (without exercising session or login features). Mesobenchmarks are also well suited to automated testing.
Elapsed-time measurement runs from the start of the application to its end. It is effective when only the overall time matters. Conversely, if slow performance before warm-up is itself the problem, another measurement method is required; since Java needs warm-up, this is not straightforward.
Throughput measures how much work can be performed within a fixed period of time. In a client-server setup, it must be measured with no think time (time the client spends waiting, doing nothing). It is usually reported as requests per second: TPS (transactions per second), RPS (requests per second), OPS (operations per second), and so on.
There are two ways to report response time. The first is the average; the second is a percentile. For example, a 90th-percentile response time of 1.5 seconds means that 90% of requests responded in 1.5 seconds or less, and the remaining 10% took more than 1.5 seconds.
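Both summaries are easy to compute. The sketch below (not from the book; the nearest-rank definition of percentile is one common choice among several) shows how an outlier affects the average while the 90th percentile describes what most users actually experience.

```java
import java.util.Arrays;

// Average vs. percentile summaries of a list of response times (seconds).
public class ResponseTimeStats {
    public static double average(double[] times) {
        double sum = 0;
        for (double t : times) {
            sum += t;
        }
        return sum / times.length;
    }

    // Nearest-rank percentile: sort, then take the ceil(p/100 * n)-th value.
    public static double percentile(double[] times, double p) {
        double[] sorted = times.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank - 1, 0)];
    }

    public static void main(String[] args) {
        double[] times = {0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.4, 3.0};
        System.out.println("average = " + average(times));             // 0.88, pulled up by the 3.0 outlier
        System.out.println("90th percentile = " + percentile(times, 90)); // 1.4
    }
}
```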
The book introduces the following load generator: About Faban
Test results vary from run to run. A good benchmark uses randomized data, close to the real world, that differs on each run. This makes comparison hard: when two test results differ, it is difficult to tell whether the code has a problem or the difference is coincidence.
| | Baseline | Specimen |
|---|---|---|
| First run | 1.0 seconds | 0.5 seconds |
| Second run | 0.8 seconds | 1.25 seconds |
| Third run | 1.2 seconds | 0.5 seconds |
| Average | 1.0 seconds | 0.75 seconds |
For these results, Student's t-test indicates a 43% probability that the two actually have the same performance.
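The t statistic behind that number can be computed by hand (the sketch below is illustrative, not the book's code). Converting the statistic into the quoted probability additionally requires the CDF of the t distribution (available, for example, in Apache Commons Math), which is omitted here.

```java
// Pooled-variance Student's t statistic for the baseline/specimen table above.
public class TTestSketch {
    public static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Unbiased sample variance: sum of squared deviations over (n - 1)
    public static double sampleVariance(double[] xs) {
        double m = mean(xs);
        double ss = 0;
        for (double x : xs) {
            ss += (x - m) * (x - m);
        }
        return ss / (xs.length - 1);
    }

    public static double tStatistic(double[] a, double[] b) {
        double pooled = ((a.length - 1) * sampleVariance(a)
                       + (b.length - 1) * sampleVariance(b))
                      / (a.length + b.length - 2);
        return (mean(a) - mean(b))
             / Math.sqrt(pooled * (1.0 / a.length + 1.0 / b.length));
    }

    public static void main(String[] args) {
        double[] baseline = {1.0, 0.8, 1.2};
        double[] specimen = {0.5, 1.25, 0.5};
        // t is about 0.91 with 4 degrees of freedom
        System.out.println("t = " + tStatistic(baseline, specimen));
    }
}
```

A t statistic this small relative to its degrees of freedom is why the test cannot rule out that the two runs have identical performance.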
However, no amount of analysis can remove the uncertainty inherent in the data.
Testing early and often is important, but it must be balanced against current circumstances. Early and frequent testing is extremely beneficial if the following conditions are met, because it makes it easy to find hints toward a solution:
- Automate everything
- Measure everything
- Run on the target system