A summary of Chapter 2 of the O'Reilly Japan book *Java Performance*.
- Chapter 1: Introduction - Qiita (previous article)
- Chapter 2: Performance Testing Approach - Qiita (this article)
- Chapter 3: Java Performance Toolbox - Qiita (next article)
- Chapter 4: How the JIT Compiler Works - Qiita
- Chapter 5: Basics of Garbage Collection - Qiita
There are three types of performance tests: microbenchmarks, mesobenchmarks, and macrobenchmarks.
A microbenchmark, as the name suggests, tests a small unit of code and is used to compare small implementation differences. Care must be taken when writing the test code: unless the code actually uses the result of the computation, the compiler's optimizer will eliminate the computation as dead code.
Whether the microbenchmark is single-threaded or multithreaded, it is important to declare the result variable volatile, so that the value is not cached and the computation is not optimized away.
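The two pitfalls above can be sketched in a minimal hand-rolled microbenchmark (the class and method names here are illustrative, not from the book): the result of each iteration is stored into a volatile field, so the JIT cannot prove the value is unused and eliminate the loop body as dead code.

```java
// Minimal microbenchmark sketch. Writing every result to a volatile field
// prevents the JIT compiler from removing the computation as dead code.
public class FibBenchmark {
    // volatile sink: the compiler must assume another thread may read it
    public static volatile long sink;

    // Iterative Fibonacci: the workload being measured
    public static long fib(int n) {
        long a = 0, b = 1;
        for (int i = 0; i < n; i++) {
            long t = a + b;
            a = b;
            b = t;
        }
        return a;
    }

    public static void main(String[] args) {
        int loops = 1_000_000;
        long start = System.nanoTime();
        for (int i = 0; i < loops; i++) {
            sink = fib(50); // store the result so the call cannot be elided
        }
        long elapsed = System.nanoTime() - start;
        System.out.println("Elapsed ns: " + elapsed);
    }
}
```

Without the `sink` store, a sufficiently aggressive compiler could reduce the timed loop to nothing and the benchmark would report an impossibly fast time.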
Microbenchmarks often have a synchronization bottleneck that would rarely be a problem in real operation, and resolving it takes time.
Reference material on volatile: Declaring a Variable Volatile (Writing Device Drivers)
When a benchmark calls a synchronized method from multiple threads, synchronization becomes the bottleneck; note that what you are then measuring is not the benchmarked code but the time the JVM spends resolving lock contention.
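This effect can be demonstrated with a small sketch (class and method names are illustrative, not from the book): the same synchronized method is timed with one thread and then with many, and the multithreaded run is dominated by monitor contention rather than by the trivial work inside the method.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// With many threads, the elapsed time mostly reflects the JVM resolving
// lock contention on the shared monitor, not the increment itself.
public class ContentionBenchmark {
    private long counter;

    // Every caller serializes on the same monitor.
    public synchronized void increment() {
        counter++;
    }

    public synchronized long get() {
        return counter;
    }

    public static long run(int threads, int callsPerThread) {
        ContentionBenchmark bench = new ContentionBenchmark();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    bench.increment();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        long elapsed = System.nanoTime() - start;
        System.out.println(threads + " thread(s): " + elapsed + " ns");
        return bench.get();
    }

    public static void main(String[] args) {
        run(1, 100_000); // little contention: measures the increment itself
        run(8, 100_000); // heavy contention: mostly measures the monitor
    }
}
```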
Prepare input values in advance and pass them in, so that no extra processing is added to the code being benchmarked.
The test should use realistic input values. For example, do not measure with extremely large numbers that could never occur in real operation.
As explained in Chapter 4, Java code gets faster as it is executed repeatedly, so the benchmark should be warmed up before measuring.
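The three points above can be combined in one sketch (all names are illustrative, not from the book): inputs are generated before the timed region, the values have a realistic magnitude, and a few warm-up passes let the JIT compile the hot code before the measured run.

```java
import java.util.Random;

// Sketch: pre-computed realistic inputs plus explicit warm-up iterations.
public class WarmupBenchmark {
    public static volatile double sink;

    // The workload being measured
    public static double work(double x) {
        return Math.sqrt(x) * Math.log(x + 1);
    }

    static long timeLoop(double[] inputs) {
        long start = System.nanoTime();
        double acc = 0;
        for (double x : inputs) {
            acc += work(x);
        }
        sink = acc; // keep the result alive so the loop is not eliminated
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // 1. Prepare inputs in advance, outside the timed region.
        Random r = new Random(42);
        double[] inputs = new double[1_000_000];
        for (int i = 0; i < inputs.length; i++) {
            inputs[i] = r.nextDouble() * 1000; // realistic magnitude, not extreme
        }
        // 2. Warm up: run the loop several times so the JIT compiles work().
        for (int i = 0; i < 5; i++) {
            timeLoop(inputs);
        }
        // 3. Measure only after warm-up.
        System.out.println("Measured ns: " + timeLoop(inputs));
    }
}
```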
The best way to test an application's performance is to test it together with the external resources it uses. Such a test is called a macrobenchmark.
A mesobenchmark exercises real operations without using the complete application; it is coarser-grained than a microbenchmark but finer-grained than a macrobenchmark. An example is measuring how fast a web server responds (without exercising session or login features). Mesobenchmarks are also well suited to automated testing.
Elapsed-time measurement runs from the start of the application to its end. It is effective when only the overall time matters. Conversely, if slow performance before warm-up is itself the problem, another measurement method is required; since Java needs warm-up, this is not straightforward.
Throughput measures how much work can be performed within a fixed period of time. In a client-server setup, it must be measured with no think time (time the client spends waiting, doing nothing). It is usually reported as requests per second: TPS (transactions per second), RPS (requests per second), OPS (operations per second), and so on.
There are two ways to report response time. The first is the average; the second is a percentile. For example, a 90th-percentile response time of 1.5 seconds means that 90% of requests responded in 1.5 seconds or less, and the remaining 10% took more than 1.5 seconds.
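Both summaries are easy to compute. The sketch below (not from the book; the nearest-rank definition of percentile is one common choice among several) shows how an outlier affects the average while the 90th percentile describes what most users actually experience.

```java
import java.util.Arrays;

// Average vs. percentile summaries of a list of response times (seconds).
public class ResponseTimeStats {
    public static double average(double[] times) {
        double sum = 0;
        for (double t : times) {
            sum += t;
        }
        return sum / times.length;
    }

    // Nearest-rank percentile: sort, then take the ceil(p/100 * n)-th value.
    public static double percentile(double[] times, double p) {
        double[] sorted = times.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank - 1, 0)];
    }

    public static void main(String[] args) {
        double[] times = {0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.4, 3.0};
        System.out.println("average = " + average(times));             // 0.88, pulled up by the 3.0 outlier
        System.out.println("90th percentile = " + percentile(times, 90)); // 1.4
    }
}
```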
The book introduces the following load generator: About Faban
Test results vary from run to run. A good benchmark uses randomized data, close to the real world, that differs on each run. This makes comparison hard: when two test results differ, it is difficult to tell whether the code has a problem or the difference is coincidence.
| | Baseline | Specimen |
|---|---|---|
| First run | 1.0 seconds | 0.5 seconds |
| Second run | 0.8 seconds | 1.25 seconds |
| Third run | 1.2 seconds | 0.5 seconds |
| Average | 1.0 seconds | 0.75 seconds |
For these results, Student's t-test indicates a 43% probability that the two actually have the same performance.
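The t statistic behind that number can be computed by hand (the sketch below is illustrative, not the book's code). Converting the statistic into the quoted probability additionally requires the CDF of the t distribution (available, for example, in Apache Commons Math), which is omitted here.

```java
// Pooled-variance Student's t statistic for the baseline/specimen table above.
public class TTestSketch {
    public static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Unbiased sample variance: sum of squared deviations over (n - 1)
    public static double sampleVariance(double[] xs) {
        double m = mean(xs);
        double ss = 0;
        for (double x : xs) {
            ss += (x - m) * (x - m);
        }
        return ss / (xs.length - 1);
    }

    public static double tStatistic(double[] a, double[] b) {
        double pooled = ((a.length - 1) * sampleVariance(a)
                       + (b.length - 1) * sampleVariance(b))
                      / (a.length + b.length - 2);
        return (mean(a) - mean(b))
             / Math.sqrt(pooled * (1.0 / a.length + 1.0 / b.length));
    }

    public static void main(String[] args) {
        double[] baseline = {1.0, 0.8, 1.2};
        double[] specimen = {0.5, 1.25, 0.5};
        // t is about 0.91 with 4 degrees of freedom
        System.out.println("t = " + tStatistic(baseline, specimen));
    }
}
```

A t statistic this small relative to its degrees of freedom is why the test cannot rule out that the two runs have identical performance.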
However, no amount of analysis can remove the uncertainty inherent in the data.
Testing early and often is important, but it must be balanced against current circumstances. Early and frequent testing is extremely beneficial if the following conditions are met, because it makes it easy to find hints toward a solution:
- Automate everything
- Measure everything
- Run on the target system