Java Performance Chapter 5 Garbage Collection Basics

This is a summary of Chapter 5 of Java Performance (O'Reilly Japan).


5.1 Garbage Collection Overview

The appeal of Java is that developers don't have to be aware of the life cycle of objects. However, this can become a weakness when optimizing memory usage. In the experience of the book's author, the time spent tuning Java memory is still shorter than the time spent tracking down dangling-pointer and null-pointer bugs in other languages.

An object that is not referenced from anywhere becomes eligible for GC. Simply counting references is not enough, though: in a structure such as a linked list, every element is referenced by the previous one, yet the entire list can be freed once nothing references the list itself.

The JVM periodically searches the heap for unused objects. At some point it also has to compact memory to prevent fragmentation.
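The linked-list case can be illustrated with a small reachability sketch. This is a toy model, not how the JVM implements tracing: the `Node` class and the `reachable` helper are hypothetical names introduced only for this example. It follows references from a set of roots and reports which nodes can still be reached.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

final class ReachabilityDemo {
    /** Hypothetical toy node: each element refers only to the next one. */
    static final class Node {
        final String name;
        Node next;
        Node(String name) { this.name = name; }
    }

    /** Returns every node reachable from the given roots by following next. */
    static Set<Node> reachable(Node... roots) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>();
        for (Node root : roots) {
            if (root != null) pending.push(root);
        }
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            // Visit each node once, then continue along the chain.
            if (seen.add(n) && n.next != null) pending.push(n.next);
        }
        return seen;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c");
        a.next = b;
        b.next = c;
        // While the head is a root, the whole list is reachable.
        System.out.println(reachable(a).size());
        // With no root pointing at the list, nothing is reachable,
        // even though b and c are each still referenced by another node.
        System.out.println(reachable().size());
    }
}
```

In the second call every element except the head still has an incoming reference, which is why a plain reference count would never free the list.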

Garbage collection performance is determined by the following three factors.

- Finding unused objects
- Freeing their memory
- Compacting the heap

While the collector is tracing object references or moving objects in memory, application threads must not be using those objects. A pause during which all application threads are stopped is called a stop-the-world pause.

Generation-based garbage collector

The heap area is divided into the following areas.

- old area (also called tenured)
- young area, which is further divided into the following two:
  - eden
  - survivor

The heap is divided into multiple areas because most objects are used only for a short period of time.

Objects are first allocated in eden, in the young area.

When this area fills up, a minor garbage collection runs: the garbage collector stops all application threads, moves the objects still in use to another area, and empties the young area (unused objects simply disappear).

There are two advantages to this method:

- It is fast because only the young area, which is a part of the heap, is processed. (If the young area is small, minor GCs run frequently.)
- Surviving objects are moved to either a survivor space or the old area. No compaction is needed because every object is cleared out of eden.

As objects keep being moved to the old area, the old area eventually fills up. How the old area is processed depends on the garbage collection algorithm.
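The flow described above (allocate in eden, move survivors on a minor GC, eventually promote to the old area) can be sketched as a toy simulation. Everything here is an assumption made for illustration: the class names, the `live` flag standing in for "still referenced", and the `tenuringThreshold` promotion rule are not taken from the book and do not reflect the real JVM data structures.

```java
import java.util.ArrayList;
import java.util.List;

final class ToyGenerationalHeap {
    static final class Obj {
        boolean live = true;   // stands in for "still referenced by the application"
        int age = 0;           // number of minor GCs survived
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> survivor = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    final int tenuringThreshold;   // hypothetical promotion age

    ToyGenerationalHeap(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    /** New objects always start in eden. */
    Obj allocate() {
        Obj o = new Obj();
        eden.add(o);
        return o;
    }

    /** Moves live young objects to survivor or old, then empties eden. */
    void minorGc() {
        List<Obj> nextSurvivor = new ArrayList<>();
        List<Obj> young = new ArrayList<>(eden);
        young.addAll(survivor);
        for (Obj o : young) {
            if (!o.live) continue;   // unused objects are simply dropped
            o.age++;
            if (o.age >= tenuringThreshold) old.add(o);
            else nextSurvivor.add(o);
        }
        eden.clear();                // nothing left behind, so no compaction
        survivor = nextSurvivor;
    }

    public static void main(String[] args) {
        ToyGenerationalHeap heap = new ToyGenerationalHeap(2);
        heap.allocate();
        heap.allocate().live = false;   // becomes garbage before the first GC
        heap.allocate();
        heap.minorGc();
        System.out.println("survivor=" + heap.survivor.size() + " old=" + heap.old.size());
        heap.minorGc();
        System.out.println("survivor=" + heap.survivor.size() + " old=" + heap.old.size());
    }
}
```

Because only live objects are copied out and eden is cleared wholesale, the cost of a minor GC in this model depends on how many objects survive, not on how much garbage was created.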

Although the details are complicated, CMS and G1 can search for unused objects without stopping application threads, which is why they are called concurrent collectors. The price is that they consume more CPU cycles. These collectors can still end up performing a full garbage collection, so avoiding full garbage collection is one of the major goals when tuning a concurrent collector.

Guidelines for which garbage collector to use

If response time is important:

- Thread pauses (especially those for full garbage collection) affect individual requests. If you want to reduce that impact, use a concurrent collector.
- If the average response time is what matters, use the throughput collector.

If the application is a batch job:

- Use a concurrent collector if CPU resources are available. You avoid the pauses caused by full garbage collection and the job finishes sooner.
- If CPU resources are limited, a concurrent collector makes the job slower instead.

Garbage collection algorithm

There are four garbage collection algorithms.

1. Serial garbage collector

The simplest collector. Only one thread processes the heap, and all application threads are stopped for both minor and full garbage collection. It can be enabled with -XX:+UseSerialGC.

2. Throughput type garbage collector

Minor garbage collection is faster than with the serial collector because multiple threads process the young area. To enable it, specify -XX:+UseParallelGC and -XX:+UseParallelOldGC (the latter uses multiple threads to process the old area as well).

3. CMS Garbage Collector

CMS (Concurrent Mark Sweep) was designed to avoid the long pauses associated with full garbage collection. Application threads are stopped during minor garbage collections, while the old area is processed by background threads as the application keeps running, at the cost of extra CPU. The background threads do not compact the heap, so it becomes fragmented. If there is not enough CPU, or if fragmentation progresses until objects can no longer be allocated, CMS falls back to the same behavior as the serial collector. To enable it, specify -XX:+UseConcMarkSweepGC and -XX:+UseParNewGC.

4. G1 Garbage Collector

G1 (Garbage First) aims to handle large heaps (4 GB or more) with minimal pause times. The heap is divided into multiple regions. Objects still in use in one region are copied to another region, so the heap is (at least partially) compacted as a side effect. Like CMS, it avoids full garbage collection by working in the background, so it consumes a lot of CPU. To enable it, specify -XX:+UseG1GC.
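To confirm which collector a running JVM actually ended up with, one option is to ask the standard `java.lang.management` API. The sketch below only lists the names the JVM reports; the names themselves differ between JVM versions and vendors (HotSpot typically reports names such as "PS Scavenge" and "PS MarkSweep" for the throughput collector, or "G1 Young Generation" and "G1 Old Generation" for G1), so treat them as informational. The class name `ActiveCollectors` is made up for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectors {
    /** Returns the collector names reported by the running JVM. */
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(gc.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        // Try starting the JVM with different flags, for example
        // -XX:+UseSerialGC or -XX:+UseG1GC, and compare the output.
        names().forEach(System.out::println);
    }
}
```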

Forced execution and disabling of garbage collection

Minor garbage collection occurs when the young area is full. Full garbage collection occurs when the old area is full. Concurrent garbage collection occurs when the heap is about to fill up.

There is rarely any benefit in forcing garbage collection with System.gc(). It always triggers a full garbage collection, even when CMS or G1 is in use, and merely running a collection ahead of schedule does not improve performance. It does make sense, however, for performance monitoring and benchmarking: running a collection before the measurement lets you measure from a clean state. Collecting before taking a heap dump also makes the heap easier to analyze (although most ways of taking a heap dump run a garbage collection automatically).

You can trigger a garbage collection with jcmd ${process ID} GC.run, or from jconsole.

RMI (Remote Method Invocation) calls System.gc() every hour as part of its distributed garbage collector.

To disable calls to System.gc(), specify -XX:+DisableExplicitGC.
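As a minimal sketch of the benchmarking use case, the helper below requests a collection and then reads the used heap, so that a measurement can start from a comparable baseline. The class and method names are hypothetical. Note that System.gc() is only a request: the JVM may ignore it, and it does ignore it when -XX:+DisableExplicitGC is set.

```java
final class HeapBaseline {
    /** Used heap as seen by the Runtime API (committed minus free). */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Requests a GC, then reports the used heap as a baseline. */
    static long usedAfterGcRequest() {
        System.gc();   // a request only; a no-op under -XX:+DisableExplicitGC
        return usedHeapBytes();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();
        long baseline = usedAfterGcRequest();
        System.out.println("used before request: " + before / 1024 + " KB");
        System.out.println("used after request:  " + baseline / 1024 + " KB");
        // ... run the code under test here and compare against the baseline ...
    }
}
```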

Garbage collection algorithm selection

The serial garbage collector is effective only when the memory used is 100 Mbyte or less. For most programs, the throughput type or the concurrent type is selected.

Decide which one to use according to the performance goals described in Chapter 2. Consider which has priority for your application: elapsed time, throughput, or average (or 90th percentile) response time.

Concurrent garbage collection is suitable for batch processing that does not consume all CPU resources. Throughput-type garbage collection is suitable for batch processing that consumes all CPU resources.

When CPU resources are limited and the CMS background threads cannot get enough CPU time, CMS performance can drop significantly. This state is called concurrent mode failure.

In some cases the throughput collector gives the better average response time, while CMS gives the better 90th percentile response time.

As a rule of thumb, CMS performs better than G1 when the heap is 4 GB or smaller.

The CMS background threads scan the entire old area before freeing objects, and the scan time is proportional to the size of the area. If the heap fills up before the scan and the freeing of objects are finished, a concurrent mode failure occurs: all application threads are stopped and a full garbage collection is performed. Performance suffers because that full garbage collection uses only one thread. It can be tuned to use multiple threads, but the work each thread has to do grows accordingly. How likely a concurrent mode failure is also depends on how much memory the application allocates.

In the G1 garbage collector, on the other hand, the old area is divided into regions, so the scan of the old area can be split by region across multiple threads. A concurrent mode failure can still occur if those threads cannot keep up, but by design it does not happen very often.

CMS fragments the heap easily because it compacts only during a full garbage collection.

Heap fragmentation is unlikely to occur in G1.

There are tuning methods in both CMS and G1 to avoid concurrent mode failure.

5.2 Basic tuning of garbage collector

Change heap size

If the heap is too small, the JVM has to collect garbage constantly. If it is too large, each collection takes longer, because the pause time grows in proportion to the heap size.

If you specify a heap larger than physical memory: the JVM does not distinguish between physical memory and swap space, so the Java application tries to use the whole specified heap, and performance suffers. Furthermore, a full garbage collection touches the entire heap, so swapping always occurs. Garbage collection then cannot keep up, which leads to concurrent mode failure. **The heap size should therefore never exceed the amount of physical memory.** **Physical memory should also leave a margin of 1 GB for the JVM itself and other applications.**

The heap size is specified with -XmsN (initial value) and -XmxN (maximum value). The defaults depend on the OS, the amount of memory, and the JVM type. The JVM tunes the heap size automatically between the initial and maximum values: if garbage collection occurs too often at the current heap size, it keeps growing the heap.
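To see what the JVM actually did with these settings, the standard `MemoryMXBean` reports the initial, used, committed, and maximum heap sizes. The following is a small sketch; the class name `HeapSizes` and the `toMb` helper are assumptions for this example. Comparing the committed value over time shows the JVM growing the heap between the initial and maximum values.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class HeapSizes {
    /** Snapshot of the heap as reported by the running JVM. */
    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    /** Formats a byte count in MB; MemoryUsage uses -1 for "undefined". */
    static String toMb(long bytes) {
        return bytes < 0 ? "undefined" : (bytes / (1024 * 1024)) + " MB";
    }

    public static void main(String[] args) {
        MemoryUsage h = heap();
        System.out.println("init      = " + toMb(h.getInit()));       // related to -Xms
        System.out.println("used      = " + toMb(h.getUsed()));
        System.out.println("committed = " + toMb(h.getCommitted()));  // current heap size
        System.out.println("max       = " + toMb(h.getMax()));        // related to -Xmx
    }
}
```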

As a rule of thumb, size the heap so that it is about 30% occupied after a full garbage collection.

Size setting of each area

- -XX:NewRatio=N: ratio of the old area to the young area (default is 2)
- -XX:NewSize=N: initial size of the young area
- -XX:MaxNewSize=N: maximum size of the young area
- -XmnN: shorthand that sets NewSize and MaxNewSize to the same value

The initial size of the young area is calculated as follows.

Initial size of young area = Initial size of heap / (1 + NewRatio)

By default, the initial size of the young area is 33% of the initial size of the heap (you can also specify it with the NewSize flag).
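The 33% figure follows directly from the formula with NewRatio = 2. As a worked example, the sketch below applies the formula to a hypothetical 1536 MB initial heap. The class and method names are made up for this example, and the real JVM may round the result to its own alignment, so treat the number as approximate.

```java
final class YoungSize {
    /** Initial young size = initial heap size / (1 + NewRatio). */
    static long initialYoungBytes(long initialHeapBytes, int newRatio) {
        return initialHeapBytes / (1 + newRatio);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long heap = 1536 * mb;   // hypothetical -Xms1536m
        long young = initialYoungBytes(heap, 2);
        // 1536 MB / (1 + 2) = 512 MB, i.e. one third of the heap
        System.out.println(young / mb + " MB");
    }
}
```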

Resize permanent area and metaspace

The JVM holds data about classes. The area for this data is called the Permanent area up to Java 7 and the metaspace from Java 8 onward. The Permanent area of Java 7 also contained miscellaneous information unrelated to class data; from Java 8 that information was moved to the normal heap.

The permanent area and metaspace are used by the JIT compiler and the JVM itself, so developers rarely need to be aware of them.

The permanent area has an upper limit, but the metaspace has none by default (there is almost never a need to set one). If you do set the sizes, use the following:

- Permanent area: -XX:PermSize=N, -XX:MaxPermSize=N
- Metaspace: -XX:MetaspaceSize=N, -XX:MaxMetaspaceSize=N

The metaspace can still run out of memory; such cases can be analyzed with NMT (Native Memory Tracking), introduced in Chapter 8. (Perhaps this happens when a system grows so large that the number of classes explodes? It seems better to consider splitting the service before it gets to that point.)

Resizing the permanent area or metaspace is slow because it involves a garbage collection. If garbage collections occur repeatedly at startup, the permanent area or metaspace is probably being expanded. In that case, increase the initial size.

The heap dump explained in Chapter 7 gives you class loader information, which lets you check whether class loader data is filling up the permanent area or metaspace. Running jmap with -clstats (-permstat on Java 7) also prints information about class loaders.
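One way to watch these areas from inside the application is the standard `MemoryPoolMXBean` API, which lists the JVM's non-heap memory pools. The sketch below prints each pool with its current usage. The class name is hypothetical, and the pool names are JVM-specific: on HotSpot with Java 8 or later one of them is typically called "Metaspace", but that is an observation about common JVMs rather than something the API guarantees.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

final class NonHeapPools {
    /** One summary line per non-heap memory pool. */
    static List<String> summaries() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.NON_HEAP) continue;
            MemoryUsage usage = pool.getUsage();
            if (usage == null) continue;   // pool is no longer valid
            // getMax() returns -1 when no upper limit is defined.
            lines.add(String.format("%s: used=%d KB, committed=%d KB, max=%d",
                    pool.getName(), usage.getUsed() / 1024,
                    usage.getCommitted() / 1024, usage.getMax()));
        }
        return lines;
    }

    public static void main(String[] args) {
        summaries().forEach(System.out::println);
    }
}
```

If the committed size of the class metadata pool keeps growing during startup, that is a hint that a larger initial size may help.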

Specifying the degree of parallelism

All collectors other than the serial one perform garbage collection with multiple threads. You can specify the number of threads with -XX:ParallelGCThreads=N.

By default there is one thread per CPU, but if there are more than 8 CPUs, the number of threads is determined by the following formula, where N is the number of CPUs.

Garbage collection threads = 8 + 5(N - 8)/8
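The default thread count can be worked out with a few lines of code. This sketch applies the formula as written, with the assumption (labeled in the code) that the division is an integer division; the class and method names are made up for this example, and the value a particular JVM chooses may differ.

```java
final class GcThreads {
    /** Default GC thread count for a machine with the given number of CPUs. */
    static int defaultParallelGcThreads(int cpus) {
        if (cpus <= 8) return cpus;        // one thread per CPU up to 8
        return 8 + 5 * (cpus - 8) / 8;     // assumption: integer division
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println(cpus + " CPUs -> " + defaultParallelGcThreads(cpus) + " GC threads");
        // Worked examples: 16 CPUs -> 8 + 5*8/8 = 13, 32 CPUs -> 8 + 5*24/8 = 23
        System.out.println(defaultParallelGcThreads(16));
        System.out.println(defaultParallelGcThreads(32));
    }
}
```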

If multiple JVMs are running on the same machine, you should reduce the number of threads: garbage collection threads are highly efficient, so each one uses 100% of a CPU while it runs.

Adaptive sizing

The sizes of the heap areas and the survivor spaces are changed dynamically at run time. This behavior is called adaptive sizing. The advantage is that you can set a generous maximum without the application occupying heap it does not use: the heap is expanded automatically only when needed.

Resizing takes time, most of it spent inside garbage collection pauses. If you have specified the garbage collection parameters in detail and know the heap size you need, you can disable adaptive sizing with -XX:-UseAdaptiveSizePolicy.

With -XX:+PrintAdaptiveSizePolicy, you can see how each space is resized.

Garbage collection related tools

It's a good idea to look at the garbage collection log to see how much garbage collection affects your application. Use -verbose:gc or -XX:+PrintGC to get the GC log; -XX:+PrintGCDetails gives a more detailed one. Also specify -XX:+PrintGCTimeStamps or -XX:+PrintGCDateStamps. You can change the output destination with -Xloggc:${filename}. The flags related to log rotation are -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=N, and -XX:GCLogFileSize=N. Loading a log file into a tool called GC Histogram produces graphs and tables.

For a Java application that is already running, you can get GC statistics with jstat (the trailing 1000 is the sampling interval in milliseconds):

jstat -gcutil ${Process ID} 1000
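Similar figures can also be read from inside the application through the standard `GarbageCollectorMXBean` API. The sketch below sums the accumulated collection time and expresses it as a share of JVM uptime, which gives a rough idea of how much garbage collection affects the application. The class and method names are hypothetical, and how the reported time is counted for concurrent collectors is JVM-specific, so the percentage is an estimate rather than an exact pause budget.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcOverhead {
    /** Total accumulated collection time across all collectors, in ms. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();   // -1 when the collector does not report it
            if (t > 0) total += t;
        }
        return total;
    }

    /** GC time as a percentage of uptime; 0 when uptime is not positive. */
    static double overheadPercent(long gcMillis, long uptimeMillis) {
        return uptimeMillis <= 0 ? 0.0 : 100.0 * gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("GC time %d ms of %d ms uptime (%.2f%%)%n",
                gc, uptime, overheadPercent(gc, uptime));
    }
}
```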
