Friday, July 31, 2020

JVM optimization for Java 8

Starting with Java 8, Metaspace replaces PermGen.
So let's discuss JVM performance tuning in the context of Java 8 and later.

Java 8 JVM model

The Java 8 JVM model (in light of the serial garbage collection model) looks like the following:

[Figure: Java 8 JVM memory model]


The heap is the memory space dedicated to Java objects.

The Heap is divided into 2 parts, Young Generation and Old Generation. 

Young Generation heap is further divided into Eden space and 2 survivor spaces. 

The class metadata of the JVM is stored in native memory called Metaspace (before Java 8 it was stored in the PermGen space). Metaspace is not part of the contiguous Java heap. Compared with PermGen, it gives the JVM garbage collection of dead classes, automatic sizing, and concurrent deallocation of metadata.
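To see this split on a running JVM, here is a minimal sketch that lists the memory pools reported through the standard java.lang.management API. The class name MemoryPools is only for illustration, and the exact pool names (for example "Metaspace" among the non-heap pools on HotSpot) depend on the JVM and the collector in use, so treat the output as something to inspect rather than rely on.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

final class MemoryPools {
    /** Returns the names of all memory pools of the given type (HEAP or NON_HEAP). */
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Heap pools hold Java objects (Eden, survivor, old generation).
        System.out.println("Heap pools:     " + poolNames(MemoryType.HEAP));
        // Non-heap pools are native memory; Metaspace is expected to show up here.
        System.out.println("Non-heap pools: " + poolNames(MemoryType.NON_HEAP));
    }
}
```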

The young generation stores short-lived Java objects, while the old generation stores long-lived ones. Most Java objects are short-lived. Iterator objects are an example, with a lifespan of a single loop. Some objects live long: an object created in public static void main() could live until the program exits.

A Java object's life cycle starts with the keyword new. It is referenced by other objects, and it references other objects to get its work done. Once there are no references left to the object, it can be garbage collected. Once garbage collected, the object no longer exists in the heap. The majority of Java objects die in the Eden space. A few die in the old generation.
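The life cycle described above can be observed with a weak reference, which does not keep its target alive. This is only a sketch: ReachabilityDemo and isAlive are illustrative names, and System.gc() is just a hint to the JVM, so the second line of output may differ from run to run.

```java
import java.lang.ref.WeakReference;

final class ReachabilityDemo {
    /** True while the referent has not been reclaimed by the garbage collector. */
    static boolean isAlive(WeakReference<?> ref) {
        return ref.get() != null;
    }

    public static void main(String[] args) {
        Object payload = new Object();                 // the life cycle starts with `new`
        WeakReference<Object> weak = new WeakReference<>(payload);
        System.out.println("strongly referenced, alive: " + isAlive(weak));

        payload = null;                                // drop the only strong reference
        System.gc();                                   // a hint only; collection is not guaranteed
        System.out.println("after dropping the reference, alive: " + isAlive(weak));
    }
}
```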

Garbage collection occurs in each generation heap when the generation fills up. 

When the Eden space fills up, it triggers a minor garbage collection, in which the few live objects in the young generation are copied into a survivor space (nothing is collected in the old generation). Such collections are cheap: a young generation full of dead objects is collected very quickly, because only the few live objects are copied to a survivor space and the Eden space is then marked empty.

During each minor garbage collection, some fraction of the surviving objects in the young generation is moved to the old generation. Eventually the old generation fills up and must be collected, resulting in a major garbage collection, in which the entire heap is collected. Major collections usually last much longer than minor collections because significantly more objects are involved.
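One way to watch minor and major collections happen is to read the collector counters from the standard java.lang.management API while a loop produces short-lived garbage. The sketch below is illustrative (GcCounters and churn are made-up names), and how many collections you see depends on the heap size and the collector in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcCounters {
    /** Sums collection counts over all collectors; a count of -1 means "undefined" and is skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    /** Allocates many short-lived arrays so that the young generation fills up repeatedly. */
    static long churn(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] garbage = new byte[64 * 1024];  // unreachable right after this iteration
            garbage[0] = (byte) i;
            checksum += garbage[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        churn(50_000);
        // Typically one bean reports young (minor) collections and another old (major) ones.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
        System.out.println("collections during churn: " + (totalCollections() - before));
    }
}
```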

The passage from the young generation to the old generation runs through the two survivor spaces. Most objects are initially allocated in Eden. One survivor space is empty at any time and serves as the destination for live objects copied out of Eden and out of the other survivor space during a collection; at the next collection the two survivor spaces swap roles. Objects are copied between survivor spaces in this way until they are old enough to be tenured (copied to the old generation).
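The copying scheme above can be sketched as a toy simulation. This is not how HotSpot is implemented; it only models the bookkeeping. The class SurvivorSimulation, the Obj type, and the tenuringThreshold parameter are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class SurvivorSimulation {
    /** A toy object: an id plus the number of minor collections it has survived. */
    static final class Obj {
        final int id;
        int age;
        Obj(int id) { this.id = id; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();   // survivor space filled by the previous collection
    List<Obj> to = new ArrayList<>();     // empty survivor space, destination of the next one
    final List<Obj> old = new ArrayList<>();
    final int tenuringThreshold;          // hypothetical age at which objects are tenured

    SurvivorSimulation(int tenuringThreshold) { this.tenuringThreshold = tenuringThreshold; }

    void allocate(Obj o) { eden.add(o); }

    /** Copies live objects out of eden and `from`; whatever is not copied is garbage. */
    void minorGc(Predicate<Obj> isLive) {
        List<Obj> sources = new ArrayList<>(eden);
        sources.addAll(from);
        for (Obj o : sources) {
            if (!isLive.test(o)) continue;            // dead objects are simply left behind
            o.age++;
            if (o.age >= tenuringThreshold) old.add(o); else to.add(o);
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from; from = to; to = tmp;    // the survivor spaces swap roles
    }

    public static void main(String[] args) {
        SurvivorSimulation sim = new SurvivorSimulation(3);
        for (int i = 0; i < 10; i++) sim.allocate(new Obj(i));
        for (int gc = 1; gc <= 3; gc++) {
            sim.minorGc(o -> o.id % 5 == 0);          // only objects 0 and 5 stay referenced
            System.out.println("after GC " + gc + ": survivor=" + sim.from.size()
                    + ", old=" + sim.old.size());
        }
    }
}
```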

JVM performance tuning strategies


This is probably the most important picture for understanding JVM optimization. Now we can understand how these parameters control JVM performance:
  • -XX:MinHeapFreeRatio=<minimum>
  • -XX:MaxHeapFreeRatio=<maximum>
  • -Xms<min>
  • -Xmx<max>
  • -XX:NewRatio=3
  • -XX:SurvivorRatio=6

The above JVM model has a few basic aspects. 

First of all, there is the total heap size the JVM controls. This decides how much memory is allocated to this particular JVM; the more memory the JVM has, the more capable it is. The total heap size is bounded below by -Xms<min> and above by -Xmx<max>. The JVM grows and shrinks the heap between -Xms and -Xmx, keeping some memory in reserve in case the young or old generation needs reinforcement. The JVM decides when to reinforce a generation according to the ratio of free space to live objects. The target range is set as a percentage by the parameters -XX:MinHeapFreeRatio=<minimum> and -XX:MaxHeapFreeRatio=<maximum>. For example, with a range of 40% to 70%, if the free space in a generation falls below 40%, the JVM uses the reserved memory to expand the generation until 40% is free again. Similarly, if the free space in a generation exceeds 70%, the JVM takes some space away from the generation and returns it to the reserve. Setting -Xms and -Xmx to the same value increases predictability, but the JVM is then unable to compensate if the size was chosen poorly.
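The 40% to 70% example can be written down as a small decision function. This is a simplified model, not HotSpot's actual resizing code: HeapFreeRatio, Action and decide are illustrative names, and main applies the rule to the whole heap as reported by Runtime, whereas the JVM applies it per generation.

```java
final class HeapFreeRatio {
    enum Action { EXPAND, SHRINK, KEEP }

    /** Simplified rule: compare the free percentage with the two target bounds. */
    static Action decide(long freeBytes, long capacityBytes, int minFreePercent, int maxFreePercent) {
        double freePercent = 100.0 * freeBytes / capacityBytes;
        if (freePercent < minFreePercent) return Action.EXPAND;  // too little headroom
        if (freePercent > maxFreePercent) return Action.SHRINK;  // too much idle memory
        return Action.KEEP;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();      // upper bound of the heap, roughly -Xmx
        long total = rt.totalMemory();  // heap currently committed, between -Xms and -Xmx
        long free = rt.freeMemory();    // unused part of the committed heap
        System.out.printf("max=%d MB, committed=%d MB, free=%d MB%n",
                max >> 20, total >> 20, free >> 20);
        System.out.println("decision for a 40..70 percent target: " + decide(free, total, 40, 70));
    }
}
```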

The second most important aspect of the JVM model is how the heap is split between the young generation and the old generation. The bigger a generation, the less often it is garbage collected. -XX:NewRatio controls this split. For example, setting -XX:NewRatio=3 means that the ratio between the young and old generation is 1:3. In other words, the combined size of the Eden and survivor spaces will be one-fourth of the total heap size. Increasing the ratio means a smaller young generation and a bigger old generation, which implies more frequent minor GCs and less frequent full GCs.
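The arithmetic behind -XX:NewRatio is easy to check in a few lines. This is a sketch under the simple model described above; the class name NewRatioMath and the 4096 MB heap are made up for the example, and a real JVM aligns and adjusts the resulting sizes.

```java
final class NewRatioMath {
    /** Young generation size when old:young = newRatio:1. */
    static long youngBytes(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    /** The old generation gets whatever the young generation does not use. */
    static long oldBytes(long heapBytes, int newRatio) {
        return heapBytes - youngBytes(heapBytes, newRatio);
    }

    public static void main(String[] args) {
        long heap = 4096L << 20; // hypothetical 4096 MB heap
        for (int ratio : new int[] {1, 2, 3, 7}) {
            System.out.printf("NewRatio=%d -> young=%d MB, old=%d MB%n",
                    ratio, youngBytes(heap, ratio) >> 20, oldBytes(heap, ratio) >> 20);
        }
    }
}
```

With NewRatio=3 the young generation gets 1024 MB of the 4096 MB; raising the ratio to 7 shrinks it to 512 MB.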

Guess what -XX:SurvivorRatio=6 means? It sets the ratio between a survivor space and Eden to 1:6. In other words, each survivor space will be one-sixth the size of Eden, and thus one-eighth the size of the young generation (not one-seventh, because there are two survivor spaces). If the survivor spaces are too small, a copying collection overflows directly into the old generation. If the survivor spaces are too large, they will be uselessly empty.
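The one-eighth figure follows from young = Eden + 2 × survivor with Eden = 6 × survivor. A minimal sketch of that calculation, with SurvivorRatioMath and the 1024 MB young generation as illustrative assumptions:

```java
final class SurvivorRatioMath {
    /** Size of one survivor space: young = eden + 2 * survivor and eden = ratio * survivor. */
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    /** Eden is what remains of the young generation after the two survivor spaces. */
    static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long young = 1024L << 20; // hypothetical 1024 MB young generation
        System.out.printf("SurvivorRatio=6 -> eden=%d MB, each survivor=%d MB%n",
                edenBytes(young, 6) >> 20, survivorBytes(young, 6) >> 20);
    }
}
```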

The generation-based garbage collection model and the parameters above are the essence of JVM performance tuning.

JVM Garbage Collector Options


The rest of this article covers the techniques the JVM uses to extend this generation-based GC model to multiple cores.

The JVM has 3 modes for generation-based garbage collection. They can be selected with one of the following flags:

  1. -XX:+UseSerialGC

  2. -XX:+UseParallelGC

  3. -XX:+UseG1GC

-XX:+UseConcMarkSweepGC is not in this list because the CMS collector was deprecated in JDK 9 in favor of G1 and was removed in JDK 14.

We can skip -XX:+UseSerialGC; all we have to know is that it tells the JVM to use the serial garbage collector, which is the model we have discussed so far.
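To check which of the three modes a running JVM actually uses, one option is to look at the collector names exposed by java.lang.management. The name patterns below ("G1 ...", "PS ...", "Copy", "MarkSweepCompact") are what HotSpot builds commonly report; they are an assumption, not a documented contract, and ActiveCollector and classify are illustrative names.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollector {
    /** Maps collector bean names to the flag that selects them (assumed HotSpot naming). */
    static String classify(List<String> collectorNames) {
        for (String name : collectorNames) {
            if (name.startsWith("G1")) return "-XX:+UseG1GC";
            if (name.startsWith("PS ")) return "-XX:+UseParallelGC";
            if (name.equals("Copy") || name.equals("MarkSweepCompact")) return "-XX:+UseSerialGC";
        }
        return "unknown";
    }

    /** Collects the names of the collectors registered in this JVM. */
    static List<String> currentCollectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = currentCollectorNames();
        System.out.println("collectors: " + names + " -> " + classify(names));
    }
}
```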

Parallel Garbage Collector


-XX:+UseParallelGC tells the JVM to use the parallel garbage collector. While the serial garbage collector uses a single thread to perform all garbage collection work, the parallel garbage collector by default executes both minor and major collections with multiple threads. ParallelGC usually performs significantly better than SerialGC when more than two processors are available. The number of garbage collector threads can be controlled with the command-line option -XX:ParallelGCThreads=<N>. Because multiple garbage collector threads participate in a minor collection, some fragmentation is possible due to promotions from the young generation to the old generation during the collection. Each garbage collection thread involved in a minor collection reserves a part of the old generation for promotions, and the division of the available space into these "promotion buffers" can cause a fragmentation effect. Reducing the number of garbage collector threads and increasing the size of the old generation will reduce this fragmentation effect.
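When -XX:ParallelGCThreads is not set, HotSpot picks a default from the number of hardware threads. The sketch below encodes the rule as I understand it from Oracle's GC tuning guide (all N threads up to 8, then roughly 5/8 of every additional one); this rule is not stated in this article, so take it as an assumption, and ParallelGcThreads is an illustrative name.

```java
final class ParallelGcThreads {
    /** Assumed HotSpot default: N threads up to 8 CPUs, then 5/8 of each additional CPU. */
    static int defaultGcThreads(int cpus) {
        return cpus <= 8 ? cpus : 8 + (cpus - 8) * 5 / 8;
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("available processors: " + cpus);
        System.out.println("expected default GC threads: " + defaultGcThreads(cpus));
    }
}
```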

The parallel collector supports automatic tuning: you specify behavior goals instead of generation sizes and other low-level tuning details.

-XX:MaxGCPauseMillis=<N> hints to the JVM that pause times of <N> milliseconds or less are desired. Emphasizing this goal could reduce the overall throughput of the application, and the desired pause time may still not be met.

-XX:GCTimeRatio=<N> sets the target ratio of garbage collection time to application time to 1/(1+<N>). This flag hints to the JVM to emphasize the throughput goal. Again, this goal can work against the pause time goal and may not be achieved.
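To get a feel for the 1/(1+N) formula, here is a tiny sketch that converts a few values of N into the share of time the collector is allowed to spend. GcTimeRatioMath is an illustrative name and the sample values are chosen only for the example.

```java
final class GcTimeRatioMath {
    /** Target fraction of time spent in garbage collection for -XX:GCTimeRatio=N. */
    static double gcTimeFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        for (int n : new int[] {4, 19, 99}) {
            System.out.printf("GCTimeRatio=%d -> at most %.1f%% of time in GC%n",
                    n, 100 * gcTimeFraction(n));
        }
    }
}
```

For example, N=19 corresponds to a goal of 1/20, that is 5% of the time in garbage collection.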

The maximum heap footprint is set by -Xmx<N>. In addition, the parallel collector has an implicit goal of minimizing the size of the heap as long as the other goals are being met.

There are other flags that control further aspects of the parallel garbage collector. Generation size adjustments, for example, are controlled by:
  • -XX:YoungGenerationSizeIncrement=<Y>
  • -XX:TenuredGenerationSizeIncrement=<T>
  • -XX:AdaptiveSizeDecrementScaleFactor=<D>

G1 Garbage Collector

The G1 garbage collector is the default GC since JDK 9. It typically has the shortest pause times of the 3 GC options. As a rule of thumb for choosing among them:

  1. SerialGC has the smallest GC overhead. If the application has a small data set (up to approximately 100 MB), or it runs on a single core and there are no pause time requirements, then -XX:+UseSerialGC is the best choice.
  2. If peak application performance is the first priority and there are no pause time requirements, or pauses of 1 second or longer are acceptable, then select the parallel collector with -XX:+UseParallelGC.
  3. If response time is more important than overall throughput and garbage collection pauses must be kept shorter than approximately 1 second, then select the mostly concurrent collector with -XX:+UseG1GC.
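The three rules can be condensed into a small helper. This is only a reading aid under simplifying assumptions: CollectorChooser and its parameters are made up for illustration, "pause sensitive" stands for "pauses must stay below about 1 second", and real tuning should be driven by measurements.

```java
final class CollectorChooser {
    /**
     * Encodes the three rules of thumb.
     * liveDataMb: approximate size of the application's data set in MB.
     * pauseSensitive: true if GC pauses must stay below roughly one second.
     */
    static String choose(long liveDataMb, int cores, boolean pauseSensitive) {
        if (liveDataMb <= 100 || (cores == 1 && !pauseSensitive)) {
            return "-XX:+UseSerialGC";      // rule 1: small data set, or single core without pause goals
        }
        if (!pauseSensitive) {
            return "-XX:+UseParallelGC";    // rule 2: throughput first, long pauses acceptable
        }
        return "-XX:+UseG1GC";              // rule 3: response time first
    }

    public static void main(String[] args) {
        System.out.println(choose(50, 8, true));
        System.out.println(choose(2000, 8, false));
        System.out.println(choose(2000, 8, true));
    }
}
```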

Now let's study how the G1 garbage collector works and how it achieves short garbage collection pauses.

Recall that SerialGC divides the heap into 3 areas: Eden space, survivor spaces and old generation space. SerialGC performs many minor garbage collections in the Eden space. After each minor GC, it copies the surviving objects into a survivor space. Eventually a fraction of the objects in the survivor spaces is promoted into the old generation. Once the old generation has slowly filled up, a stop-the-world full garbage collection is performed across the whole heap.

G1GC also divides the heap, not into 3 areas but into many equally sized regions of 3 types: Eden regions, survivor regions and old generation regions. (4 types if you like: the 3 types plus unallocated empty slots.) The region size depends on the heap size; around 2000 regions (or empty slots) per heap is quite typical.
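How the region size follows from the heap size can be approximated in a few lines. The sketch assumes the commonly documented G1 defaults (a target of about 2048 regions, region sizes that are powers of two between 1 MB and 32 MB); these constants are not given in this article and the real ergonomics are more involved, so G1RegionSize is an illustration only.

```java
final class G1RegionSize {
    static final long MB = 1L << 20;
    static final long MIN_REGION = 1 * MB;    // assumed lower bound
    static final long MAX_REGION = 32 * MB;   // assumed upper bound of the automatic choice
    static final int TARGET_REGIONS = 2048;   // assumed target number of regions

    /** Picks a power-of-two region size that yields roughly TARGET_REGIONS regions. */
    static long regionSize(long heapBytes) {
        long size = Math.max(heapBytes / TARGET_REGIONS, MIN_REGION);
        size = Long.highestOneBit(size);       // round down to a power of two
        return Math.min(size, MAX_REGION);
    }

    static long regionCount(long heapBytes) {
        return heapBytes / regionSize(heapBytes);
    }

    public static void main(String[] args) {
        for (long heapGb : new long[] {1, 6, 8, 100}) {
            long heap = heapGb * 1024 * MB;
            System.out.printf("heap=%d GB -> region=%d MB, regions=%d%n",
                    heapGb, regionSize(heap) / MB, regionCount(heap));
        }
    }
}
```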

Young Collections



At the beginning, most of the regions are unallocated empty slots. The JVM starts to allocate some regions and creates new objects in them; we call those regions Eden regions. Short-lived Java objects are quickly born, die, and fill up these initially empty Eden regions. After a certain number of Eden regions have been allocated, the JVM performs a minor garbage collection that copies the surviving objects to a small number of newly allocated regions, which we call survivor regions. The few objects that survive many minor garbage collections are copied from survivor regions into other allocated regions, which we call old generation regions. This happens across many regions of the heap at the same time.

Young Collection + Concurrent Mark

As the heap ages, after numerous minor garbage collections, more and more survivors enter old generation regions. Eventually the originally empty heap holds a large share of old generation regions, and it is time for the JVM to prepare a cleanup of the whole heap. In the Young Collection + Concurrent Mark phase, the JVM scans the old generation regions and marks the live objects in them concurrently. While the JVM marks live objects, the application keeps running and minor young collections continue. The world does not stop for the marking itself, apart from brief pauses at its start and end; the application runs as usual, maybe just a little slower than in the Young Collections phase.

Mixed Collections

Young Collection + Concurrent Mark is a comparatively short phase. Once the JVM has finished marking the live objects in the old generation regions, the heap enters the final phase of its life cycle: mixed collections.



During this phase, the normal Eden regions -> survivor regions and survivor regions -> old generation regions copies continue as usual. However, these minor young generation GCs are now mixed with the collection of old generation regions. In each mixed collection, the JVM additionally copies the live objects from a batch of old generation regions compactly into a small number of old generation regions, and so turns more and more old generation regions back into empty slots.

This process continues until most of the old generation regions have been turned back into empty slots; the few old generation regions left are full of live objects and not worth collecting further.
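The idea of emptying the regions with the least live data first can be sketched as follows. This is a toy model, not G1's actual collection set selection: MixedCollectionSketch, the per-region live percentages, the liveness threshold and the batch size are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class MixedCollectionSketch {
    /** Picks old regions to evacuate, emptiest first, skipping regions that are mostly live. */
    static List<Integer> chooseCollectionSet(int[] livePercentPerRegion,
                                             int liveThresholdPercent, int maxRegions) {
        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < livePercentPerRegion.length; i++) {
            if (livePercentPerRegion[i] < liveThresholdPercent) candidates.add(i);
        }
        candidates.sort(Comparator.comparingInt(i -> livePercentPerRegion[i]));
        return new ArrayList<>(candidates.subList(0, Math.min(maxRegions, candidates.size())));
    }

    /** Regions gained = regions evacuated minus the regions needed to hold their live data. */
    static int regionsFreed(int[] livePercentPerRegion, List<Integer> collectionSet) {
        int live = 0;
        for (int index : collectionSet) live += livePercentPerRegion[index];
        int destinationRegions = (live + 99) / 100;   // ceil(live / 100)
        return collectionSet.size() - destinationRegions;
    }

    public static void main(String[] args) {
        int[] live = {10, 95, 30, 5, 90, 20};         // hypothetical live data per old region, in percent
        List<Integer> set = chooseCollectionSet(live, 85, 4);
        System.out.println("evacuate regions " + set + ", net regions freed: " + regionsFreed(live, set));
    }
}
```

Regions 1 and 4 in the example are left alone because they are almost entirely live, which matches the remark that such regions are not worth collecting further.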

The heap is now reborn: the JVM ends the Mixed Collections phase and starts the Young Collections phase, the beginning of a new cycle.

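As you can see, in the 3 phases of the heap life cycle the regions are updated incrementally, there are always regions available for creating new objects, and marking runs concurrently with the application. Young and mixed collections still stop the application, but each pause handles only a subset of the regions, so there is normally no long stop-the-world collection of the whole heap. That is the reason G1GC has the lowest pause time among the 3 JVM GC options.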
