Making Garbage Collection faster

Short of avoiding garbage collection altogether, there is only one way to make garbage collection faster: ensure that as few objects as possible are reachable during the garbage collection. The fewer objects that are alive, the less there is to be marked. This is the rationale behind the generational heap.

The Generation Conflict – Young vs. Old

In a typical application most objects are very short-lived. On the other hand, some objects last for a very long time and even until the application is terminated. When using generational garbage collection, the heap area is divided into two areas—a young generation and an old generation—that are garbage-collected via separate strategies.

Objects are ussually created in the young area. Once an object has survived a couple of GC cycles it is tenured to the old generation. (The JRockit and IBM JVM make exceptions for very large objects, as I will explain later.) After the application has completed its initial startup phase (most applications allocate caches, pools, and other permanent objects during startup), most allocated objects will not survive their first or second GC cycle. The number of live objects that need to be considered in each cycle should be stable and relatively small.

Allocations in the old generation should be infrequent, and in an ideal world would not happen at all after the initial startup phase. If the old generation is not growing and therefore not running out of space, it requires no garbage-collection at all. There will be unreachable objects in the old generation, but as long as the memory is not needed, there is no reason to reclaim it.

To make this generational approach work, the young generation must be big enough to ensure that all temporary objects die there. Since the number of temporary objects in most applications depends on the current application load, the optimal young generation size is load-related. Therefore, sizing the young generation, known as generation-sizing, is the key to achieving peak load.

Unfortunately, it is often not possible to reach an optimal state where all objects die in the young generation, and so the old generation will often often require a concurrent garbage collector. Concurrent garbage collection together with a minimally growing old generation ensures that the unavoidable, stop-the-world events will at least be very short and predictable.

On the other hand, while there is a very high number of allocations in the young generation at the beginning of each GC cycle, there is only a small portion of objects alive after each GC cycle. This leads to a high level of fragmentation using the GC strategies we have discussed so far. You might think that using free lists would be a good option, but this will slow down allocations. An alternative strategy of executing a full compaction every time has a negative effect on pause time. Instead, most JVMs implement a strategy in the young generation, known as copy collection.

When Copying is Faster than Marking

Copy garbage collection divides the heap into two (or more) areas, only one of which is used for allocations. When this area is full, all live objects are copied to the second area, and then the first area is simply declared empty (see Figure 2.7).

Instead of sweeping away garbage and compacting the heap, the copy collection simply copies live objects someplace else and declares the old region empty

Figure 2.7: Instead of sweeping away garbage and compacting the heap, the copy collection simply copies live objects someplace else and declares the old region empty.

The advantage here is that no fragmentation occurs, and thus there is no need for free lists and compaction. Allocations are always fast and the GC algorithm is simple. This strategy is effective only if most objects die, which is the default in the young generation. However, this can lead to a problem when the JVM is executing a high number of concurrent transactions.

If the young generation is too small, objects are tenured prematurely to the old generation. If the young generation is too large, too many objects are alive (undead) and the GC cycle will take too long. Contrary to what most people think, these young-generation GCs, often termed minor GCs, are full of stop-the-world events. Their negative impact on response time can be more severe than the occasional old-generation GC.

Not surprisingly, the generational heap does not provide a silver-bullet solution to the garbage-collection problem. The optimal configuration is often a compromise, with an adequately sized young generation to avoid overly long minor GCs, and a concurrent GC in the old generation to deal with prematurely tenured objects.

The Case for a Non-generational Heap

The Oracle HotSpot JVM uses a generational heap exclusively, while Oracle JRockit also supports a non-generational heap, and IBM WebSphere defaults to a non-generational heap and recommends that JVMs smaller than 100 MB always use a non-generational heap. In terms of both CPU and memory, a generational GC and the associated copy collection have a certain overhead, which makes sense.

If your application is designed for pure throughput and the number of temporary objects is relatively small, a non-generational GC has advantages. A full parallel GC has a better tradeoff in terms of CPU usage if you don't care about the response time of a single transaction. On the other hand, if the number of temporary objects is relative small, a concurrent GC in a non-generational heap might do the job and with less suspension time than a generational GC. Only a thorough performance test can determine this for sure.

Java Tech Discussion

Pages

Tuesday, 19 August 2014