Dynamic G1 barrier elision for C2 in young

Erik Österlund erik.osterlund at lnu.se
Sat Jun 6 00:44:31 UTC 2015

Hi guys,

Making G1 run faster on GC-tuned applications that are designed to only
rarely spill objects into old seems like an interesting and important
optimization goal at the moment.

Today I tried an interesting experiment. I sample garbage during the
sweeping phase (phase 2) of System.gc() (G1MarkSweep), which stumbles
through garbage anyway, hoping to find classes whose instances are
used all the time but /never/ make it into old. Then I deoptimize these
classes and recompile the nmethods that depend on them, eliding the G1
write barriers (in C2). If the GC eventually needs to promote any of
these objects to old, I just deoptimize again and recompile with G1
barriers turned back on.
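
To make the elide/restore cycle concrete, here is a toy Java model of the
bookkeeping. It is only a sketch and not the HotSpot patch: the names
YoungOnlyTracker, onSampledAsGarbage and onPromotedToOld are made up for
illustration, and a "deoptimization" is just a counter here.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not HotSpot code) of the per-class barrier state described above. */
final class YoungOnlyTracker {
    enum BarrierMode { FULL, ELIDED }

    private final Map<String, BarrierMode> modes = new HashMap<>();
    private int deoptimizations = 0;

    /** Classes start out compiled with the full G1 barriers. */
    BarrierMode modeOf(String className) {
        return modes.getOrDefault(className, BarrierMode.FULL);
    }

    /** Garbage sampling saw dead instances of this class: elide the barriers. */
    void onSampledAsGarbage(String className) {
        if (modeOf(className) == BarrierMode.FULL) {
            modes.put(className, BarrierMode.ELIDED);
            deoptimizations++; // dependent nmethods would be recompiled here
        }
    }

    /** The GC has to promote an instance to old: turn the barriers back on. */
    void onPromotedToOld(String className) {
        if (modeOf(className) == BarrierMode.ELIDED) {
            modes.put(className, BarrierMode.FULL);
            deoptimizations++; // dependent nmethods would be recompiled here
        }
    }

    int deoptimizations() {
        return deoptimizations;
    }
}
```

Whether a class may be optimized again after it has been promoted once is
a policy choice; this sketch simply allows it.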

On DaCapo, it paid off very well for a few benchmarks that supposedly
use many temporary objects:
fop: -9.2% time <- this one was brutal!!
xalan: -6.9% time
jython: -5.9% time

Each benchmark ran 50 iterations in total: 40 warmup iterations, and the
reported result is the average of the following 10 iterations. Class
unloading was turned off (using my own patch to make -Xnoclassgc work,
because it seems to be broken currently) and the heap size was 512M.
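
For anyone who wants to redo the arithmetic, a minimal sketch of how such
numbers can be computed follows. The class and method names are
hypothetical, and the percentage formula (change relative to the baseline
mean) is my assumption about what "-9.2% time" means.

```java
import java.util.Arrays;

/** Hypothetical helper: average the post-warmup iterations and compare two runs. */
final class BenchmarkStats {
    /** Mean of all iteration times after the first `warmup` iterations. */
    static double meanAfterWarmup(double[] iterationMillis, int warmup) {
        if (warmup < 0 || warmup >= iterationMillis.length) {
            throw new IllegalArgumentException("need at least one measured iteration");
        }
        return Arrays.stream(iterationMillis, warmup, iterationMillis.length)
                     .average()
                     .getAsDouble();
    }

    /** Relative change in percent; negative means the candidate is faster. */
    static double percentChange(double baselineMillis, double candidateMillis) {
        return (candidateMillis - baselineMillis) / baselineMillis * 100.0;
    }
}
```

With 50 recorded iteration times per run, meanAfterWarmup(times, 40)
gives the average of the last 10, and percentChange compares the two
means.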

The G1 barriers are already optimized to be faster for young objects, but
if the GC finds out that certain types of objects /never/ get old, telling
the compiler so allows complete elision of both the pre and post barriers
from the code, which is nice.
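
As a reminder of what gets removed, here is a simplified Java model of the
two barriers as they are usually described: a SATB pre-barrier and a
card-marking post-barrier with its young-card filter. It is a sketch, not
the code C2 emits; addresses are plain longs with 0 standing for null, and
the region size and card encodings are illustrative assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Simplified model of a G1 reference store; not the code generated by C2. */
final class G1BarrierModel {
    static final int LOG_REGION_BYTES = 20; // assumed 1 MiB regions
    static final int LOG_CARD_BYTES = 9;    // 512-byte cards
    static final byte CLEAN = 0, DIRTY = 1, YOUNG = 2; // made-up encodings

    final byte[] cardTable;
    final Deque<Long> satbQueue = new ArrayDeque<>();
    final Deque<Integer> dirtyCardQueue = new ArrayDeque<>();
    boolean markingActive = false;

    G1BarrierModel(long heapBytes) {
        cardTable = new byte[(int) (heapBytes >>> LOG_CARD_BYTES)];
    }

    /** Pre-barrier: while marking, remember the value being overwritten. */
    void preBarrier(long oldValue) {
        if (markingActive && oldValue != 0) {
            satbQueue.add(oldValue);
        }
    }

    /** Post-barrier: record cross-region stores, skipping young cards. */
    void postBarrier(long fieldAddr, long newValue) {
        if (newValue == 0) return;                                      // null store
        if (((fieldAddr ^ newValue) >>> LOG_REGION_BYTES) == 0) return; // same region
        int card = (int) (fieldAddr >>> LOG_CARD_BYTES);
        if (cardTable[card] == YOUNG) return;                           // young fast path
        // The real barrier issues a StoreLoad fence at this point.
        if (cardTable[card] != DIRTY) {
            cardTable[card] = DIRTY;
            dirtyCardQueue.add(card);
        }
    }

    /** A reference store; with elide == true both barriers are skipped. */
    void storeReference(long fieldAddr, long oldValue, long newValue, boolean elide) {
        if (!elide) preBarrier(oldValue);
        // The actual store to the field would happen here.
        if (!elide) postBarrier(fieldAddr, newValue);
    }
}
```

Even on the young fast path the store still pays for the null check, the
region check and the card load; the elided variant pays for none of them.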

Are we conceptually interested in such a solution, potentially accompanied
by a flag like -XX:+G1DynamicallyOptimizeYoung? Thought I'd check if I
can get some feedback before going too far with this.

Here is the code I used.

Patch 1: -Xnoclassgc

This just fixes an issue where -Xnoclassgc doesn't work properly with G1
(unfortunately I have yet to get the bug system to work so I can report
it...). With this JVM flag, no class unloading should happen. I had to run
my experiments without class unloading because it killed the optimized
nmethods of the almost-always-dead objects I want to optimize in DaCapo,
because DaCapo does not retain their class loaders or something.

Patch 2: Dynamic G1 barrier elision

This is where the interesting stuff went, if anyone is interested. It is
just a very basic prototype/concept to check whether the approach seems
interesting to you guys. You would probably want to deoptimize less (only
if there are fields to actually optimize/deoptimize, which means keeping
track of that more accurately), to sample garbage outside of System.gc()
(that was just a convenience for now), and to be more accurate about which
class declared a field rather than using the canonical class, etc.

Any comments are welcome.


More information about the hotspot-gc-dev mailing list