Dynamic G1 barrier elision for C2 in young
vitalyd at gmail.com
Thu Jun 11 13:26:01 UTC 2015
I think this is a neat idea. There's a class of java programs that don't
trigger any GC while up (or very few GCs at specific points in time).
Typically these are the same types of apps that pool objects on free lists,
and thus incur write barrier overhead for no actual gain. FWIW, I'd
welcome such elision but would also like to see it for the parallel
sent from my phone
On Jun 8, 2015 6:00 AM, "Erik Österlund" <erik.osterlund at lnu.se> wrote:
> Since this concerns compiler too, I decided to CC hotspot-dev.
> Den 06/06/15 02:44 skrev Erik Österlund <erik.osterlund at lnu.se>:
> >Hi guys,
> >Making G1 run faster on GC-tuned applications that are designed to only
> >rarely spill objects into old, seems like an interesting and important
> >optimization goal at the moment.
> >Today I tried an interesting experiment. I sample garbage during the
> >sweeping phase (phase 2) of System.gc() (G1MarkSweep) that stumbles
> >through garbage anyway, hoping to find classes with instances that are
> >used all the time, but /never/ make it into old. Then I deoptimize these
> >classes and recompile the relevant nmethods depending on the class to
> >elide the G1 write barriers (in C2). If the GC eventually needs to promote
> >any of these objects to old, I just deoptimize again and recompile with G1
> >barriers turned back on.
> >On some DaCapo benchmarks, it payed off very well for a few benchmarks
> >that supposedly use many temporary objects:
> >fop: -9.2% time <- this one was brutal!!
> >xalan: -6.9% time
> >jython: -5.9% time
> >Results were measured with 40 warmup iterations, and then computed the
> >average of the following 10 iterations, so 50 iterations in total. Class
> >unloading was turned off (using my own patch to make -Xnoclassgc work,
> >because it seems to be broken currently) and 512M heaps.
> >The G1 barriers are already optimized to be faster for young objects, but
> >if the GC finds out that certain types of objects /never/ get old, telling
> >the compiler so allows complete elision of both the pre and post barriers
> >from the code which is nice.
> >Are we conceptually interested in such a solution, potentially accompanied
> >with a flag like -XX:+G1DynamicallyOptimizeYoung? Thought I¹d check if I
> >can get some feedback before going too far with this.
> >Here is the code I used.
> >Patch 1: -Xnoclassgc
> >This just fixes an issue that -Xnoclassgc doesn¹t work properly using G1
> >(unfortunately I have yet to get the bug system work to report it...).
> >With this JVM flag, it should not do class unloading. I had to run my
> >experiments without class unloading because it killed the optimized
> >nmethods of the almost always dead objects I want to optimize in DaCapo,
> >because DaCapo does not retain their class loaders or something.
> >Patch 2: Dynamic G1 barrier elision
> >This is where the interesting stuff went if anyone is interested. This is
> >just a very basic prototype/concept to check if the approach seems
> >interesting to you guys. You probably want to add stuff like deoptimizing
> >less (only if there are fields to actually optimize/deoptimize - keep
> >track of that more accurately), and to sample garbage outside of
> >System.gc() - this was just a convenience for now, and being more accurate
> >with which class declared a field, not the canonical class, etc.
> >Any comments are welcome.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the hotspot-gc-dev