RFR: JDK-8221766: Load-reference barriers for Shenandoah
rkennke at redhat.com
Tue Apr 2 21:12:40 UTC 2019
(I am cross-posting this to build-dev and compiler-dev because this
contains some (trivial-ish) shared build and C2 changes. The C2 changes
are almost all reversals of Shenandoah-specific paths that have been
introduced in initial Shenandoah push.)
I would like to propose that we switch to what we came to call 'load
reference barrier' as new barrier scheme for Shenandoah GC.
The main difference is that instead of ensuring correct invariant when
we store anything into the heap (e.g. read-barrier before reads,
write-barrier before writes, plus a bunch of other stuff), we ensure the
strong invariance on objects when they get loaded, by employing what is
currently our write-barrier.
The reason why I'm proposing it is:
- simpler barrier interface
- easier to get good performance out of it
==> good for upcoming Graal (sup)port
- reduced maintenance burden (I intend to backport it all the way)
This has a number of advantages:
- Strong invariant means it's a lot easier to reason about the state of
GC and objects
- Much simpler barrier interface. Infact, a lot of stuff that we added
to barrier interfaces after JDK11 will now become unused: no need for
barriers on primitives, no need for object equality barriers, no need
for resolve barriers, etc. Also, some C2 stuff that we added for
Shenandoah can now be removed again. (Those are what comprise most
shared C2 changes.)
- Optimization is much easier: we currently put barriers 'down low'
close to their uses (which might be inside a hot loop), and then work
hard to optimize barriers upwards, e.g. out of loops. By using
load-ref-barriers, we would place them at the outermost site already.
Look how much code is removed from shenandoahSupport.cpp!
- No more need for object equals barriers.
- No more need for 'resolve' barriers.
- All barriers are now conditional, which opens up opportunity for
further optimization later on.
- we can re-enable the fast JNI getfield stuff
- we no longer need the nmethod initializer that initializes embedded
oops to to-space
- We no longer have the problem to use two registers for 'the same'
value (pre- and post-barrier).
The 'only' optimizations that we do in C2 are:
- Look upwards and see if barrier input indicates we don't actually need
the barrier. Would be the case for: constants, nulls, method parameters,
etc (anything that is not like a load). Even though we insert barriers
after loads, you'd be surprised to see how many loads actually disappear.
- Look downwards to check uses of the barrier. If it doesn't feed into
anything that requires a barrier, we can remove it.
Performance doesn't seem to be negatively impacted at all. Some
benchmarks benefit positively from it.
Testing: Testing: hotspot_gc_shenandoah, SPECjvm2008, SPECjbb2015, all
of them many times. This patch has baked in shenandoah/jdk for 1.5
months, undergone our rigorous CI, received various bug-fixes, we have
had a close look at the generated code to verify it is sane. jdk/submit
job expected good before push.
Can I please get reviews for this change?
More information about the build-dev