RFR: 8204540: Automatic oop closure devirtualization

Stefan Karlsson stefan.karlsson at oracle.com
Wed Jun 20 10:50:00 UTC 2018

Hi all,

Please review this patch to get rid of the macro based oop_iterate 
devirtualization layer, and replace it with a new implementation based 
on templates that automatically determines when the closure function 
calls can be devirtualized.


The ExtendedOopClosure is the base class of GC closures that are usable 
by the oopDesc::oop_iterate and friends. The oop_iterate functions call 
the implementation of the virtual ExtendedOopClosure::do_oop, of the 
passed in closure. As part of this patch, this closure is renamed 
OopIterateClosure to aid readers of the code to better understand the 
purpose of this closure.

Today there exists a devirtualization layer, which looks at the static 
type of the closure and turns the call to the virtual do_oop into 
direct, inlinable calls. This used to be fully implemented by macros, 
but was partly replaced by templates in JDK-8075955. Because 
oopDesc::oop_iterate calls the *virtual* Klass::oop_oop_iterate 
functions, it's non-trivial to pass the closure type all the way through 
to the do_oop function calls. Therefore, overloads of 
Klass::oop_oop_iterate for all devirtualized closures were generated by 

The generation of these overloads is finicky and requires the developer 
to split the closure up into different files and get them registered 
correctly in the specialized_oops_closure.hpp.

It also means that we generate implementations of some of the 
oop_oop_iterate for all closures, even when they are not used. There's 
even code to exclude compilation of some of these functions when only 
SerialGC is built.

I propose a new way to automatically devirtualize these do_oop calls 
when the compiler detects that static closure type is known to have the 
do_oop function implemented.

This will get rid of all these macros, get rid of the 
*_specialized_oop_closure.* files, stop generating GC specific overloads 
into oopDesc and the *Klass classes, get rid of virtual calls for more 
closures, remove the _nv suffixes and template parameters.

For this to work we need to impose a contract/convention that no 
OopIterateClosure sub class overrides an implementation of do_oop in any 
of it's ancestors. Going forward, when we move to a newer C++ standard, 
we'll be able to mark the do_oop functions as final, to prevent 
accidental overrides. There might also be a way to automatically detect 
such breaches with template metaprogramming, but that has not been 

The discussion above has been about the do_oop functions, but this also 
applies to the metadata functions as well (do_metadata, do_cld, do_klass).

The proposed patch implements one dispatch table per OopIterateClosure 
(and oop_iterate variant) that gets used and devirtualized. Each such 
table has one entry per sub class of Klass. All these tables get 
generated during compile time, and installed when the static 
initializers are run. This gives us a single-call dispatch over the 
static type of the oop closure type and the dynamic type of the Klass 
sub class.

The oop iteration functions also dispatch on UseCompressedOops, and the 
patch replaces those checks with an initialization-time replacement of 
the dispatch table, which installs the correct implementation (with or 
without compressed oops), making this a single-call dispatch over three 
dimensions. This is similar to how the RuntimeDispatch is done for the 
Access API.

With this RFE the developers will be able to do the following to get 
devirtualized the code in do_oop inlineable into *Klass::oop_oop_iterate:

class MarkingClosure : public MetadataVisitingOopIterateClosure {
   virtual void do_oop(oop* p) { marking(p); }
   virtual void do_oop(narrowOop* p) { marking(p); }
MarkingClosure cl;

I've described the mechanism in more detail in iterate.inline.hpp. See 
the two larger comment blocks in:

This has been tested with mach5 tier{1,2,3,4,5,6,7}.

There are a few cleanup changes that would be good to handle after this 
change. For example:
* Minimize OopIterateClosure and get rid of idempotent()
* Don't automatically call verify() from oop_oop_iterate, but let the 
closures do their own verification.
* Fix include cycle between functions in objArrayKlass.inline.hpp and 
* Remove (or at least rename) NoHeaderExtendedOopClosure and 
* Remove the UseCompressedOops checks that were pushed to the Parallel 
GC object visitors (oop_pc_* and oop_ps_* functions). This is already 
being worked on in JDK-8201436.


More information about the hotspot-dev mailing list