RFR: 8201436: Replace oop_ps_push_contents with oop_iterate and closure
leo.korinth at oracle.com
Thu Oct 4 16:20:15 UTC 2018
This is the first enhancement in a series of three to remove Parallel GC
specific behaviour from shared code. I will post all three enhancements
to the mailing list.
JDK-8201436 Replace oop_ps_push_contents with oop_iterate and closure
JDK-8211446 Replace oop_pc_follow_contents with oop_iterate and closure
JDK-8211447 Replace oop_pc_update_pointers with oop_iterate and closure
The webrevs are to be applied in the same order as in the above list
although logically they are quite independent of each other.
The aim is to remove Parallel GC specific behaviour by removing
oop_ps_push_contents, oop_pc_follow_contents and oop_pc_update_pointers
from the *Klass hierarchy in oops/*Klass.hpp. Instead of using the old
dynamic dispatch on the Klass object, the generic oop_iterate* (used by
all other GCs) is used instead.
- removing GC specific implementation from shared code. Better
separation, less #ifdefs
- reusing the same iterators everywhere, using the same mechanism to
iterate in the different GCs
- reduction in code
Most of the benchmarks looks good, they show about the same performance
before and after the changes. I have had serious problems with the Derby
benchmark that did not like the way the generic reference handling was
visiting objects. Changing the order in oop_oop_iterate_reverse
introduced regressions in G1. Therefore I have added new
oop_oop_iterate_reverse specialisations to speed up Parallel (reduce
regression) without harming G1.
Ideas of things to do after this commit series:
- minor improvements in oop_oop_iterate_reverse (removing return of size
that is not used anywhere)
- remove dynamic dispatch call to check if klass is of typeArray "kind"
(!obj->klass()->is_typeArray_klass()). The empty action on type arrays
is already handled and this extra check seems mostly confusing and
possibly a small unnecessary cost.
- possibly move creation of push closure out of push_contents, and by
that win some performance (the new closure now has to set a reference
- all patches build on linux.
- mach5 (linux, windows and mac) tier1,tier2,tier3,tier4,tier5 has
passed with all three patches applied
- aurora tests have been run and seem to indicate no change in general,
and only a small regression in Derby. The last (Derby) run showed a
young collection regression of 1.02% on Linux and 2.76% improvement on
windows, but including experience from earlier runs, my estimate is that
it is in reality a bit worse and that we have a _slight_ but acceptable
regression in Derby.
More information about the hotspot-gc-dev