RFR (L): 8060025: Object copy time regressions after JDK-8031323 and JDK-8057536
thomas.schatzl at oracle.com
Wed Dec 3 15:28:35 UTC 2014
On Wed, 2014-12-03 at 13:41 +0100, Thomas Schatzl wrote:
> Hi all,
> I would like to have reviews for the following change that improves
> object copy time after we noticed performance regressions after the
> changes in JDK-8031323 (alignment of survivor objects) and JDK-8057536
> (context specific allocations).
> In conjunction with JDK-8064473 (Improved handling of age during object
> copy) the changes improve object copy time by ~8% on x64/linux and ~7%
> on SPARC/solaris on SPECjbb2005.
> There are no particular improvements on the scores though as there is
> very little GC work done.
> There seems to be some overall performance gain on CRM Fuse.
> The changes include:
> - merging of the FastCSetTable table with the GCAllocPurpose into a
> table of in_cset_state_t. Each element not only contains information
> about whether the region is humongous or not, but also what generation
> it belongs to if it is in the collection set.
> The encoding has been selected to allow good instruction encoding of
> commonly used checks (e.g. in collection set or not, is humongous).
> GCAllocPurpose has been removed.
> - factor out PLAB allocation as the fast path, separate from the other
> types of allocation. A few methods have been renamed to (imo) make the
> various stages clearer (i.e. the methods are not all called "allocate"
> any more :))
> - use a per-ParThreadScanState tenuring threshold.
> - only calculate object age if required.
> - some additional direct use of markOop contents instead of accessing
> via the oop (like in JDK-8064473).
> - manually extract some common subexpressions from the code that are not
> obvious to the compiler.
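
To illustrate the first and fourth items in the list above, here is a
minimal, self-contained sketch. It is not taken from the webrev: the
concrete values, the InCSetTable struct, next_state() and the mark word
age layout in age_of() are all assumptions made for illustration. The
idea is that with "not in collection set" encoded as zero, each common
check is a single comparison against zero, and the object age is only
extracted when the source region is young.

```cpp
#include <cstdint>

// Hypothetical stand-in for the in_cset_state_t described above.
// The values are an assumption, chosen so that every common check
// is one comparison of a signed byte against zero.
typedef int8_t in_cset_state_t;

const in_cset_state_t Humongous = -1;  // humongous region
const in_cset_state_t NotInCSet =  0;  // the common case
const in_cset_state_t Young     =  1;  // in collection set, young gen
const in_cset_state_t Old       =  2;  // in collection set, old gen

inline bool is_in_cset(in_cset_state_t s)              { return s > NotInCSet; }
inline bool is_humongous(in_cset_state_t s)            { return s < NotInCSet; }
inline bool is_in_cset_or_humongous(in_cset_state_t s) { return s != NotInCSet; }

// Hypothetical table with one entry per heap region; the region index
// is derived from the address with a subtract and a shift.
struct InCSetTable {
  const in_cset_state_t* entries;
  uintptr_t heap_base;
  unsigned  region_shift;

  in_cset_state_t at(uintptr_t addr) const {
    return entries[(addr - heap_base) >> region_shift];
  }
};

// Assumed mark word layout: a 4-bit age field starting at bit 3.
inline unsigned age_of(uintptr_t mark) { return (mark >> 3) & 0xF; }

// Destination for a copied object. The age is only read from the mark
// word when the source region is young; objects from old regions skip
// the age calculation entirely. The threshold is passed in by the
// caller, standing in for a per-thread copy of the tenuring threshold.
inline in_cset_state_t next_state(in_cset_state_t s, uintptr_t mark,
                                  unsigned tenuring_threshold) {
  if (s == Young && age_of(mark) < tenuring_threshold) {
    return Young;  // stays in the young generation (survivor)
  }
  return Old;      // tenured, or was old already
}
```

A caller would look up the state once per reference, bail out early on
NotInCSet, and only then load the mark word and ask next_state() where
to copy the object.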
> There is no change in functionality, and the survivor alignment check
> still has some minor performance impact. However imo these changes in
> total outweigh its impact, so further attempts to factor this out (e.g.
> templatizing) do not seem to have a good cost/benefit ratio.
> We may still want to create an RFE that deals with that in a separate
> change. There is enough good change in this change already to warrant
> separate CRs if needed.
> This work is largely based on changes from Tony Printezis at Twitter who
> coincidentally has been working on this issue at the same time, and has
> then been tweaked further (Thanks a lot!). Extensive performance testing
> of many variants (of which this seems to be the best) has been performed
> on internal test systems.
> Tony reported even better improvements on some microbenchmarks on the
> original version of the change.
> As mentioned, unless the application is somewhat GC and object copy
> heavy, there will not be much impact.
I was made aware that I messed up the CR# in the link: it's
https://bugs.openjdk.java.net/browse/JDK-8060025 . Not sure how I got to