RFR: 8187033: [PPC] Improve performance of ObjectStreamClass.getClassDataLayout()

Kazunori Ogata OGATAK at jp.ibm.com
Fri Sep 1 06:53:23 UTC 2017

Hi Aleksey and Hans,

Thank you for your comments.  I'll measure how much the Unsafe approach
improves performance.

I now think the approach of using a final field, which Aleksey mentioned
in his first reply, is a good one.  I checked the JIT-generated code.  It
puts MEMBAR-store-store and MEMBAR-release before storing the
ClassDataSlot array to the final field "slots", and there is no MEMBAR
when reading dataLayout or dataLayout.slots.  Since dataLayout is read far
more often than it is written, I think it is preferable to put all of the
overhead on the writing side.  But I'll also see how performance changes
with an lwsync on the read side.

    private DataLayout dataLayout;

    /**
     * Class to ensure elements of ClassDataSlot[] are visible to other
     * threads. The "final" qualifier of the variable slots is necessary.
     */
    private static class DataLayout {
        final ClassDataSlot[]  slots;
        DataLayout(ClassDataSlot[] s) {
            slots = s;
        }
    }

    ClassDataSlot[] getClassDataLayout() throws InvalidClassException {
        // REMIND: synchronize instead of relying on volatile?
        if (dataLayout == null) {
            ClassDataSlot[]  slots = getClassDataLayout0();
            dataLayout = new DataLayout(slots);
        }
        return dataLayout.slots;
    }
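
For reference, here is a self-contained sketch of the final-field idea
described above.  The names FinalWrapperDemo, Slot, Layout and
computeLayout() are hypothetical stand-ins (for ObjectStreamClass,
ClassDataSlot, DataLayout and getClassDataLayout0()), not the code under
review.  One assumption differs from the snippet above: the racy field is
read once into a local, because the Java memory model does not guarantee
that two unsynchronized reads of the same non-volatile field agree.

```java
public class FinalWrapperDemo {
    // Hypothetical stand-in for ObjectStreamClass.ClassDataSlot.
    static final class Slot {
        final String name;
        Slot(String name) { this.name = name; }
    }

    // Wrapper with a final field: a thread that sees a Layout reference
    // also sees the array elements stored before the constructor returned.
    static final class Layout {
        final Slot[] slots;
        Layout(Slot[] s) { slots = s; }
    }

    // Deliberately non-volatile; several threads may race to initialize it,
    // in which case each computes an equivalent layout.
    private Layout layout;

    // Hypothetical stand-in for getClassDataLayout0().
    private Slot[] computeLayout() {
        return new Slot[] { new Slot("java.lang.Object"), new Slot("Example") };
    }

    Slot[] getLayout() {
        Layout l = layout;          // single racy read
        if (l == null) {
            l = new Layout(computeLayout());
            layout = l;             // barriers are paid here, on the write side
        }
        return l.slots;
    }

    public static void main(String[] args) {
        FinalWrapperDemo d = new FinalWrapperDemo();
        Slot[] first = d.getLayout();
        Slot[] second = d.getLayout();
        System.out.println(first == second);  // prints true: cached after first call
    }
}
```

The reader path is a plain load plus a dependent load, which matches the
goal of keeping all barrier cost on the writing side.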


From:   Aleksey Shipilev <shade at redhat.com>
To:     Hans Boehm <hboehm at google.com>
Cc:     Kazunori Ogata <OGATAK at jp.ibm.com>, core-libs-dev <core-libs-dev at openjdk.java.net>
Date:   2017/09/01 15:14
Subject:        Re: RFR: 8187033: [PPC] Improve performance of ObjectStreamClass.getClassDataLayout()

On 08/31/2017 09:39 PM, Hans Boehm wrote:
>> I guess you can make VarHandle.fullFence() between
>> getClassDataLayout0() and storing it to the non-volatile field...
> ... with the usual warning about this sort of thing:
> According to the JMM, it's not guaranteed to work, because the
> reader-side guarantees are not there. In practice, you're relying on
> dependency-based ordering, which the compiler is currently unlikely
> to break in this case. But future implementations may.


> I presume the real concern here is the cost on the reader side?
> Presumably that could be reduced with a VarHandle getAcquire(). I
> believe that eliminates the heavy-weight sync, and just keeps an
> lwsync. Imperfect, but better.

Oh right! This is exactly why acq/rel exist. Since OSC is a heavily used
class, the Unsafe counterparts might be better:

private static final Unsafe U = ...;
private static final long CDS_OFFSET = U.objectFieldOffset(...);

// volatile for safety of naked reads
private volatile ClassDataSlot[] dataLayout;

ClassDataSlot[] getClassDataLayout() throws InvalidClassException {
  // getObjectAcquire returns Object, hence the cast
  ClassDataSlot[] slots = (ClassDataSlot[]) U.getObjectAcquire(this, CDS_OFFSET);
  if (slots == null) {
    slots = getClassDataLayout0();
    U.putObjectRelease(this, CDS_OFFSET, slots);
  }
  return slots;
}

Ogata, could you please check whether that indeed helps on POWER?
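
For comparison, the same acquire/release pattern can be sketched with the
public VarHandle API that Hans mentions, instead of the internal Unsafe.
This is only an illustration under stated assumptions: the names
AcquireReleaseDemo, Slot and computeLayout() are hypothetical stand-ins,
and nothing here says which variant performs better on POWER.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class AcquireReleaseDemo {
    // Hypothetical stand-in for ObjectStreamClass.ClassDataSlot.
    static final class Slot {
        final String name;
        Slot(String name) { this.name = name; }
    }

    private static final VarHandle LAYOUT;
    static {
        try {
            LAYOUT = MethodHandles.lookup()
                    .findVarHandle(AcquireReleaseDemo.class, "layout", Slot[].class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Kept volatile, as in the Unsafe sketch, so that any direct ("naked")
    // access stays safe; this class itself only goes through LAYOUT.
    private volatile Slot[] layout;

    // Hypothetical stand-in for getClassDataLayout0().
    private Slot[] computeLayout() {
        return new Slot[] { new Slot("java.lang.Object"), new Slot("Example") };
    }

    Slot[] getLayout() {
        // Acquire load: weaker (and typically cheaper) than a volatile read.
        Slot[] slots = (Slot[]) LAYOUT.getAcquire(this);
        if (slots == null) {
            slots = computeLayout();
            // Release store: array contents are published before the reference.
            LAYOUT.setRelease(this, slots);
        }
        return slots;
    }

    public static void main(String[] args) {
        AcquireReleaseDemo d = new AcquireReleaseDemo();
        System.out.println(d.getLayout() == d.getLayout());  // prints true
    }
}
```

As with the Unsafe version, racing threads may each compute the layout
once; the sketch assumes that is acceptable because the results are
equivalent.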

