RFR: JDK-8251397 NPE on ClassValue.ClassValueMap.cacheArray
paul.sandoz at oracle.com
Wed Aug 19 15:57:59 UTC 2020
Thanks for doing the fix and the work verifying.
I also verified using a fence fixes the jcstress test. Similarly, I found I could only reproduce the issue in HotSpot when running a test more like ClassValue (the double checked locking pattern when publishing to a plan field) with the options -XX:+StressLCM -XX:+StressGCM. I analyzed assembler output (-XX:+PrintOptoAssembly) to observe the publish store occurring before some stores to fields. It’s highly likely that with HotSpot we got lucky in this case :-)
I don’t object to the use of a release fence (LoadStore|StoreStore), but I believe we could also use a StoreStore fence in this case. Perhaps the ARM folks have a stronger opinion on this?
I marginally prefer placing the fence in the initializeMap, since the relationship between construction and publish is very clear. In either case I recommend adding a comment on why the fence is required e.g.:
// Place a Store fence as the last operation of the constructor to emulate
// ClassValueMap containing final fields. This ensures it can be
// published safely in the non-volatile field Class.classValueMap, (see initializeMap)
// since stores to the fields of ClassValueMap will not be reordered
// to occur after the store to the field type.classValueMap
I also noticed we can change the field Class.classValueMap to be @Stable, but I think we should follow up on that investigation separately. Relatedly, I had an idea to modify HotSpot so it places a StoreStore fence directly before the store to a @Stable field.
> On Aug 19, 2020, at 4:53 AM, Galder Zamarreno <galder at redhat.com> wrote:
> Hi all,
> I've created patch  to fix the ClassValue$ClassValueMap NPE bug in .
> The bug has been discussed by other members in the list in thread .
> The patch follows the simple fix suggested by Doug and others in that
> exchange, e.g. . That is, it adds a release fence
> to ClassValue$ClassValueMap constructor to avoid the NPE.
> To verify the fix, I ran the jcstress test that Paul posted in  and
> played around with the difference fixes suggested in the thread. Adding the
> release fence did indeed fix the jcstress test.
> To further verify the issue, I've successfully run both the tier1 tests and
> the Quarkus native testsuite with a Mandrel 20.1 built with JDK 11.0.8
> version patched with the fix (higher JDKs not supported yet). Note that
> this NPE happens on rare occasions.
> The patch applies cleanly to JDK 11.
>  Webrev:
>  Bug: https://bugs.openjdk.java.net/browse/JDK-8251397
More information about the core-libs-dev