RFR(XS): 8227605: Kitchensink fails "assert((((klass)->trace_id() & (JfrTraceIdEpoch::leakp_in_use_this_epoch_bit())) != 0)) failed: invariant"
Daniel D. Daugherty
daniel.daugherty at oracle.com
Wed Jul 31 12:54:45 UTC 2019
On 7/31/19 5:48 AM, Markus Gronlund wrote:
> Hi David,
> Thank you for taking a look (yet again).
> About 4:
> "... so it seems to me either all the load-acquires should go, or else the load-acquires all stay and we add the missing release-store."
> Yes and thank you for spotting, I had missed taking out the load-acquire in the set_bits_cas_form.
> Here is an updated webrev: http://cr.openjdk.java.net/~mgronlun/8227605/webrev02/
Nice work here!
I'll have to take another look at my own use of load_acquire() in the
face of cmpxchg() in my lock free monitor list changeset. :-)
> -----Original Message-----
> From: David Holmes
> Sent: den 31 juli 2019 07:36
> To: Markus Gronlund <markus.gronlund at oracle.com>; hotspot-jfr-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: Re:  RFR(XS): 8227605: Kitchensink fails "assert((((klass)->trace_id() & (JfrTraceIdEpoch::leakp_in_use_this_epoch_bit())) != 0)) failed: invariant"
> Hi Markus,
> On 31/07/2019 7:04 am, Markus Gronlund wrote:
>> Kindly asking for reviews for the following changeset:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8227605
>> Webrev: http://cr.openjdk.java.net/~mgronlun/8227605/webrev01/
>> Clearing a bit that was set in a previous epoch should be done using CAS not to lose information in the current (this) epoch. This has also been the case up to the changes done in relation to Memory Leak Profiler, where the bit tagging scheme and implementation changed quite substantially. Part of the modifications done there had set_traceid_mask() to not use CAS unfortunately. This is the reason for the assertion, as information about the current (this) epoch was lost.
>> We need to restore set_traceid_mask() to use CAS the way it was done originally.
> AFAICS you have:
> 1. Refactored the CAS code using a template function to avoid code duplication
> That seems okay.
> 2. Changed SET_LEAKP_USED_PREV_EPOCH to use SET_LEAKP_TAG_CAS
> Okay that ensures CAS is used to update the epoch as per your summary.
> 3. Modified set_mask to use CAS
> Okay - as per summary
> 4. Removed use of load_acquire in the non-CAS form of set_bits
> This raises some queries about the use of OrderAccess in this code. On the one hand I might expect the load-acquire to be necessary to ensure correct ordering with respect to the implicit release_store of the CAS form. But if we expect correct interaction between the CAS and non-CAS forms then set_bits should itself be using a release-store. Further, given set_bits is not using a release-store, the initial load-acquire in the CAS form is not necessary. So it seems to me either all the load-acquires should go, or else the load-acquires all stay and we add the missing release-store.
>> Thanks to Erik Gahlin for debugging.
More information about the hotspot-runtime-dev