RFR(S): 8225670: compiler/types/correctness/* tests fail with "assert(recv == __null || recv->is_klass()) failed: wrong type"
erik.osterlund at oracle.com
Fri Aug 9 09:21:17 UTC 2019
Looks good - well spotted.
To answer your question - yes the GC (ZGC in particular, and probably
soon Shenandoah when they hook in to the concurrent class unloading
framework) cleans the extra data section of MDOs concurrently, under the
extra data lock.
However, they clear whole rows from the extra data section, under the
extra data lock of the MDOs; they never write that the Klass is NULL. So
I believe this bug only relates to the use of the WhiteBox API.
The row clearing of concurrent GCs synchronizes with a metadata
preparation phase for unpacking MDOs to ciMDOs. The preparation phase
will in a fixed-point iteration try to create ci handles for all
encountered metadata under the extradata lock. Every time it encounters
an uncached metadata instance, it has to release the lock due to ranking
issues, and may also run into safepoints then. Such situations are
detected, triggering a restart of the fixed-pont iteration.
Once the fixed-point iteration has finished, we know that we under the
lock walked all metadata in the extra data section without ever
releasing the lock, have ci handles keeping all metadata alive, and
can't have gotten any safepoints due to being in VM state. After that,
the rows are copied and translated, and now we are guaranteed that the
translation will always already have the ci handles cached.
There is some random original copy of the raw MDO extra data that is
performed before preparing the metadata. I don't think it is really used
or needed. Might be interesting to remove in a future RFE. It gets
overwritten by the subsequent row-by-row processing after metadata
On 2019-08-08 13:38, Christian Hagedorn wrote:
> Please review the following patch:
> As I reproduced the bug with , recv still pointed to an InstanceKlass
> object at some point which was not translated to a ciInstanceKlass and
> therefore the is_klass() check was false which made the assert fail. The
> problem can be traced back to the concurrent forced clearing of method
> data  using the whitebox API while compilation uses this profile data
> to create a ciProfileData wrapper. The profile data, lets say 'pd', is
> first completely copied  and then translated  to a ci version
> 'cipd'. Therefore, 'cipd' initially only contains InstanceKlass entries.
> During this translation, we read instanceKlasses from 'pd'  and
> translate them into ciInstanceKlasses to store them in 'cipd'. However,
> while translating,  can delete a non-NULL InstanceKlass entry in 'pd'
> by setting it to NULL. As a result, the non-NULL check  fails and
> nothing is updated in 'cipd'. 'cipd' still contains a non-translated
> non-NULL InstanceKlass entry from 'pd' which later triggers the
> assertion failure.
> The fix is straight forward to also clear an entry in the ciProfileData
> object if a klass is NULL in the ProfileData object. One question
> remains that I could not figure out yet: Can the method profile data be
> cleared while a method is compiled or is this only a problem specific to
> this test using a forced clear through the whitebox API?
> Best regards,
More information about the hotspot-compiler-dev