Request for reviews (M): 6614597: Performance variability in jvm2008 xml.validation

Vladimir Kozlov Vladimir.Kozlov at Sun.COM
Mon Feb 1 16:42:16 PST 2010

I was paranoid about having too_many_traps check for monomorphic case
since as described in the original big comment we may miss the transition
from monomorphic to bimorphic if counters was not updated before compilation.

So I added the compilation assert for such case and everything passed (JPRT)
except xml.transform in specjvm2008 which has method with a lot of calls
to the same small methods which are inlined. One such method has 4 class_check
traps (it was recompiled once).

As result of this method inlining accumulative (per method) class_check trap
count reaches 100:

<observe trap='class_check' count='4' total='100'/>

and too_many_traps() returns 'true' even if an other call site does not have trap:

<bc code='185' bci='1190'/>
<call method='959' count='6701' prof_factor='1' virtual='1' inline='1' receiver='870' receiver_count='6701'/>
<call method='1106' count='6701' prof_factor='1' inline='1'/>
<observe trap='class_check' count='0' mcount='0' ccount='100'/>
<predicted_call bci='1190' klass='870'/>
<uncommon_trap bci='1190' reason='null_check' action='maybe_recompile'/>
<virtual_call bci='1190'/>

It seems to me, it is overkill. I know why we do this - to avoid a lot of deoptimizations.
But may be we should apply such restriction only for traps not recorded per bytecode?
Or keep track of which trap count is used for accumulative count and avoid this case?

Anyway it is edge case and we may look on it later.
For now I am pushing my changes with too_many_traps().


Tom Rodriguez wrote:
> I would be happy with your current change but just adding in the per bci too many traps check for the appropriate Reason_class_check or Reason_bimorphic.  I think its clearer and will work slightly better.
> tom
> On Jan 29, 2010, at 10:37 AM, Vladimir Kozlov wrote:
>> Actually I am also not sure what would be the best.
>> Refworkload shows no difference for any case.
>> Yes, I don't need Reason_bimorphic but it could help to avoid possible
>> one deoptimization (transition bimorphich to polimorphic).
>> Also we can separate uncommon traps in logs.
>> I can remove it and use its place for Reason_predicate (which should be per bytecode trap).
>> The check too_many_traps() also not needed BUT, as you pointed,
>> if we have Reason_bimorphic we should use it to avoid
>> ONE additional deoptimization and recompilation.
>> Thanks,
>> Vladimir
>> Tom Rodriguez wrote:
>>> I'm a little confused.  I thought you'd only need either Reason_bimorphic or to adjust the way counters are handled, not both?  I'm not against having both since it rationalizes the counters but then why aren't you checking too_many_traps any more?  You've got the counter so why ignore it?  Otherwise you're exposed to the race that's mentioned in the big comment you deleted where the mdo trap count is updated but not the call site counts.
>>> tom
>>> On Jan 28, 2010, at 11:55 AM, Vladimir Kozlov wrote:
>>>> Here is new webrev
>>>> Tom, you are right. I just need to remove too_many_traps() check
>>>> and rely only on counters. We may get an additional deoptimization
>>>> and recompilation in the next rare case but I think it is fine.
>>>> There is time interval between the method deoptimization and
>>>> when interpreter continue execution of the method and increment
>>>> total counter for the call bytecode indicating polimorphic case.
>>>> During this interval an other thread can trigger compilation and
>>>> compiler thread will load MDO which has not updated total counter.
>>>> The parser will still see bimorphic case and generate bimorphic code.
>>>> And I will use Reason_bimorphic to separate traps.
>>>> Thanks,
>>>> Vladimir
>>>> Tom Rodriguez wrote:
>>>>> I assumed you intended to use the state of the counters to distinguish mono from bi, i.e. if the total count is greater than rec1 + rec2 then bimorphic can't work otherwise give it a try.  I agree is might be more clear if Reason_bimorphic were used.  Your changes to make it per bci look fine.
>>>>> tom
>>>>> On Jan 27, 2010, at 6:45 PM, Vladimir Kozlov wrote:
>>>>>> ....!!! I need Reason_bimorphic even with my changes for call counters.
>>>>>> Otherwise uncommon trap will not be generated for bimorphic case
>>>>>> if the first compilation was with monomorphic case after which
>>>>>> we hit class_check trap and recorded it in MDO for this bytecode.
>>>>>> I forgot about it and found again during testing of latest changes.
>>>>>> And I want to have too_many_traps() check per bytecode instead of current per method.
>>>>>> So I am going to place Reason_bimorphic before Reason_unloaded to record it
>>>>>> per bytecode. I think the next changes should be enough.
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>> @@ -33,12 +33,15 @@ class Deoptimization : AllStatic {
>>>>>> enum DeoptReason {
>>>>>>   Reason_many = -1,             // indicates presence of several reasons
>>>>>>   Reason_none = 0,              // indicates absence of a relevant deopt.
>>>>>> +    // Next 7 reasons are recorded per bytecode in DataLayout::trap_bits
>>>>>>   Reason_null_check,            // saw unexpected null or zero divisor (@bci)
>>>>>>   Reason_null_assert,           // saw unexpected non-null or non-zero (@bci)
>>>>>>   Reason_range_check,           // saw unexpected array index (@bci)
>>>>>>   Reason_class_check,           // saw unexpected object class (@bci)
>>>>>>   Reason_array_check,           // saw unexpected array class (aastore @bci)
>>>>>>   Reason_intrinsic,             // saw unexpected operand to intrinsic (@bci)
>>>>>> +    Reason_bimorphic,             // saw unexpected object class in bimorphic inlining (@bci)
>>>>>> +
>>>>>>   Reason_unloaded,              // unloaded class or constant pool entry
>>>>>>   Reason_uninitialized,         // bad class state (uninitialized)
>>>>>>   Reason_unreached,             // code is not reached, compiler
>>>>>> @@ -49,7 +52,7 @@ class Deoptimization : AllStatic {
>>>>>>   Reason_predicate,             // compiler generated predicate failed
>>>>>>   Reason_LIMIT,
>>>>>>   // Note:  Keep this enum in sync. with _trap_reason_name.
>>>>>> -    Reason_RECORDED_LIMIT = Reason_unloaded   // some are not recorded per bc
>>>>>> +    Reason_RECORDED_LIMIT = Reason_bimorphic  // some are not recorded per bc
>>>>>>   // Note:  Reason_RECORDED_LIMIT should be < 8 to fit into 3 bits of
>>>>>>   // DataLayout::trap_bits.  This dependency is enforced indirectly
>>>>>>   // via asserts, to avoid excessive direct header-to-header dependencies.
>>>>>> @@ -279,7 +282,7 @@ class Deoptimization : AllStatic {
>>>>>>                                      int trap_state);
>>>>>> static bool reason_is_recorded_per_bytecode(DeoptReason reason) {
>>>>>> -    return reason > Reason_none && reason < Reason_RECORDED_LIMIT;
>>>>>> +    return reason > Reason_none && reason <= Reason_RECORDED_LIMIT;
>>>>>> }
>>>>>> Vladimir Kozlov wrote:
>>>>>>> Thank you, Tom
>>>>>>> Then I will go ahead to implement it.
>>>>>>> Vladimir
>>>>>>> Tom Rodriguez wrote:
>>>>>>>> I think you're right that part of the problem is that even though we maintain 3 counters, the total is supposed to be related to the other two but since they aren't updated atomically they can actually be out of sync by an almost arbitrary amount.  So I think you're suggesting that instead of maintaining rec1, rec2 and total, we should maintain rec1, rec2 and other.  The site count will be the sum of rec1, rec2 and other.  Another nice thing about this would be that we'd only be updating a single counter for the call site instead of normally doing two.  The other way to handle this would be to set TypeProfileWidth to 3.  Either way, if the profile itself could reliably indicate more than 2 receiver types we wouldn't need to have a new Reason.  I like the idea of a solution like this.
>>>>>>>> tom
>>>>>>>> On Jan 25, 2010, at 10:24 AM, Vladimir Kozlov wrote:
>>>>>>>>> Reason_bimorphic will not cover the case when total counter
>>>>>>>>> is larger then sum of 2 receivers in real bimorphic case.
>>>>>>>>> What I am saying is currently the only reliable information
>>>>>>>>> these counters can give us is monomorphic case. We can not
>>>>>>>>> relay on them (in current implementation) to distinguish
>>>>>>>>> bimorphic case from polimorphic case. This is why Reason_bimorphic
>>>>>>>>> helps. If we use counters differently we don't need Reason_bimorphic.
>>>>>>>>> Vladimir
>>>>>>>>> John Rose wrote:
>>>>>>>>>> On Jan 22, 2010, at 10:06 PM, Vladimir Kozlov wrote:
>>>>>>>>>>> The third solution (sorry they are still popping up) will be using
>>>>>>>>>>> the total counter in MDO only for missing (third receiver) cases.
>>>>>>>>>>> It will allow to keep current type profile width and distinguish
>>>>>>>>>>> bimorphic case from polimorphic.
>>>>>>>>>> I like Tom's idea of mapping Reason_bimorphic to an unrelated reason, such as Reason_range_check.
>>>>>>>>>> -- John

More information about the hotspot-compiler-dev mailing list