Request for reviews (M): 6987135: Performance regression on Intel platform with 32-bits edition between 6u13 and 6u14.

Vladimir Kozlov vladimir.kozlov at
Mon Nov 1 11:40:07 PDT 2010

Thank you, Tom


Tom Rodriguez wrote:
> That looks good.
> tom
> On Oct 29, 2010, at 5:14 PM, Vladimir Kozlov wrote:
>> Tom suggested using a second TEMP register instead of push/pop. Here is the new webrev:
>> Vladimir
>> Vladimir Kozlov wrote:
>>> Thank you, Tom
>>> I updated the webrev:
>>> Tom Rodriguez wrote:
>>>> On Oct 22, 2010, at 6:25 PM, Vladimir Kozlov wrote:
>>>>> Fixed 6987135: Performance regression on Intel platform with 32-bits edition between 6u13 and 6u14.
>>>>> The changes for 6603011 added the conversion of long
>>>>> division by a constant into code that uses multiply.
>>>>> But some modern CPUs have improved DIV instruction
>>>>> performance. Use DIV for long division by a constant
>>>>> when it is faster than the multiply-based code.
>>>> The formats in don't match the code.
>>> Fixed.
>>>> In modL_eReg_imm32, why can't the value be 0 or -1?
>>> There are Ideal transformations for such divisor values, so
>>> DivL and ModL will never reach the matcher with such a divisor.
>>> Asserts verify this.
>>>> Why don't you use an immL definition that ensures that?  If imm is MININT then the pcon calculation will go wrong.
>>> Matcher::use_asm_for_ldiv_by_con() has a check for MININT.
>>> I added a verification check to the asserts.
>>>> I believe you could do the register declarations like this:
>>> I updated both ModL and DivL code to use only dst.
>>> Thanks,
>>> Vladimir
>>>> + instruct modL_eReg_imm32( eADXRegL dst, eRegL src, immL32 imm, eRegI tmp, eFlagsReg cr ) %{
>>>> +   match(Set dst (ModL src imm));
>>>> +   effect(TEMP dst, TEMP tmp, KILL cr );
>>>> to leave the src and tmp unbound which would give the RA a little more freedom.  Actually wouldn't connecting src and dst directly result in fewer moves in the normal case?  You might need a new temp but it seems like there are quite a few moves of src into dst for the idivl.
>>>> + instruct modL_eReg_imm32( eADXRegL dst, immL32 imm, eSIRegI tmp, eFlagsReg cr ) %{
>>>> +   match(Set dst (ModL dst imm));
>>>> +   effect( KILL cr );
>>>> tom
>>>>> Tested on US3, T1, T2, Sparc64, and the latest AMD and Intel CPUs.
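
For readers unfamiliar with the transformation discussed in the thread, here is a minimal sketch of long division by a constant done with a multiply instead of a divide. It is not the code HotSpot generates: the divisor 10, the class name DivByConstSketch, and the method names are all assumptions chosen for illustration, and the magic multiplier and shift are the well-known values for signed 64-bit division by 10. The sketch only shows that the two forms compute the same quotient; which one is faster depends on the CPU, which is the point of the fix above.

```java
final class DivByConstSketch {
    // Assumed example constants: multiplier and shift for signed 64-bit division by 10.
    private static final long MAGIC_10 = 0x6666666666666667L;
    private static final int SHIFT_10 = 2;

    // Multiply-based form: take the high 64 bits of the 128-bit product,
    // shift, then add 1 for negative inputs so the result rounds toward zero.
    static long divBy10Multiply(long n) {
        long q = Math.multiplyHigh(n, MAGIC_10) >> SHIFT_10; // requires Java 9+
        return q + (n >>> 63);
    }

    // Direct form: a plain division, which the compiler may emit as a DIV instruction.
    static long divBy10Plain(long n) {
        return n / 10;
    }

    public static void main(String[] args) {
        long[] samples = {0L, 1L, 9L, 10L, 100L, -1L, -9L, -10L, -100L,
                          Long.MAX_VALUE, Long.MIN_VALUE};
        for (long n : samples) {
            if (divBy10Multiply(n) != divBy10Plain(n)) {
                throw new AssertionError("mismatch for " + n);
            }
        }
        System.out.println("multiply-based and direct division agree");
    }
}
```

On a 32-bit platform the 64-bit high multiply itself has to be built from several 32-bit multiplies, so the multiply-based form is not automatically the cheaper one.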

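The exchange about divisors -1 and MININT can be illustrated with plain Java arithmetic. This is a sketch only: the class and method names are hypothetical, and it assumes that the pcon value mentioned in the thread is the absolute value of the constant divisor, which the thread does not state. It shows why divisors 1 and -1 need no real division, and why taking the absolute value of MININT in 32-bit arithmetic goes wrong.

```java
final class SpecialDivisorSketch {
    // Hypothetical guard: a path that needs |divisor| as a positive 32-bit
    // value cannot handle Integer.MIN_VALUE, because its negation overflows.
    static boolean absFitsInInt(int divisor) {
        return divisor != Integer.MIN_VALUE;
    }

    static void check(boolean condition, String what) {
        if (!condition) {
            throw new AssertionError(what);
        }
    }

    public static void main(String[] args) {
        long x = 1234567890123L;

        // Divisors 1 and -1: the quotient is x or -x and the remainder is 0,
        // so these cases can be folded away before instruction selection.
        check(x / 1 == x, "x / 1");
        check(x / -1 == -x, "x / -1");
        check(x % 1 == 0, "x % 1");
        check(x % -1 == 0, "x % -1");

        // MININT: the 32-bit absolute value wraps and stays negative.
        check(Math.abs(Integer.MIN_VALUE) == Integer.MIN_VALUE, "abs(MININT) wraps");
        // Widening to long first gives the mathematically expected value.
        check(Math.abs((long) Integer.MIN_VALUE) == 2147483648L, "abs as long");

        check(!absFitsInInt(Integer.MIN_VALUE), "MININT rejected");
        check(absFitsInInt(-7), "ordinary negative divisor accepted");
        System.out.println("special divisor checks passed");
    }
}
```
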