RFR (M) CR 8050147: StoreLoad barrier interferes with stack usages
dave.dice at oracle.com
Mon Aug 11 12:29:46 UTC 2014
On 2014-8-11, at 4:18 AM, Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:
> On 08/08/2014 11:38 PM, John Rose wrote:
>> On Aug 8, 2014, at 5:01 AM, Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:
>>> On 08/08/2014 02:57 AM, John Rose wrote:
>>>>> Vladimir pointed where to look for frame structure, but I
>>>>> still haven't parsed it to make an educated guess about how much to
>>>>> step back. Any ideas?
>>>> Callee saves will get spilled in the general spill area, IIRC. That
>>>> will be near the callee SP, which is unpredictable.
>>>> I don't see any area in the generic frame layout which is reliably
>>>> better than SP - CLSize. Maybe SP - MaxTinyImmediateOffset, or the
>>>> min of the two.
>>> Okay, let's go with this one then:
>> Why 8+CLSize (40/72/136) instead of just CLSize (32/64/128)?
>> Is there usually something hovering at sp(-8), like a frequently pushed temp?
> The original experiment was taken without any knowledge if SP was
> aligned to >8 or not. If 8-byte read from SP(0) splits the cache line,
> then 8-byte read from SP(-CL) also splits the cache line *and* shares it
> with SP(0). Additional 8-byte push back was to dodge this. But, it
> appears the SP is aligned to 16 bytes?
IIRC all compilers keep at least 8-byte alignment, but that number has been increasing in order to better accommodate SSE/MMX auto variables. Hopefully we’ll never see cases where a misaligned atomic straddles two underlying cache lines. Intel supports this for legacy operations, but the hardware effectively quiesces the system in order to do it. See https://blogs.oracle.com/dave/entry/qpi_quiescence. We could always switch to a locked byte add, but that can cause stalls if we re-access the same location shortly afterwards with a wider (non-byte) access.
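To make the cache-line arithmetic in this thread concrete, here is a minimal C sketch. It is only an illustration, not HotSpot code: the helper names (line_of, splits_line, shares_line) are hypothetical, and it assumes a power-of-two line size and treats addresses as plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Index of the cache line containing addr.
 * Assumption: line_size is a nonzero power of two (e.g. 32, 64, 128). */
uintptr_t line_of(uintptr_t addr, size_t line_size) {
    assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
    return addr / line_size;
}

/* True if an access of `size` bytes starting at addr straddles two lines. */
bool splits_line(uintptr_t addr, size_t size, size_t line_size) {
    return line_of(addr, line_size) != line_of(addr + size - 1, line_size);
}

/* True if the `size`-byte accesses at a and b touch at least one common line. */
bool shares_line(uintptr_t a, uintptr_t b, size_t size, size_t line_size) {
    uintptr_t a_lo = line_of(a, line_size);
    uintptr_t a_hi = line_of(a + size - 1, line_size);
    uintptr_t b_lo = line_of(b, line_size);
    uintptr_t b_hi = line_of(b + size - 1, line_size);
    return a_lo <= b_hi && b_lo <= a_hi;
}
```

With a 64-byte line and a hypothetical misaligned SP at line offset 60, the 8-byte accesses at SP(0) and SP(-64) both split and share a line, while SP(-72) does neither. With a 16-byte-aligned SP, SP(-64) neither splits nor shares a line with SP(0), which is why plain CLSize would suffice in that case.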
> If so, we can go with this: