RFR (S) 8131682: C1 should use multibyte nops everywhere
dean.long at oracle.com
Tue Jul 21 20:28:12 UTC 2015
This version looks good.
On 7/20/2015 7:51 AM, Aleksey Shipilev wrote:
> Hi Dean,
> Thanks for taking a look!
> Silly me, I should have left the call patching cases intact, because
> you're right, we should be able to patch the nops partially while still
> producing the correct instruction stream. Therefore, I reverted the
> cases where we do nop-ing for *instruction* patching, and added the
> comment there.
> Other places seem to use the nop sequences to provide the alignment, not
> for the general patching. Especially interesting for us is the case of
> aligning the patchable immediate in the existing call. C2 does the nops
> in these cases.
> New webrev:
> * JPRT -testset hotspot on open platforms;
> * Targeted benchmarks, plus eyeballing the assembly;
> On 18.07.2015 10:51, Dean Long wrote:
>> I think we should distinguish the different uses and treat them differently:
>> 1) padding nops for patching, executed
>> We need to be careful about inserting a fat nop here: if later patching
>> overwrites only part of the fat nop, the result can be an illegal instruction.
>> 2) padding nops for patching, never executed
>> It should be safe to insert a fat nop here, but there's no point if the
>> nops are not reachable and never executed.
>> 3) alignment nops, never patched, executed
>> Fat nops are fine, but on some CPUs branching may be even better, so I
>> suggest using align() for this, and letting align() decide what to
>> generate. The change in check_icache() could use a version of align
>> that takes the target offset as an argument:
>>     align(CodeEntryAlignment, __ offset() + ic_cmp_size);
>> 4) alignment nops, never patched, never executed
>> Doesn't matter what we emit here, but we might as well make it
>> understandable by humans using a debugger.
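The arithmetic behind case 3 can be sketched as follows. This is only an illustration of "align so that a later offset lands on the boundary"; padding_for_target is a hypothetical helper, not HotSpot's align(), and the parameter names are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of HotSpot): returns how many padding bytes
// to emit at `offset` so that (offset + padding + delta) is a multiple of
// `alignment`. `alignment` must be a power of two.
std::size_t padding_for_target(std::size_t offset,
                               std::size_t alignment,
                               std::size_t delta) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const std::size_t target = offset + delta;          // position that must end up aligned
    const std::size_t misalignment = target & (alignment - 1);
    return (alignment - misalignment) & (alignment - 1); // 0 when already aligned
}
```

With delta standing in for something like ic_cmp_size, the caller pads by the returned amount and is free to choose how to fill those bytes (fat nops, or a branch over them).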
>> I believe the patching nops in c1_CodeStubs_x86.cpp and
>> c1_LIRAssembler.cpp are patched concurrently while the code is running,
>> not at a safepoint, so it's not clear to me if it's safe to use fat nops
>> on x86. I would consider those changes unsafe on x86 without further
>> analysis of what happens during patching.
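The partial-patching concern can be shown at the byte level. The sketch below is not HotSpot code and not an x86 decoder: it only checks which bytes are left behind after a patch. The function names and the 2-byte patch are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Padding made of n single-byte nops (0x90).
std::vector<std::uint8_t> single_byte_padding(std::size_t n) {
    return std::vector<std::uint8_t>(n, 0x90);
}

// One 5-byte x86 nop: nopl 0x0(%rax,%rax,1).
std::vector<std::uint8_t> fat_nop5() {
    return {0x0F, 0x1F, 0x44, 0x00, 0x00};
}

// Overwrites the first patch.size() bytes of `code` (taken by value) and
// reports whether every byte left after the patch is still a standalone
// single-byte nop. Byte comparison only; nothing is decoded.
bool tail_is_plain_nops_after_patch(std::vector<std::uint8_t> code,
                                    const std::vector<std::uint8_t>& patch) {
    assert(patch.size() <= code.size());
    std::copy(patch.begin(), patch.end(), code.begin());
    const auto tail = code.begin() + static_cast<std::ptrdiff_t>(patch.size());
    return std::all_of(tail, code.end(),
                       [](std::uint8_t b) { return b == 0x90; });
}
```

Patching two bytes over five single-byte nops leaves three valid nops behind. Patching the same two bytes over the 5-byte nop leaves 44 00 00, the tail of the old encoding, which no longer means "nop" on its own.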
>> On 7/17/2015 6:29 AM, Aleksey Shipilev wrote:
>>> Hi there,
>>> C1 is not very good at inlining and intrinsifying methods, and hence the
>>> call performance is important there. One nit that we can see in the
>>> generated code on x86 is that C1 uses the single-byte nops, even for
>>> long nop strides.
>>> This improvement fixes that:
>>> - JPRT -testset hotspot on open platforms
>>> - eyeballing the generated assembly with -XX:TieredStopAtLevel=1
>>> (I understand the symmetric change is going to be needed in closed
>>> parts, but let's polish the open part first).
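For readers unfamiliar with the idea, here is a minimal sketch of filling a nop stride with multibyte nops instead of single-byte ones. It is not the code from the webrev and not HotSpot's Assembler::nop(), which picks encodings per CPU. The table uses the recommended multi-byte NOP sequences from the Intel manual for lengths 1 to 8; emit_fat_nops and Padding are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Recommended x86 nop encodings, indexed by (length - 1), lengths 1..8.
static const std::vector<std::vector<std::uint8_t>> kNops = {
    {0x90},
    {0x66, 0x90},
    {0x0F, 0x1F, 0x00},
    {0x0F, 0x1F, 0x40, 0x00},
    {0x0F, 0x1F, 0x44, 0x00, 0x00},
    {0x66, 0x0F, 0x1F, 0x44, 0x00, 0x00},
    {0x0F, 0x1F, 0x80, 0x00, 0x00, 0x00, 0x00},
    {0x0F, 0x1F, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},
};

struct Padding {
    std::vector<std::uint8_t> bytes;  // emitted padding bytes
    std::size_t instructions;         // how many nop instructions they form
};

// Greedily covers n bytes with the longest available nop encodings.
Padding emit_fat_nops(std::size_t n) {
    Padding p{{}, 0};
    while (n > 0) {
        const std::size_t len = n < kNops.size() ? n : kNops.size();
        const std::vector<std::uint8_t>& enc = kNops[len - 1];
        p.bytes.insert(p.bytes.end(), enc.begin(), enc.end());
        ++p.instructions;
        n -= len;
    }
    return p;
}
```

An 11-byte stride becomes two instructions (8 + 3 bytes) rather than eleven single-byte nops, which is the effect to look for when eyeballing the assembly.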