[15] RFR: 8248048: ZGC: AArch64: SIGILL in load barrier register spilling

Andrew Dinn adinn at redhat.com
Fri Jun 26 10:05:35 UTC 2020

Hi Stefan,

Yes, nice catch. zr is clearly the wrong choice here. In the context of
an FP register it ends up being interpreted as q31 which, as you show,
clashes when r31 is the last register in an odd register set.

Your fix looks ok to me (so count it as reviewed).

Just for your interest, another alternative would be to continue to use
stpq instructions but replace zr with an fp register that is a) in the
save set but b) guaranteed to differ from the last register,-- the
obvious choice being regs[0]. That will work in all cases where count >=
2 (well 3 actually since the problem only arises in odd cases). So, you
would still need to special case count == 0/1 and only add regs[0] to
the set after handling those cases.

I'm agnostic over which of these two is better as I don't think either a
stpq or a strq pre/post is preferable to the other.


Andrew Dinn
Red Hat Distinguished Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill

On 26/06/2020 09:50, Stefan Karlsson wrote:
> Hi all,
> Please review this patch to fix a ZGC load barrier register spilling bug.
> https://cr.openjdk.java.net/~stefank/8248048/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8248048
> The JVM crashed with an ILL_ILLOPC when executing this instruction in
> our load barrier stub:
>   ldp q31, q31, [sp, #224]
> The entire load barrier stub:
>    0x0000ffff998ab964:    stp    x10, x13, [sp, #-32]!
>    0x0000ffff998ab968:    stp    x14, x17, [sp, #16]
>    0x0000ffff998ab96c:    stp    q1, q2, [sp, #-256]!
>    0x0000ffff998ab970:    stp    q3, q19, [sp, #32]
>    0x0000ffff998ab974:    stp    q20, q21, [sp, #64]
>    0x0000ffff998ab978:    stp    q22, q23, [sp, #96]
>    0x0000ffff998ab97c:    stp    q24, q25, [sp, #128]
>    0x0000ffff998ab980:    stp    q26, q28, [sp, #160]
>    0x0000ffff998ab984:    stp    q29, q30, [sp, #192]
>    0x0000ffff998ab988:    stp    q31, q31, [sp, #224]
>    0x0000ffff998ab98c:    sub    x1, x10, #0x0
>    0x0000ffff998ab990:    mov    x0, x11
>    0x0000ffff998ab994:    mov    x8, #0xfc28                    // #64552
>    0x0000ffff998ab998:    movk    x8, #0xaf11, lsl #16
>    0x0000ffff998ab99c:    movk    x8, #0xffff, lsl #32
>    0x0000ffff998ab9a0:    blr    x8 ; branch into the JVM
>    0x0000ffff998ab9a4:    mov    x11, x0
>    0x0000ffff998ab9a8:    ldp    q3, q19, [sp, #32]
>    0x0000ffff998ab9ac:    ldp    q20, q21, [sp, #64]
>    0x0000ffff998ab9b0:    ldp    q22, q23, [sp, #96]
>    0x0000ffff998ab9b4:    ldp    q24, q25, [sp, #128]
>    0x0000ffff998ab9b8:    ldp    q26, q28, [sp, #160]
>    0x0000ffff998ab9bc:    ldp    q29, q30, [sp, #192]
> => 0x0000ffff998ab9c0:    ldp    q31, q31, [sp, #224]
>    0x0000ffff998ab9c4:    ldp    q1, q2, [sp], #256
>    0x0000ffff998ab9c8:    ldp    x14, x17, [sp, #16]
>    0x0000ffff998ab9cc:    ldp    x10, x13, [sp], #32
>    0x0000ffff998ab9d0:    b    0xffff998aa718
> It seems to be illegal to use the same register twice when loading into
> a pair of registers. I verified that that was the problem, and not the
> usage of zr (see below) that caused some weird encoding, by changing the
> code to always generate stp/ldp with the same register:
> => 0x0000ffff757d22fc:    ldp    q20, q20, [sp, #32] ; Crash here as well
>    0x0000ffff757d2300:    ldp    q21, q21, [sp, #48]
>    0x0000ffff757d2304:    ldp    q22, q22, [sp, #64]
> The code that generates this instruction is MacroAssembler::push_fp,
> which spills the necessary registers in pairs with stp/ldp calls. If the
> number of registers to spill is odd it needs to deal with one of the
> registers separately. This is done by adding a dummy register here:
> 2136 regs[count++] = zr->encoding_nocheck();
> 2137 count &= ~1; // Only push an even number of regs
> This scheme seems to work for the normal registers
> (MacroAssembler::push), but the usage of zr seems dubious when we're
> dealing with the fp/simd version of stp/ldp.
> My proposed patch replaces the stp/ldp for the odd numbered register
> with the single-register versions: str/ldr. I make sure to keep the
> stack 16 bytes aligned by still bumping 16 bytes, but skipping the
> store/load to the second 8 bytes half.
> Note that right now MacroAssembler::push_fp is only used by ZGC.
> This fixes the crash. I've run this code through jtreg groups :tier1,
> tier2, tier3, and an Oracle-internal stress suite without any new problems.
> The smallest reproducer I have is:
> make -C ../build/fastdebug test TEST=test/jdk/java/util/concurrent/
> Does this look OK?
> Thanks,
> StefanK

More information about the hotspot-gc-dev mailing list