RFR: 8248048: ZGC: AArch64: SIGILL in load barrier register spilling
stefan.karlsson at oracle.com
Fri Jun 26 08:50:53 UTC 2020
Please review this patch to fix a ZGC load barrier register spilling bug.
The JVM crashed with an ILL_ILLOPC when executing this instruction in
our load barrier stub:
ldp q31, q31, [sp, #224]
The entire load barrier stub:
0x0000ffff998ab964: stp x10, x13, [sp, #-32]!
0x0000ffff998ab968: stp x14, x17, [sp, #16]
0x0000ffff998ab96c: stp q1, q2, [sp, #-256]!
0x0000ffff998ab970: stp q3, q19, [sp, #32]
0x0000ffff998ab974: stp q20, q21, [sp, #64]
0x0000ffff998ab978: stp q22, q23, [sp, #96]
0x0000ffff998ab97c: stp q24, q25, [sp, #128]
0x0000ffff998ab980: stp q26, q28, [sp, #160]
0x0000ffff998ab984: stp q29, q30, [sp, #192]
0x0000ffff998ab988: stp q31, q31, [sp, #224]
0x0000ffff998ab98c: sub x1, x10, #0x0
0x0000ffff998ab990: mov x0, x11
0x0000ffff998ab994: mov x8, #0xfc28 // #64552
0x0000ffff998ab998: movk x8, #0xaf11, lsl #16
0x0000ffff998ab99c: movk x8, #0xffff, lsl #32
0x0000ffff998ab9a0: blr x8 ; branch into the JVM
0x0000ffff998ab9a4: mov x11, x0
0x0000ffff998ab9a8: ldp q3, q19, [sp, #32]
0x0000ffff998ab9ac: ldp q20, q21, [sp, #64]
0x0000ffff998ab9b0: ldp q22, q23, [sp, #96]
0x0000ffff998ab9b4: ldp q24, q25, [sp, #128]
0x0000ffff998ab9b8: ldp q26, q28, [sp, #160]
0x0000ffff998ab9bc: ldp q29, q30, [sp, #192]
=> 0x0000ffff998ab9c0: ldp q31, q31, [sp, #224]
0x0000ffff998ab9c4: ldp q1, q2, [sp], #256
0x0000ffff998ab9c8: ldp x14, x17, [sp, #16]
0x0000ffff998ab9cc: ldp x10, x13, [sp], #32
0x0000ffff998ab9d0: b 0xffff998aa718
It seems to be illegal to use the same register twice when loading into
a pair of registers. I verified that that was the problem, and not the
usage of zr (see below) that caused some weird encoding, by changing the
code to always generate stp/ldp with the same register:
=> 0x0000ffff757d22fc: ldp q20, q20, [sp, #32] ; Crash here as well
0x0000ffff757d2300: ldp q21, q21, [sp, #48]
0x0000ffff757d2304: ldp q22, q22, [sp, #64]
The code that generates this instruction is MacroAssembler::push_fp,
which spills the necessary registers in pairs with stp/ldp calls. If the
number of registers to spill is odd it needs to deal with one of the
registers separately. This is done by adding a dummy register here:
2136 regs[count++] = zr->encoding_nocheck();
2137 count &= ~1; // Only push an even number of regs
This scheme seems to work for the normal registers
(MacroAssembler::push), but the usage of zr seems dubious when we're
dealing with the fp/simd version of stp/ldp.
My proposed patch replaces the stp/ldp for the odd numbered register
with the single-register versions: str/ldr. I make sure to keep the
stack 16 bytes aligned by still bumping 16 bytes, but skipping the
store/load to the second 8 bytes half.
Note that right now MacroAssembler::push_fp is only used by ZGC.
This fixes the crash. I've run this code through jtreg groups :tier1,
tier2, tier3, and an Oracle-internal stress suite without any new problems.
The smallest reproducer I have is:
make -C ../build/fastdebug test TEST=test/jdk/java/util/concurrent/
Does this look OK?
More information about the hotspot-gc-dev