RFR: JDK-8203172: Primitive heap access for interpreter BarrierSetAssembler/aarch64

Roman Kennke rkennke at redhat.com
Mon Jun 4 19:42:35 UTC 2018

Am 04.06.2018 um 18:43 schrieb Erik Österlund:
> Hi Roman,
> On 2018-06-04 17:24, Roman Kennke wrote:
>> Ok, right. Very good catch!
>> This should do it, right? Sorry, I couldn't easily make an incremental
>> diff:
>> http://cr.openjdk.java.net/~rkennke/JDK-8203172/webrev.01/
> Unfortunately, I think there is one more problem for you.
> The signal handler is supposed to catch SIGSEGV caused by speculative
> loads shot from the fantastic jni fast get field code. But it currently
> expects an exact PC match:
> address JNI_FastGetField::find_slowcase_pc(address pc) {
>   for (int i=0; i<count; i++) {
>     if (speculative_load_pclist[i] == pc) {
>       return slowcase_entry_pclist[i];
>     }
>   }
>   return (address)-1;
> }
> This means that the way this is written now, speculative_load_pclist
> registers the __ pc() right before the access_load_at call. This puts
> constraints on whatever is done inside of access_load_at to only
> speculatively load on the first assembled instruction.
> If you imagine a scenario where you have a GC with Brooks pointers that
> also uncommits memory (like Shenandoah I presume), then I imagine you
> would need something more here. If you start with a forwarding pointer
> load, then that can trap (which is probably caught by the exact PC
> match). But then there will be a subsequent load of the value in the
> to-space object, which will not be protected. But this is also loaded
> speculatively (as the subsequent safepoint counter check could
> invalidate the result), and could therefore crash the VM unless
> protected, as the signal handler code fails to recognize this is a
> speculative load from jni fast get field.
> I imagine the solution to this would be to let speculative_load_pclist
> specify a range for fuzzy SIGSEGV matching in the signal handler, rather
> than an exact PC (i.e. speculative_load_pclist_start and
> speculative_load_pclist_end). That would give you enough freedom to use
> Brooks pointers in there. Sometimes I wonder if the lengths we go to
> maintain jni fast get field is *really* worth it.

I are probably right in general. But I also think we are fine with
Shenandoah. Both the fwd ptr load and the field load are constructed
with the same base operand. If the oop is NULL (or invalid memory) it
will blow up on fwdptr load just the same as it would blow up on field
load. We maintain an invariant that the fwd ptr of a valid oop results
in a valid (and equivalent) oop. I therefore think we are fine for now.
Should a GC ever need anything else here, I'd worry about it then. Until
this happens, let's just hope to never need to touch this code again ;-)

>> Unfortunately, I cannot really test it because of:
>> http://mail.openjdk.java.net/pipermail/aarch64-port-dev/2018-May/005843.html
> That is unfortunate. If I were you, I would not dare to change anything
> in jni fast get field without testing it - it is very error prone.

Yeah. I guess I'll just wait with testing until this is resolved. Or
else resolve it myself.

Can I consider this change reviewed by you?


> Thanks,
> /Erik
>> Roman
>>> Hi Roman,
>>> Oh man, I was hoping I would never have to look at jni fast get field
>>> again. Here goes...
>>>   93   speculative_load_pclist[count] = __ pc();   // Used by the
>>> segfault handler
>>>   94   __ access_load_at(type, IN_HEAP, noreg /* tos: r0/v0 */,
>>> Address(robj, roffset), noreg, noreg);
>>>   95
>>> I see that here you load straight to tos, which is r0 for integral
>>> types. But r0 is also c_rarg0. So it seems like if after loading the
>>> primitive to r0, the subsequent safepoint counter check fails, then the
>>> code will revert back to a slowpath call, but this time with c_rarg0
>>> clobbered, leading to a broken JNI env pointer being passed in to the
>>> slow path C function. That does not seem right to me.
>>> This JNI fast get field code is so error prone. :(
>>> Unfortunately, the proposed API can not load floating point numbers to
>>> anything but ToS, which seems like a problem in the jni fast get field
>>> code.
>>> I think to make this work properly, you need to load integral types to
>>> result and not ToS, so that you do not clobber r0, and rely on ToS being
>>> v0 for floating point types, which does not clobber r0. That way we can
>>> dance around the issue for now I suppose.
>>> Thanks,
>>> /Erik
>>> On 2018-05-14 22:23, Roman Kennke wrote:
>>>> Similar to x86
>>>> (http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-May/032114.html)
>>>> here comes the primitive heap access changes for aarch64:
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8203172/webrev.00/
>>>> Some notes:
>>>> - array access used to compute base_obj + index, and then use indexed
>>>> addressing with base_offset. This means we cannot get base_obj in the
>>>> BarrierSetAssembler API, but we need that, e.g. for resolving the
>>>> target
>>>> object via forwarding pointer. I changed
>>>> (base_obj+index)+base_offset to
>>>> base_obj+(index+base_offset) in all the relevant places.
>>>> - in jniFastGetField_aarch64.cpp, we are using a trick to ensure
>>>> correct
>>>> ordering field-load with the load of the safepoint counter: we make
>>>> them
>>>> address dependend. For float and double loads this meant to load the
>>>> value as int/long, and then later moving those into v0. This doesn't
>>>> work when going through the BarrierSetAssembler API: it loads straight
>>>> to v0. Instead I am inserting a LoadLoad membar for float/double (which
>>>> should be rare enough anyway).
>>>> Other than that it's pretty much analogous to x86.
>>>> Testing: no regressions in hotspot/tier1
>>>> Can I please get a review?
>>>> Thanks, Roman

More information about the hotspot-dev mailing list