RFR(XS): JDK-8199780: SetMemory0 and CopyMemory0 in unsafe.cpp need to resolve their operands
rkennke at redhat.com
Tue Mar 20 15:40:32 UTC 2018
Am 20.03.2018 um 12:00 schrieb Erik Österlund:
> Hi Roman,
> On 2018-03-20 11:36, Roman Kennke wrote:
>> Same reason as splitting resolve -> resolve_for_read/resolve_for_write
>> in other routines: being able to distinguish read and write access.
>> Also, I'd rather be careful to put this stuff in central places that
>> might over-cover it.
> It sounds like the motivation for this in my opinion more fragile call
> site chasing code is optimization.
> What is the performance difference? Has this showed up in any profiles?
> Whenever robustness is traded for performance, it would be great to have
> some understanding about how much performance was lost.
I don't have numbers. But it is not hard to see that copying potentially
large arrays twice has some impact. It may only really matter in
interpreter and C1, because C2 would most likely intrinsify anything
that would show up in profiles, but this would still amount to startup
time penalty I would think. I don't really intend to trade robustness
for performance: my goal is to make a robust API that also allows GCs to
>> I've missed Unsafe_CopySwapMemory0, good find (has this been added
>> recently?). I'll add Access calls there, and meditate a little bit how
>> to put this into a more central place to avoid having to fix this for
>> every code change in unsafe.cpp ;-)
> No, this has been around for at least 2 years. In this file, address
> resolution is consistently done with index_oop_from_field_offset_long,
> so I think if you want to play it safe (and I know I would), then I
> would put the Access<>::resolve in there, and leave the rest of the call
> sites the way they are.
It seems that index_oop_from_field_offset_long() is otherwise only used
for the non-heap paths (i.e. when p == NULL). Well I guess it's ok to
put the resolve() there for now. Are those methods also meant to do
native memory copies/fills when getting src/dst == NULL ? If so, then
this new revision fixes the resolve() for those cases as well.
Assuming we can reach an agreement about JDK-8199801: Finer grained
primitive arrays bulk access barriers, I'd probably also split
index_oop_from_field_offset_long() into versions for read and write.
Might that be acceptable?
More information about the hotspot-dev