RFR: JDK-8198285: More consistent Access API for arraycopy

Erik Österlund erik.osterlund at oracle.com
Thu Apr 26 15:29:38 UTC 2018

Hi Roman,

Sure. That is another option. I wonder what the callsites will look like 
though? Will users of the API have to create their address wrapper 
objects explicitly at the callsite and pass them in to the Access call? 
I am also worried that wrapping the addresses in these wrapper objects 
type erases the element type, which might come in handy when we perform 
atomic arraycopy, so that we know what granularity the atomicity needs 
to be on. And either that goes in as explicit template parameters on the 
array type (ugh), or explicitly on the arraycopy function, making it a 
template type qualified template method, which requires the template 
keyword, like this:

length_in_bytes = ...
arraycopy<jshort>(NativeAddress(ptr), HeapArray(obj, index), 

I tend to think it would be easier and more clear to write:

ArrayAccess<ARRAYCOPY_ATOMIC>::arraycopy_from_native(ptr, obj, index, 

Each required parameter is clear when you read the API.

But I am open to suggestions of course.


On 2018-04-25 17:59, Roman Kennke wrote:
>> 1) Resolve the addresses as we do today, but you think you get better
>> Shenandoah performance if we either
>> 2) Have 3 different access calls, (from heap to native, from native to
>> heap, from heap to heap) for copying, or
>> 3) Have 1 access call that can encode the above 3 variants, but looks
>> ugly at the call sites.
> There's also the idea to pass Address arguments to arraycopy, one for
> src, one for dst, and have 2 different subclasses: one for obj+offset
> (heap access) and one with direct pointer (raw). Your comment gave me
> the idea to also provide arrayOop+idx. This would look clean on the
> caller side I think.
> It would also be useful on the GC side: BarrierSets would specialize
> only in the variants that they are interested in, for example, in case
> of Shenandoah:
> 1. arraycopy(HeapAddress,HeapAddress) Java->Java
> 2. arraycopy(HeapAddress,RawAddress)  Java->native
> 3. arraycopy(RawAddress,HeapAddress)  native->Java
> other barriersets can ignore the exact type and only call the args
> address->resolve() or so to get the actual raw address.
> This would *also* be beneficial for the other APIs: instead of having
> all the X() and X_at() variants, we can just use one X variant that
> either takes RawAddress or HeapAddress.
> I made a little (not yet compiling/working) prototype of this a while ago:
> http://cr.openjdk.java.net/~rkennke/JDK-8199801-2.patch
> What do you think? Would it make sense to go further down that road?
> Roman
>> You clearly went for 3, which leaves the callsites looking rather hard
>> to read. It is for example not obvious for me what is going on here
>> (javaClasses.cpp line 313):
>> HeapAccess<>::arraycopy<jbyte>(NULL, 0, reinterpret_cast<const
>> jbyte*>(utf8_str), value(h_obj()),
>> typeArrayOopDesc::element_offset<jbyte>(0), NULL, length);
>> ...without looking very carefully at the long list of arguments encoding
>> what is actually going on (copy from native to the heap). What is worse
>> is that this will probably not compile without adding the template
>> keyword to the call (since you have a qualified template member function
>> behind a template class), like this:
>> HeapAccess<>::template arraycopy<jbyte>(NULL, 0, reinterpret_cast<const
>> jbyte*>(utf8_str), value(h_obj()),
>> typeArrayOopDesc::element_offset<jbyte>(0), NULL, length);
>> ...which as a public API leaves me feeling a bit like this:     :C
>> May I suggest adding an array access helper. The idea is to keep a
>> single call through Access, and a single backend point for array copy,
>> but let the helper provide the three different types of copying as
>> different functions, so that the call sites still look pretty and easy
>> to read and follow.
>> This helper could additionally have load_at, and store_at use array
>> indices as opposed to offsets, and hence hide the offset calculations we
>> perform today (typically involving checking if we are using compressed
>> oops or not).
>> I am thinking something along the lines of
>> ArrayAccess<>::arraycopy_to_native(readable_arguments),
>> ArrayAccess<>::arraycopy_from_native(readable_arguments),
>> ArrayAccess<>::arraycopy(readable_arguments), which translates to some
>> form of Access<>::arraycopy(unreadable_arguments). And for example
>> ArrayAccess<>::load_at(obj, index) would translate to some kind of
>> HeapAccess<IN_HEAP_ARRAY>::load_at(obj, offset_for_index(index)) as a
>> bonus, making everyone using the API jump with happiness.
>> What do you think about this idea? Good or bad? I guess the question is
>> whether this helper should be in access.hpp, or somewhere else (like in
>> arrayOop). Thoughts are welcome.
>> Thanks,
>> /Erik
>> On 2018-04-11 19:54, Roman Kennke wrote:
>>>    Currently, the arraycopy API in access.hpp gets the src and dst oops,
>>> plus the src and dst addresses. In order to be most useful to garbage
>>> collectors, it should receive the src and dst oops together with the src
>>> and dst offsets instead, and let the Access API / GC calculate the src
>>> and dst addresses.
>>> For example, Shenandoah needs to resolve the src and dst objects for
>>> arraycopy, and then apply the corresponding offsets. With the current
>>> API (obj+ptr) it would calculate the ptr-diff from obj to ptr, then
>>> resolve obj, then re-add the calculate ptr-diff. This is fragile because
>>> we also may resolve obj in the runtime before calculating ptr (e.g. via
>>> arrayOop::base()). If we then pass in the original obj and a ptr
>>> calculated from another copy of the same obj, the above resolution logic
>>> would not work. This is currently the case for obj-arraycopy.
>>> I propose to change the API to accept obj+offset, in addition to ptr for
>>> both src and dst. Only one or the other should be used. Heap accesses
>>> should use obj+offset and pass NULL for raw-ptr, off-heap accesses (or
>>> heap accesses that are already resolved.. use with care) should pass
>>> NULL+0 for obj+offset and the raw-ptr. Notice that this also allows the
>>> API to be used for Java<->native array bulk transfers.
>>> An alternative would be to break the API up into 4 variants:
>>> Java->Java transfer:
>>> arraycopy(oop src, size_t src_offs, oop dst, size_t dst_offs, size_t len)
>>> Java->Native transfer:
>>> arraycopy(oop src, size_t src_offs, D* raw_dst, size_t len)
>>> Native->Java transfer:
>>> arraycopy(S* src_raw, oop dst, size_t dst_offs, size_t len)
>>> 'Unsafe' transfer:
>>> arraycopy(S* src_raw, D* dst_raw, size_t len)
>>> But that seemed to be too much boilerplate copy+pasting for my taste.
>>> (See how having this overly complicated template layer hurts us?)
>>> Plus, I had a better idea: instead of accepting oop+offset OR T* for
>>> almost every Access API, we may want to abstract that and take an
>>> Address type argument, which would be either HeapAddress(obj, offset) or
>>> RawAddress(T* ptr). GCs may then just call addr->address() to get the
>>> actual address, or specialize for HeapAddress variants and resolve the
>>> objs and then resolve the address. This would also allow us to get rid
>>> of almost half of the API (all the *_at variants would go) and some
>>> other simplifications. However, this seemed to explode the scope of this
>>> RFE, and would be better handled in another RFE.
>>> This changes makes both typeArrayKlass and objArrayKlass use the changed
>>> API, plus I identified all (hopefully) places where we do bulk
>>> Java<->native array transfers and make them use the API too. Gets us rid
>>> of a bunch of memcpy calls :-)
>>> Please review the change:
>>> http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.00/
>>> Thanks, Roman

More information about the hotspot-gc-dev mailing list