Bounds checks with unsafe array access
dl at cs.oswego.edu
Sat Sep 13 14:33:55 UTC 2014
On 09/12/2014 01:04 PM, John Rose wrote:
> On Sep 12, 2014, at 5:40 AM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
>> On Sep 11, 2014, at 10:51 PM, John Rose <john.r.rose at oracle.com> wrote:
>>> In some cases of highly tuned code the author can work with the JIT engineer to insert the right code shapes from the start.
>>> The "test length against zero" code shape is one of these, and can often be merged with a check that is required any way (e.g., is the data structure logically empty?).
>>> For example:
>> A great example, i was just recently looking at that :-)
This is another case of me manually optimizing library code bottlenecks.
>> My understanding is that check ("(n = tab.length) > 0") is in place solely to generate more efficient bounds checks, since it should always return true (table is either null or non-null with a length > 0). If/when JDK-8003585 gets fixed we might be able to update the code and remove that check (or turn it into an assert if it did not interfere with inlining...).
> Actually, that would be a step backwards. It is best to initialize HashMap.table to a constant zero-length array:
> If the table is never null, then the null check becomes mergeable into the following load instruction.
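A minimal sketch combining the two idioms under discussion, the quoted "(n = tab.length) > 0" check and the constant zero-length table that John Rose suggests; the class and method names here are illustrative, not the actual JDK source:

```java
// Illustrative sketch, not the JDK source: a power-of-two-sized table
// that is never null, initialized to a shared zero-length constant.
final class SmallTable {
    private static final Object[] EMPTY_TABLE = {};
    private Object[] table = EMPTY_TABLE;   // never null, so the implicit
                                            // null check can merge into the
                                            // tab.length load below

    Object get(int hash) {
        Object[] tab = table;
        int n;
        // "(n = tab.length) > 0" doubles as the emptiness test and tells
        // the JIT that (n - 1) & hash lies in [0, n), so the bounds check
        // on the array access can be elided.
        if ((n = tab.length) > 0)
            return tab[(n - 1) & hash];
        return null;
    }

    void resize(int powerOfTwoSize) { table = new Object[powerOfTwoSize]; }

    void put(int hash, Object v) {
        Object[] tab = table;
        int n;
        if ((n = tab.length) > 0)
            tab[(n - 1) & hash] = v;
    }
}
```

Because the table is never null, the read path needs only the single length test, which is always true once the table has been resized to a real power-of-two capacity.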
People keep telling me this, but I have yet to find a bottleneck
case involving power-of-two-sized arrays where it generates faster code.
Generally, explicitly skipping rather than trapping (or inviting
hotspot to trap) in this case seems fastest. Of course, I only do this when
it can be statically validated that either the array ref is null or the
array has positive size. Hotspot cannot locally perform this validation,
so the best I can do (which is not bad) is to use perfectly predictable
skipped branches. Except that if I need to use Unsafe for the sake of ordered
access anyway, I usually just omit them, because there's no reason to fool
myself into generating empty branches. Unfortunately, this can't apply to
Among the underlying issues is that there is no good analog of
an uncommon-trap for a bad array index.
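The skip-versus-trap distinction above can be sketched as follows; both method names are hypothetical, and the second shape only approximates what an uncommon-trap path for a bad index would look like, since (as noted) no such analog exists:

```java
// Hypothetical illustration of skip vs. trap for a possibly-null,
// power-of-two-sized table; not taken from any JDK class.
final class SkipVsTrap {
    // Explicit skip: a perfectly predictable branch that hotspot must keep,
    // because it cannot locally prove tab is either null or positive-length.
    static Object viaSkip(Object[] tab, int hash) {
        int n;
        if (tab != null && (n = tab.length) > 0)
            return tab[(n - 1) & hash];
        return null;
    }

    // Trap: perform the access and let the VM's implicit checks catch the
    // bad cases -- roughly the shape an uncommon-trap for a bad array
    // index would take, if one existed.
    static Object viaTrap(Object[] tab, int hash) {
        try {
            return tab[(tab.length - 1) & hash];
        } catch (NullPointerException | ArrayIndexOutOfBoundsException ex) {
            return null;
        }
    }
}
```

Both return the same results; the point of the message is that, in measured bottleneck cases, the explicitly skipped branch has been the faster shape.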
Doing better would require hotspot to perform non-local analyses.
Here, for example, you could use a whole-program points-to analysis
to determine that all array construction sites are known, and then
that sizes are never zero. It seems plausible that some restricted
forms of load-time analysis will be needed to handle
specialized and reified generics, and those analyses might be leveraged here.
In the mean time, I'll work with Paul to find ways of re-inserting
zero-checks of power-of-two array constructions in j.u.c code, in
preparation for the VarHandle approach. There are only a few classes
that do this. In the two main cases, ForkJoin and ConcurrentHashMap,
you need careful coding to make reads (even of array.length) fast
yet correct amid all of the volatile/fence constraints, while still
conveying skip vs trap effects. A few quick experiments showed 10%
(overall!) throughput variation on some FJ tests across a few choices,
but some of the ways to do it seem to work well.
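As a hedged sketch of what such a read might look like under the anticipated VarHandle API (the actual j.u.c code at the time of this message uses Unsafe; `tabAt` here is merely modeled on the ConcurrentHashMap helper of the same name):

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Sketch of a volatile array read via VarHandles; modeled on
// ConcurrentHashMap.tabAt, but not the actual JDK source.
final class VolatileTab {
    private static final VarHandle AA =
        MethodHandles.arrayElementVarHandle(Object[].class);

    // Volatile-read element i. Unlike a raw Unsafe address computation,
    // the VarHandle performs its own null and bounds checks on every
    // access -- exactly the cost being weighed in this thread.
    static Object tabAt(Object[] tab, int i) {
        return AA.getVolatile(tab, i);
    }

    static void setTabAt(Object[] tab, int i, Object v) {
        AA.setVolatile(tab, i, v);
    }
}
```

Whether the VM can fold those checks away again, given zero-checked power-of-two construction sites, is the open question the re-inserted zero-checks are meant to prepare for.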
Other high-performance Unsafe-programmers may need to put in some
hard work to find similar coding tricks when Unsafe becomes unavailable.