[intrinsics]: performance before after (String::format)
vicente.romero at oracle.com
Mon Feb 25 13:14:47 UTC 2019
On 2/24/19 8:13 AM, Claes Redestad wrote:
> On 2019-02-23 01:36, Vicente Romero wrote:
>> On 2/22/19 4:59 PM, Alex Buckley wrote:
>>> On 2/22/2019 1:46 PM, Vicente Romero wrote:
>>>> To complete the picture, please find attached the performance
>>>> results for Objects.hash for a number of experiments. In general
>>>> they don't look as good as the ones for String::format. It seems
>>>> there is not much gain unless the number of parameters is large
>>>> and all the parameters are constants. This is understandable
>>>> because the compiler generates an LDC of the result. In all other
>>>> cases the performance is just a bit better or a lot worse.
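
To make the all-constants case concrete, here is a minimal sketch of what
folding means. It is an illustration only, not the code javac actually
emits: the class name, method names and the FOLDED field are hypothetical.
The value follows from the documented 31-based polynomial that Objects.hash
delegates to via Arrays.hashCode.

```java
import java.util.Objects;

final class HashConstantFoldSketch {
    // Vanilla shape: the three constants are boxed and stored in an
    // Object[] on every call before the hash is computed.
    static int vanilla() {
        return Objects.hash(1, 2, 3);
    }

    // Hypothetical folded shape: the result is computed once, ahead of
    // time, so the call site only has to load a constant.
    // ((1 * 31 + 1) * 31 + 2) * 31 + 3 = 30817
    static final int FOLDED = 30817;

    static int intrinsified() {
        return FOLDED;
    }

    public static void main(String[] args) {
        System.out.println(vanilla() == intrinsified());
    }
}
```

Folding is only possible because every argument is known at compile time,
which matches the observation that this is the one case with a clear gain.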
>>>                           Intrinsified   Vanilla   Speedup
>>> testHash1IntVariable             42564     42799        1x
>>> testHash2IntVariables            41573      9019        5x
>>> testHash100IntVariables              4        27     0.15x
>>> With a large number of parameters, you might hope that avoiding
>>> double boxing (int -> Integer -> array store) gives us some win,
>>> even for non-constant arguments. But something is happening that
>>> kills the speedup. Do you know what it is?
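
For reference, the double boxing mentioned above can be sketched as
follows. This is an illustration under assumptions, not the shape the BSM
really generates: the class and the specialized method are hypothetical,
and the specialized variant simply applies the same 31-based polynomial
directly to the ints.

```java
import java.util.Objects;

final class HashBoxingSketch {
    // Vanilla path: each int is boxed to an Integer and then stored
    // into the Object[] created for the varargs call.
    static int vanilla(int a, int b) {
        return Objects.hash(a, b);
    }

    // Hypothetical specialized path: same polynomial, no boxing and
    // no array allocation.
    static int specialized(int a, int b) {
        int result = 1;
        result = 31 * result + Integer.hashCode(a);
        result = 31 * result + Integer.hashCode(b);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(vanilla(3, 7) == specialized(3, 7));
    }
}
```

Both methods return the same value, so any difference in a benchmark comes
from how the call is compiled, not from what is computed.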
>> I'm doing some research on this. My assumption is that HS was able to
>> recognize the old pattern but has issues with the MH graph being
>> generated now. It could be that some nodes in the graph are more
>> opaque. But this is just my opinion.
> If I were to guess, you're hitting some JIT limit, likely
> inlining-related, which causes a miscompilation at some point. I've been
> mulling over whether we in general need to build heuristics into our
> BSMs to generate simpler shapes once the number of arguments grows,
> e.g., only specialize for the first N arguments and emit a call to
> Objects.hash(Object...) for the remainder.
Right, that should be playing a big part here. Choosing an N as a
threshold could be a good compromise. The issue is that even for small
N the improvement is almost negligible, and it is type dependent.
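
A rough sketch of the threshold idea, written as plain Java rather than as
a method handle graph. Everything here is an assumption for illustration:
the class name, the hybrid method, the choice of N = 2, and the way the
two partial results are combined so that the total still equals
Objects.hash over all arguments.

```java
import java.util.Objects;

final class HashThresholdSketch {
    // Hypothetical hybrid shape with N = 2: the first two ints are
    // hashed unboxed, the remainder goes through Objects.hash.
    static int hybrid(int a, int b, Object... rest) {
        int prefix = 1;
        prefix = 31 * prefix + a;
        prefix = 31 * prefix + b;

        // Objects.hash(rest) starts its polynomial from the seed 1.
        // Replace that seed with the prefix result: compute 31^k for
        // the k remaining arguments, then shift the prefix by it.
        int pow = 1;
        for (int i = 0; i < rest.length; i++) {
            pow *= 31;
        }
        return pow * (prefix - 1) + Objects.hash(rest);
    }

    public static void main(String[] args) {
        System.out.println(hybrid(1, 2, 3, 4) == Objects.hash(1, 2, 3, 4));
    }
}
```

The arithmetic wraps on overflow in the same way in both forms, so the
result stays identical to the fully generic call for any argument count.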
> What value of N to use, and how to gracefully downgrade to a simpler
> implementation, is implementation dependent, and might even be chosen
> differently depending on whether you're optimizing for peak performance
> or startup/footprint, as a smaller N could reduce the potential for a
> BSM to emit combinatorially explosive MH graphs.
Thanks for your evaluation, very helpful!