[9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

Vladimir Ivanov vladimir.x.ivanov at oracle.com
Tue Jan 20 19:10:44 UTC 2015

>>> You forgot to mark Opaque4Node as a macro node. I would suggest basing
>>> it on Opaque2Node; then you will get some methods from it.
>> Do I really need to do so? I expect it to go away during the IGVN pass
>> right after parsing is over. That's why I register
>> the node for igvn in LibraryCallKit::inline_profileBranch(). Changes
>> in macro.cpp & compile.cpp are leftovers from the
>> version when Opaque4 was a macro node. I plan to remove them.
> I see, this is why you did not inherit from it. Okay. I would suggest
> leaving an assert in compile.cpp to make sure no such node is left.
> I found a typo when I looked today (it should be '&&'):
> + Node *Opaque4Node::Ideal(PhaseGVN *phase, bool can_reshape) {
> +   if (can_reshape & _delay_removal) {
Good catch! Fixed in the latest version:

Best regards,
Vladimir Ivanov

> Thanks,
> Vladimir
>> Best regards,
>> Vladimir Ivanov
>>> On 1/16/15 9:16 AM, Vladimir Ivanov wrote:
>>>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/
>>>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/
>>>> https://bugs.openjdk.java.net/browse/JDK-8063137
>>>> After GuardWithTest (GWT) LambdaForms became shared, profile pollution
>>>> significantly distorted compilation decisions. It affected inlining and
>>>> hindered some optimizations. It caused significant performance
>>>> regressions for Nashorn (on Octane benchmarks).
>>>> Inlining was fixed by 8059877 [1], but that fix didn't cover the case
>>>> when a branch is never taken. A non-pruned branch can cause missed
>>>> optimization opportunities, not just an increase in code size. For
>>>> example, a non-pruned branch can break escape analysis.
>>>> Currently, there are 2 problems:
>>>>    - branch frequencies profile pollution
>>>>    - deoptimization counts pollution
>>>> Branch frequency pollution hides from the JIT the fact that a branch
>>>> is never taken. Since GWT LambdaForms (and hence their bytecode) are
>>>> heavily shared, but the behavior is specific to each MethodHandle,
>>>> there's no way for the JIT to understand how a particular GWT
>>>> instance behaves.
>>>> The solution I propose is to do profiling in Java code and feed it to
>>>> the JIT. Every GWT MethodHandle holds an auxiliary array (int[2])
>>>> where profiling info is stored. Once the JIT kicks in, it can retrieve
>>>> these counts if the corresponding MethodHandle is a compile-time
>>>> constant (and that is usually the case). To communicate the profile
>>>> data from Java code to the JIT, MethodHandleImpl::profileBranch() is
>>>> used.
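
The per-handle profiling described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the code from the webrev: the class name, the guardWithTest and neverTaken helpers, the saturating increment, and the convention that counts[0] holds "false" outcomes and counts[1] holds "true" outcomes are all hypothetical.

```java
final class BranchProfileSketch {
    // Hypothetical stand-in for the profiling hook: records the outcome of
    // the test in a per-handle int[2] and passes the result through unchanged.
    // Assumed layout: counts[0] = test was false, counts[1] = test was true.
    static boolean profileBranch(boolean result, int[] counts) {
        int idx = result ? 1 : 0;
        if (counts[idx] < Integer.MAX_VALUE) { // saturate instead of overflowing
            counts[idx]++;
        }
        return result;
    }

    // A guard-with-test shape: evaluate the test, then pick target or fallback.
    // The counts array belongs to one guard instance, so sharing the code of
    // this method between many guards does not mix their profiles.
    static String guardWithTest(Object arg, int[] counts) {
        if (profileBranch(arg instanceof String, counts)) {
            return "target";
        } else {
            return "fallback";
        }
    }

    // A branch whose counter is still zero has never been taken.
    static boolean neverTaken(int[] counts, boolean branch) {
        return counts[branch ? 1 : 0] == 0;
    }

    public static void main(String[] args) {
        int[] counts = new int[2]; // one array per guard instance
        for (int i = 0; i < 1000; i++) {
            guardWithTest("s" + i, counts);
        }
        System.out.println(counts[0] + " false, " + counts[1] + " true");
        System.out.println("fallback never taken: " + neverTaken(counts, false));
    }
}
```

Keeping the counters in an array owned by the guard instance, rather than in the shared bytecode's profile, is what lets a compiler that sees the instance as a constant read counts specific to that instance.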
>>>> If a GWT MethodHandle isn't a compile-time constant, profiling should
>>>> proceed. That happens when the corresponding LambdaForm is already
>>>> shared; for newly created GWT MethodHandles, profiling can occur only
>>>> in native code (a dedicated nmethod for a single LambdaForm). So, when
>>>> compilation of the whole MethodHandle chain is triggered, the profile
>>>> should already be gathered.
>>>> Overriding branch frequencies is not enough. Statistics on
>>>> deoptimization events are also polluted. Even if a branch is never
>>>> taken, the JIT issues an uncommon trap there only if the corresponding
>>>> bytecode doesn't trap too much and doesn't cause too many recompiles.
>>>> I added @IgnoreProfile and placed it only on GWT LambdaForms. When the
>>>> JIT sees it on some method, Compile::too_many_traps &
>>>> Compile::too_many_recompiles for that method always return false. This
>>>> allows the JIT to prune the branch based on the custom profile and
>>>> recompile the method if the branch is visited.
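
The interaction between the custom profile and the trap limits can be modeled roughly as follows. This is a toy model in Java, not HotSpot code: the real checks live in C++ (Compile::too_many_traps and Compile::too_many_recompiles), and the annotation declaration, the TRAP_LIMIT value, and every method name below are assumptions made for illustration. The model also folds both limits into a single check.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

final class IgnoreProfileSketch {
    // Hypothetical marker that only models the @IgnoreProfile idea.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface IgnoreProfile {}

    static final int TRAP_LIMIT = 100; // illustrative threshold, not a real default

    @IgnoreProfile
    static void sharedGuard() {}   // stands in for a heavily shared GWT LambdaForm

    static void ordinaryMethod() {} // stands in for any other method

    // Polluted trap counts are ignored for annotated methods.
    static boolean tooManyTraps(Method m, int recordedTraps) {
        if (m.isAnnotationPresent(IgnoreProfile.class)) {
            return false;
        }
        return recordedTraps >= TRAP_LIMIT;
    }

    // A branch is pruned (replaced by an uncommon trap) only when the custom
    // profile says it was never taken and the trap limit does not veto it.
    static boolean canPrune(Method m, int takenCount, int recordedTraps) {
        return takenCount == 0 && !tooManyTraps(m, recordedTraps);
    }

    // Reflection helper that hides the checked exception.
    static Method method(String name) {
        try {
            return IgnoreProfileSketch.class.getDeclaredMethod(name);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same never-taken branch, same polluted trap count, different outcome.
        System.out.println("shared guard:    " + canPrune(method("sharedGuard"), 0, 500));
        System.out.println("ordinary method: " + canPrune(method("ordinaryMethod"), 0, 500));
    }
}
```

In this model the annotated method stays prunable no matter how many traps were recorded against the shared bytecode, while an unannotated method with the same counts is left unpruned.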
>>>> For now, I wanted to keep the fix very focused. The next thing I plan
>>>> to do is to experiment with ignoring deoptimization counts for other
>>>> LambdaForms which are heavily shared. I already saw problems caused by
>>>> deoptimization counts pollution (see JDK-8068915 [2]).
>>>> I plan to backport the fix into 8u40, once I finish extensive
>>>> performance testing.
>>>> Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite,
>>>> Octane).
>>>> Thanks!
>>>> PS: as a summary, my experiments show that the fixes for 8063137 &
>>>> 8068915 [2] almost completely recover peak performance after
>>>> LambdaForm sharing [3]. There's one more problem left (non-inlined
>>>> MethodHandle invocations are more expensive when LFs are shared), but
>>>> it's a story for another day.
>>>> Best regards,
>>>> Vladimir Ivanov
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8059877
>>>>      8059877: GWT branch frequencies pollution due to LF sharing
>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8068915
>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8046703
>>>>      JEP 210: LambdaForm Reduction and Caching
>>>> _______________________________________________
>>>> mlvm-dev mailing list
>>>> mlvm-dev at openjdk.java.net
>>>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev