[9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

John Rose john.r.rose at oracle.com
Tue Jan 27 00:04:03 UTC 2015

On Jan 26, 2015, at 8:41 AM, Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote:
> What do you think about the following version?
> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.02 <http://cr.openjdk.java.net/~vlivanov/8063137/webrev.02>
> As you suggested, I reified MHI::profileBranch on LambdaForm level and removed @LambdaForm.Shared. My main concern about removing @Sharen was that profile pollution can affect the code before profileBranch call (akin to 8068915 [1]) and it seems it's the case: Gbemu (at least) is sensitive to that change (there's a 10% difference in peak performance between @Shared and has_injected_profile()).
> I can leave @Shared as is for now or remove it and work on the fix to the deoptimization counts pollution. What do you prefer?

Generic advice here:  It's better to leave it out, if in doubt.  If it has a real benefit, and we don't have time to make it clean, put it in and file a tracking bug to clean it up.

I re-read the change.  It's simpler and more coherent now.

I see one more issue which we should fix now, while we can.  It's the sort of thing which is hard to clean up later.

The two fields of the profileBranch array have obscure and inconsistent labelings.  It took me some hard thought and the inspection of three files to decide what "taken" and "not taken" mean in the C2 code that injects the profile.  The problem is that, when you look at profileBranch, all you see is an integer (boolean) argument and an array, and no clear indication about which array element corresponds to which argument value.  It's made worse by the fact that "taken" and "not taken" are not mentioned at all in the JDK code, which instead wires together the branches of selectAlternative without much comment.

My preferred formulation, for making things clearer:  Decouple the idea of branching from the idea of profile injection.  Name the intrinsic (yes, one more bikeshed color) "profileBoolean" (or even "injectBooleanProfile"), and use the natural indexing of the array:  0 (Java false) is a[0], and 1 (Java true) is a[1].  We might later extend this to work with "booleans" (more generally, small-integer flags), of more than two possible values, klasses, etc.

This line then goes away, and 'result' is used directly as the profile index:
+        int idx = result ? 0 : 1;

The ProfileBooleanNode should have an embedded (or simply indirect) array of ints which is a simple copy of the profile array, so there's no doubt about which count is which.

The parsing of the predicate that contains "profileBoolean" should probably be more robust, at least allowing for 'eq' and 'ne' versions of the test.  (C2 freely flips comparison senses, in various places.)  The check for Op_AndI must be more precise; make sure n->in(2) is a constant of the expected value (1).  The most robust way to handle it (but try this another time, I think) would be to make two temp copies of the predicate, substituting the occurrence of ProfileBoolean with '0' and '1', respectively; if they both fold to '0' and '1' or '1' and '0', then you take the indicated action.

I suggest putting the new code in Parse::dynamic_branch_prediction, which pattern-matches for injected profiles, into its own subroutine.  Maybe:
   bool use_mdo = true;
   if (has_injected_profile(btest, test, &taken, &not_taken)) {
     use_mdo = false;
   if (use_mdo) { ... old code

I see why you used the opposite order in the existing code:  It mirrors the order of the second and third arguments to selectAlternative.  But the JVM knows nothing about selectAlternative, so it's just confusing when reading the VM code to know which profile array element means what.

— John

P.S.  Long experience with byte-order bugs in HotSpot convinces me that if you are not scrupulously clear in your terms, when working with equal and opposite configuration pairs, you will have a long bug tail, especially if you have to maintain agreement about the configurations through many layers of software.  This is one of those cases.  The best chance to fix such bugs is not to allow them in the first place.  In the case of byte-order, we have "first" vs. "second", "MSB" vs. "LSB", and "high" vs. "low" parts of values, for values in memory and in registers, and all possible misunderstandings about them and their relation have probably happened and caused bugs.

More information about the core-libs-dev mailing list