Berg, Michael C
michael.c.berg at intel.com
Thu Apr 28 02:39:41 UTC 2016
Roland, for superword.cpp you only need this one-line change, which I have tested and which has no negative side effects on x86. It will address the issue (oddly enough, it's where we started):
Line 201: int max_vector = Matcher::max_vector_size(T_BYTE);
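To illustrate why the element type passed here matters, the following is a minimal standalone sketch, not HotSpot code. It assumes the cap is simply the vector width in bytes divided by the element size, and that the earlier argument was the INT type mentioned in the quoted discussion below. The names ElemType and max_lanes are hypothetical stand-ins for BasicType and Matcher::max_vector_size.

```cpp
#include <cassert>

// Hypothetical stand-in for HotSpot's BasicType; each value is the element size in bytes.
enum ElemType { ELEM_BYTE = 1, ELEM_SHORT = 2, ELEM_INT = 4, ELEM_LONG = 8 };

// Assumed model of the cap: the number of lanes of type t that fit in one vector.
// vector_bytes is an assumption supplied by the caller (for example 32 for 256-bit vectors).
int max_lanes(int vector_bytes, ElemType t) {
    assert(vector_bytes > 0);
    return vector_bytes / static_cast<int>(t);
}
```

Under these assumptions, with 32-byte vectors max_lanes(32, ELEM_INT) is 8 while max_lanes(32, ELEM_BYTE) is 32, so a cap derived from the byte type leaves room for byte-only loops to unroll further.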
From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Berg, Michael C
Sent: Wednesday, April 27, 2016 2:59 PM
To: John Rose <john.r.rose at oracle.com>; rwestrel at redhat.com
Cc: hotspot-compiler-dev at openjdk.java.net
Subject: RE: SuperWord::unrolling_analysis() question
John, it is pretty much that issue (unrolling for the available supported vector); I am testing some changes now for Roland.
From: John Rose [mailto:john.r.rose at oracle.com]
Sent: Wednesday, April 27, 2016 2:55 PM
To: rwestrel at redhat.com
Cc: Berg, Michael C <michael.c.berg at intel.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: SuperWord::unrolling_analysis() question
It is reasonable to look ahead into the loop to find the largest applicable vector size, before choosing an unroll factor.
A loop which works on bytes and doubles at the same time will want to unroll only up to vector-of-double.
But a loop which works only on bytes will want to unroll more.
Is that what we are talking about here?
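The look-ahead described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the SuperWord implementation: the loop body is reduced to a list of element sizes in bytes, the vector width is passed in by the caller, and the function name unroll_limit is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sketch: limit unrolling by the widest element type found in the loop body.
// elem_sizes holds the size in bytes of each element type the loop touches (assumed input).
int unroll_limit(int vector_bytes, const std::vector<int>& elem_sizes) {
    assert(vector_bytes > 0 && !elem_sizes.empty());
    // The widest element yields the fewest lanes per vector, so it bounds the unroll factor.
    int widest = *std::max_element(elem_sizes.begin(), elem_sizes.end());
    assert(widest > 0 && widest <= vector_bytes);
    return vector_bytes / widest;
}
```

With an assumed 32-byte vector, a loop touching bytes and doubles gives unroll_limit(32, {1, 8}) == 4, while a byte-only loop gives unroll_limit(32, {1}) == 32.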
> On Apr 27, 2016, at 8:53 AM, Roland Westrelin <rwestrel at redhat.com> wrote:
> Hi Michael,
> Thanks for the answer.
>> The answer could be conditional if we had machines with enough byte
>> or short components to make vectors with, I chose INT as it is the
>> current consistent minimum configuration for complete vector mapping.
>> The best answer would be to create some code which mines the common
>> type used in the current loop's expressions, but I think we would be
>> stuck with two passes over the code, the first to bind the common
>> type, the second for finding the optimal sub vector mapping. Or
>> possibly moving the question to the machine layer as a query, where
>> compiler writers choose the minimum consistent configuration based on
>> current info on the machine we compile on.
> Would two passes, as sketched here:
> do the job?