[aarch64-port-dev ] RFR(L): 8231441: AArch64: Initial SVE backend support
Magnus Ihse Bursie
magnus.ihse.bursie at oracle.com
Wed Aug 19 10:05:01 UTC 2020
On 2020-08-19 11:53, Ningsheng Jian wrote:
> Hi Andrew,
> I have updated the patch based on the review comments. Would you mind
> taking another look? Thanks!
Build changes look good. Thank you for remembering to cc build-dev!
This is maybe not relevant, but I was surprised to find
src/hotspot/cpu/aarch64/aarch64-asmtest.py, because a) it's python code,
and b) the name implies that it is a test, even though that it resides
in src. Is this really proper?
> Also add build-dev, as there's a makefile change.
> And the split parts:
> 1) SVE feature detection:
> 2) c2 register allocation:
> 3) SVE c2 backend:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8231441
> CSR: https://bugs.openjdk.java.net/browse/JDK-8248742
> JTreg tests are still running, and so far no new failure found.
> On 8/17/20 5:16 PM, Andrew Dinn wrote:
>> Hi Pengfei,
>> On 17/08/2020 07:00, Ningsheng Jian wrote:
>>> Thanks a lot for the review! Sorry for the late reply, as I was on
>>> vacation last week. And thanks to Pengfei and Joshua for helping
>>> clarifying some details in the patch.
>> Yes, they did a very good job of answering most of the pending
>>>> I also eyeballed /some/ of the generated code to check that it looked
>>>> ok. I'd really like to be able to do that systematically for a
>>>> comprehensive test suite that exercised every rule but I only had the
>>>> machine for a few days. This really ought to be done as a follow-up to
>>>> ensure that all the rules are working as expected.
>>> Yes, we would expect Pengfei's OptoAssembly check patch can get merged
>>> in future.
>> I'm fine with that as a follow-up patch if you raise a JIRA for it.
>>>> I am not clear why you are choosing to re-init ptrue after certain JVM
>>>> runtime calls (e.g. when Z calls into the runtime) and not others e.g.
>>>> when we call a JVM_ENTRY. Could you explain the rationale you have
>>>> followed here?
>>> We do the re-init at any possible return points to c2 code, not in any
>>> runtime c++ functions, which will reduce the re-init calls.
>>> Actually I found those entries by some hack of jvm. In the hacky code
>>> below we use gcc option -finstrument-functions to build hotspot. With
>>> this option, each C/C++ function entry/exit will call the instrument
>>> functions we defined. In instrument functions, we clobber p7 (or other
>>> reg for test) register, and in c2 function return we verify that p7 (or
>>> other reg) has been reinitialized.
>> Nice work. It's very good to have that documented. I'm willing to accept
>> i) that this has found all current cases and ii) that the verify will
>> catch any cases that might get introduced by future changes (e.g. the
>> callout introduced by ZGC that you mention below). As the above mot say
>> there is a slim chance this might have missed some cases but I think it
>> is pretty unlikely.
>>>> Specific Comments (register allocator webrev):
>>>> Why have you added a reg_def for R8 and R9 here and also to
>>>> chunk0 at lines 544-545? They aren't used by C2 so why define them?
>>> I think Pengfei has helped to explain that. I will either add clear
>>> comments or rename the register name as you suggested.
>> Ok, good.
>>> As Joshua clarified, we are also working on predicate scalable reg,
>>> which is not in this patch. Thanks for the suggestion, I will try to
>>> refactor this a bit.
>> Ok, I'll wait for an updated patch. Are you planning to include the
>> scalable predicate reg code as part of this patch? I think that would be
>> better as it would help to clarify the need to distinguish vector regs
>> as a subset of scalable regs.
>>>> Can you explain why we need to check p7 here and not do so in other
>>>> places where we call into the JVM? I'm not saying this is wrong. I
>>>> want to know how you decided where re-init of p7 was needed.
>>> Actually I found this by my hack patch above while running jtreg tests.
>>> The stub slowpath here can be a c++ function.
>> Yes, good catch.
>>>> Does this mean that is someone sets the maximum vector size to a
>>>> non-power of two, such as 384, all superword operations will be
>>>> bypassed? Including those which can be done using NEON vectors?
>>> Current SLP vectorizer only supports power-of-2 vector size. We are
>>> trying to work out a new vectorizer to support all SVE vector sizes, so
>>> we would expect a size like 384 could go to that path. I tried current
>>> patch on a 512-bit SVE hardware which does not support 384-bit:
>>> $ java -XX:MaxVectorSize=16 -version # (32 and 64 are the same)
>>> openjdk version "16-internal" 2021-03-16
>>> $ java -XX:MaxVectorSize=48 -version
>>> OpenJDK 64-Bit Server VM warning: Current system only supports max SVE
>>> vector length 32. Set MaxVectorSize to 32
>>> (Fallbacks to 32 and issue a warning, as the prctl() call returns 32
>>> instead of unsupported 48:
>>> Do you think we need to exit vm instead of warning and fallbacking
>>> to 32
>> Yes, I think a vm exit would probably be a better choice.
>> Andrew Dinn
>> Red Hat Distinguished Engineer
>> Red Hat UK Ltd
>> Registered in England and Wales under Company Registration No. 03798903
>> Directors: Michael Cunningham, Michael ("Mike") O'Neill
More information about the build-dev