[aarch64-port-dev ] RFR(L): 8231441: AArch64: Initial SVE backend support

Andrew Dinn adinn at redhat.com
Mon Aug 17 09:16:47 UTC 2020

Hi Pengfei,

On 17/08/2020 07:00, Ningsheng Jian wrote:
> Thanks a lot for the review! Sorry for the late reply, as I was on
> vacation last week. And thanks to Pengfei and Joshua for helping
> clarifying some details in the patch.

Yes, they did a very good job of answering most of the pending questions.

>> I also eyeballed /some/ of the generated code to check that it looked
>> ok. I'd really like to be able to do that systematically for a
>> comprehensive test suite that exercised every rule but I only had the
>> machine for a few days. This really ought to be done as a follow-up to
>> ensure that all the rules are working as expected.
> Yes, we would expect Pengfei's OptoAssembly check patch can get merged
> in future.

I'm fine with that as a follow-up patch if you raise a JIRA for it.

>> I am not clear why you are choosing to re-init ptrue after certain JVM
>> runtime calls (e.g. when Z calls into the runtime) and not others e.g.
>> when we call a JVM_ENTRY. Could you explain the rationale you have
>> followed here?
> We do the re-init at any possible return points to c2 code, not in any
> runtime c++ functions, which will reduce the re-init calls.
> Actually I found those entries by some hack of jvm. In the hacky code
> below we use gcc option -finstrument-functions to build hotspot. With
> this option, each C/C++ function entry/exit will call the instrument
> functions we defined. In instrument functions, we clobber p7 (or other
> reg for test) register, and in c2 function return we verify that p7 (or
> other reg) has been reinitialized.
> http://cr.openjdk.java.net/~njian/8231441/0001-RFC-Block-one-caller-save-register-for-C2.patch

Nice work. It's very good to have that documented. I'm willing to accept
i) that this has found all current cases and ii) that the verify will
catch any cases that might get introduced by future changes (e.g. the
callout introduced by ZGC that you mention below). As the above mot say
there is a slim chance this might have missed some cases but I think it
is pretty unlikely.

>> Specific Comments (register allocator webrev):
>> aarch64.ad:97-100
>> Why have you added a reg_def for R8 and R9 here and also to alloc_class
>> chunk0 at lines 544-545? They aren't used by C2 so why define them?
> I think Pengfei has helped to explain that. I will either add clear
> comments or rename the register name as you suggested.

Ok, good.

> As Joshua clarified, we are also working on predicate scalable reg,
> which is not in this patch. Thanks for the suggestion, I will try to
> refactor this a bit.

Ok, I'll wait for an updated patch. Are you planning to include the
scalable predicate reg code as part of this patch? I think that would be
better as it would help to clarify the need to distinguish vector regs
as a subset of scalable regs.

>> zBarrierSetAssembler_aarch64.cpp:434
>> Can you explain why we need to check p7 here and not do so in other
>> places where we call into the JVM? I'm not saying this is wrong. I just
>> want to know how you decided where re-init of p7 was needed.
> Actually I found this by my hack patch above while running jtreg tests.
> The stub slowpath here can be a c++ function.

Yes, good catch.

>> superword.cpp:97
>> Does this mean that is someone sets the maximum vector size to a
>> non-power of two, such as 384, all superword operations will be
>> bypassed? Including those which can be done using NEON vectors?
> Current SLP vectorizer only supports power-of-2 vector size. We are
> trying to work out a new vectorizer to support all SVE vector sizes, so
> we would expect a size like 384 could go to that path. I tried current
> patch on a 512-bit SVE hardware which does not support 384-bit:
> $ java -XX:MaxVectorSize=16 -version # (32 and 64 are the same)
> openjdk version "16-internal" 2021-03-16
> $ java -XX:MaxVectorSize=48 -version
> OpenJDK 64-Bit Server VM warning: Current system only supports max SVE
> vector length 32. Set MaxVectorSize to 32
> (Fallbacks to 32 and issue a warning, as the prctl() call returns 32
> instead of unsupported 48:
> https://www.kernel.org/doc/Documentation/arm64/sve.txt)
> Do you think we need to exit vm instead of warning and fallbacking to 32
> here?

Yes, I think a vm exit would probably be a better choice.


Andrew Dinn
Red Hat Distinguished Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill

More information about the aarch64-port-dev mailing list