RFR(L): 8189793: [s390]: Improve String compress/inflate by exploiting vector instructions
martin.doerr at sap.com
Wed Oct 25 19:08:59 UTC 2017
Thanks for working on vector-based enhancements and for providing this webrev.
-The changes in the assembler look good.
-It doesn't make sense to load a constant len into a register, generate complex compare instructions for it, and still emit code for all cases. I assume that, e.g., the 4-character cases usually have a constant length. If so, much better code could be generated for them by omitting everything around the simple instructions. (ppc64.ad already contains nodes for a constant needle length in the indexOf rules.)
-Are you sure the prefetch instructions improve performance?
I remember that we had them in other String intrinsics but removed them again as they showed absolutely no performance gain.
-Comment: Using hardcoded vector registers is ok for now, but this may need to change, e.g. when they are used for C2's SuperWord optimization.
-Comment: You could use the vperm instruction instead of vo+vn, but I'm ok with the current implementation because loading a mask is much more convenient than getting the permutation vector loaded (e.g. from constant pool or pc relative).
-So the new vector loop looks good to me.
-In my opinion, the size of all the generated cases should be in proportion to their performance benefit.
As intrinsics, unlike stubs, may get inlined often, I can't shake the impression that generating such large code wastes valuable code cache space for a questionable performance gain in real-world scenarios.
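To illustrate the mask-based approach mentioned in the comments above, here is a minimal Java sketch of a block-wise check: OR all characters of a block together, AND the result with a mask, and only narrow the block if no high bits are set. This is not the s390 code from the webrev. The class name, the method name, the block size of 8 chars (chosen to mirror a 128-bit vector register) and the "return the compressed prefix length" convention are all assumptions made for illustration.

```java
final class MaskCheckSketch {
    // Hypothetical block size: 8 chars = 16 bytes, mirroring one 128-bit register.
    static final int BLOCK = 8;
    // Mask selecting the high byte of a 16-bit char.
    static final int HIGH_BYTE_MASK = 0xFF00;

    // Copies the longest Latin-1 prefix of src[0..len) into dst and
    // returns how many chars were copied (assumed convention).
    static int compressPrefix(char[] src, byte[] dst, int len) {
        int i = 0;
        while (i + BLOCK <= len) {
            int acc = 0;
            for (int j = 0; j < BLOCK; j++) {
                acc |= src[i + j];              // "or" the whole block together
            }
            if ((acc & HIGH_BYTE_MASK) != 0) {  // "and" with the mask: any high bits?
                break;                          // let the scalar loop find the exact char
            }
            for (int j = 0; j < BLOCK; j++) {
                dst[i + j] = (byte) src[i + j]; // narrow the block
            }
            i += BLOCK;
        }
        // Scalar loop for the tail and for a block containing a non-Latin-1 char.
        while (i < len && src[i] <= 0xFF) {
            dst[i] = (byte) src[i];
            i++;
        }
        return i;
    }
}
```

The single test per block is what keeps the loop cheap; the scalar fallback only runs for the last partial block or once a non-compressible character has been seen.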
From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Schmidt, Lutz
Sent: Wednesday, October 25, 2017 12:02
To: hotspot-compiler-dev at openjdk.java.net
Subject: RFR(L): 8189793: [s390]: Improve String compress/inflate by exploiting vector instructions
I would like to request reviews for this s390-only enhancement:
Vector instructions, which have been available on System z for a while (since z13), promise noticeable performance improvements. This enhancement improves the String Compress and String Inflate intrinsics by exploiting vector instructions, when available. For long strings, up to 2x performance improvement has been observed in micro-benchmarks.
Special care was taken to preserve good performance for short strings. All examined workloads showed a high ratio of short and very short strings.
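For readers unfamiliar with the two operations, the following scalar Java sketch shows roughly what compress and inflate compute. It is only an illustration of the semantics, not the JDK source or the intrinsic. The class and method names are hypothetical, and the convention of returning 0 when a character does not fit into one byte is an assumption.

```java
final class Latin1Sketch {
    // Narrows len chars to bytes. Returns len on success, or 0 as soon as
    // a char above 0xFF is found (assumed failure convention).
    static int compress(char[] src, byte[] dst, int len) {
        for (int i = 0; i < len; i++) {
            char c = src[i];
            if (c > 0xFF) {
                return 0;
            }
            dst[i] = (byte) c;
        }
        return len;
    }

    // Widens len bytes to chars; the mask avoids sign extension.
    static void inflate(byte[] src, char[] dst, int len) {
        for (int i = 0; i < len; i++) {
            dst[i] = (char) (src[i] & 0xFF);
        }
    }

    public static void main(String[] args) {
        char[] text = "caf\u00e9".toCharArray();
        byte[] packed = new byte[text.length];
        int n = compress(text, packed, text.length);
        char[] back = new char[n];
        inflate(packed, back, n);
        System.out.println(new String(back).equals("caf\u00e9"));
    }
}
```

Both loops touch one element per iteration, which is why long strings are the case where a vector loop can be expected to help most.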
Dr. Lutz Schmidt | SAP JVM | PI SAP CP Core | T: +49 (6227) 7-42834