RFR: 8274242: Implement fast-path for ASCII-compatible CharsetEncoders on x86

Tobias Hartmann thartmann at openjdk.java.net
Mon Sep 27 06:40:06 UTC 2021

On Tue, 21 Sep 2021 21:58:48 GMT, Claes Redestad <redestad at openjdk.org> wrote:

> This patch extends the `ISO_8859_1.implEncodeISOArray` intrinsic on x86 to also work for ASCII encoding, which makes, for example, the `UTF_8$Encoder` perform on par with (or outperform) similarly getting charset-encoded bytes from a String. The former took a small performance hit in JDK 9, and the latter improved greatly in the same release.
> Extending the `EncodeIsoArray` intrinsics on other platforms should be possible, but I'm unfamiliar with the macro assembler in general, and unlike the x86 intrinsic they don't use a simple vectorized mask to implement the latin-1 check. For example, aarch64 seems to filter out the low bytes and then check whether any bits are set in the high bytes. Clever, but very different from the 0xFF80 2-byte mask that an ASCII test wants.
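
For readers unfamiliar with the mask test mentioned in the quoted description, here is a minimal scalar sketch in Java. It is not the HotSpot intrinsic, which is vectorized and written in macro assembler; the class `MaskCheckSketch` and the helper `encodeWhileFits` are hypothetical names introduced only for illustration. It assumes the usual per-char masks: 0xFF00 for the latin-1 check and 0xFF80 for the ASCII check.

```java
// Scalar illustration only; names are hypothetical and not taken from the patch.
class MaskCheckSketch {
    // A char fits in latin-1 if none of its high 8 bits are set.
    static final int LATIN1_MASK = 0xFF00;
    // A char fits in ASCII if none of its high 9 bits are set.
    static final int ASCII_MASK = 0xFF80;

    // Copies chars to bytes until one fails the mask test.
    // Returns the number of chars encoded.
    static int encodeWhileFits(char[] src, byte[] dst, int len, int mask) {
        int i = 0;
        for (; i < len; i++) {
            char c = src[i];
            if ((c & mask) != 0) {
                break; // not representable in the target charset
            }
            dst[i] = (byte) c;
        }
        return i;
    }

    public static void main(String[] args) {
        // 'a', 'b', 'c', U+00E9 (latin-1 only), U+20AC (neither)
        char[] src = "abc\u00e9\u20ac".toCharArray();
        byte[] dst = new byte[src.length];
        System.out.println(encodeWhileFits(src, dst, src.length, LATIN1_MASK)); // prints 4
        System.out.println(encodeWhileFits(src, dst, src.length, ASCII_MASK));  // prints 3
    }
}
```

The only difference between the two checks is the mask constant, which is presumably why a single mask-based loop can serve both encodings.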

Very nice. The changes look good to me; I just added some minor comments.

Should we remove the "iso" part from the method/class names?

src/hotspot/cpu/x86/x86_32.ad line 12218:

> 12216: instruct encode_ascii_array(eSIRegP src, eDIRegP dst, eDXRegI len,
> 12217:                           regD tmp1, regD tmp2, regD tmp3, regD tmp4,
> 12218:                           eCXRegI tmp5, eAXRegI result, eFlagsReg cr) %{

Indentation is wrong.

src/hotspot/cpu/x86/x86_32.ad line 12223:

> 12221:   effect(TEMP tmp1, TEMP tmp2, TEMP tmp3, TEMP tmp4, USE_KILL src, USE_KILL dst, USE_KILL len, KILL tmp5, KILL cr);
> 12222: 
> 12223:   format %{ "Encode array $src,$dst,$len -> $result    // KILL ECX, EDX, $tmp1, $tmp2, $tmp3, $tmp4, ESI, EDI " %}

You might want to change the opto assembly comment to "Encode ascii array" (and to "Encode iso array" above). Same on 64-bit.

src/hotspot/share/opto/intrinsicnode.hpp line 171:

> 169: 
> 170: //------------------------------EncodeISOArray--------------------------------
> 171: // encode char[] to byte[] in ISO_8859_1

Comment should be adjusted to `... in ISO_8859_1 or ASCII`.


Marked as reviewed by thartmann (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/5621

More information about the core-libs-dev mailing list