RFR(M): 8248188: [PATCH] Add HotSpotIntrinsicCandidate and API for Base64 decoding
Roger.Riggs at oracle.com
Wed Aug 19 18:20:13 UTC 2020
For changes obviously performance motivated, it is conventional to run a
JMH perf test to demonstate
the improvement and prove it is worthwhile to add code complexity.
I don't see any existing Base64 JMH tests but they would be in the repo
below or near:
Please contribute a JMH test and results to show the difference.
On 8/19/20 2:10 PM, Corey Ashford wrote:
> Michihiro Horie posted up a new iteration of this webrev for me. This
> time the webrev includes a complete implementation of the intrinsic
> for Power9 and Power10.
> You can find it here:
> Changes in webrev.02 vs. webrev.01:
> * The method header for the intrinsic in the Base64 code has been
> rewritten using the Javadoc style. The clarity of the comments has
> been improved and some verbosity has been removed. There are no
> additional functional changes to Base64.java.
> * The code needed to martial and check the intrinsic parameters has
> been added, using the base64 encodeBlock intrinsic as a guideline.
> * A complete intrinsic implementation for Power9 and Power10 is
> * Adds some Power9 and Power10 assembler instructions needed by the
> intrinsic which hadn't been defined before.
> The intrinsic implementation in this patch accelerates the decoding of
> large blocks of base64 data by a factor of about 3.5X on Power9.
> I'm attaching two Java test cases I am using for testing and
> benchmarking. The TestBase64_VB encodes and decodes randomly-sized
> buffers of random data and checks that original data matches the
> encoded-then-decoded data. TestBase64Errors encodes a 48K block of
> random bytes, then corrupts each byte of the encoded data, one at a
> time, checking to see if the decoder catches the illegal byte.
> Any comments/suggestions would be appreciated.
> - Corey
> On 7/27/20 6:49 PM, Corey Ashford wrote:
>> Michihiro Horie uploaded a new revision of the Base64 decodeBlock
>> intrinsic API for me:
>> It has the following changes with respect to the original one posted:
>> * In the event of encountering a non-base64 character, instead of
>> having a separate error code of -1, the intrinsic can now just return
>> either 0, or the number of data bytes produced up to the point where
>> the illegal base64 character was encountered. This reduces the number
>> of special cases, and also provides a way to speed up the process of
>> finding the bad character by the slower, pure-Java algorithm.
>> * The isMIME boolean is removed from the API for two reasons:
>> - The current API is not sufficient to handle the isMIME case,
>> because there isn't a strict relationship between the number of input
>> bytes and the number of output bytes, because there can be an
>> arbitrary number of non-base64 characters in the source.
>> - If an intrinsic only implements the (isMIME == false) case as
>> ours does, it will always return 0 bytes processed, which will
>> slightly slow down the normal path of processing an (isMIME == true)
>> - We considered adding a separate hotspot candidate for the
>> (isMIME == true) case, but since we don't have an intrinsic
>> implementation to test that, we decided to leave it as a future
>> Comments and suggestions are welcome. Thanks for your consideration.
>> - Corey
>> On 6/23/20 6:23 PM, Michihiro Horie wrote:
>>> Hi Corey,
>>> Following is the issue I created.
>>> I will upload a webrev when you're ready as we talked in private.
>>> Best regards,
>>> Inactive hide details for "Corey Ashford" ---2020/06/24
>>> 09:40:10---Currently in java.util.Base64, there is a
>>> HotSpotIntrinsicCa"Corey Ashford" ---2020/06/24 09:40:10---Currently
>>> in java.util.Base64, there is a HotSpotIntrinsicCandidate and API
>>> for encodeBlock, but no
>>> From: "Corey Ashford" <cjashfor at linux.ibm.com>
>>> To: "hotspot-compiler-dev at openjdk.java.net"
>>> <hotspot-compiler-dev at openjdk.java.net>,
>>> "ppc-aix-port-dev at openjdk.java.net" <ppc-aix-port-dev at openjdk.java.net>
>>> Cc: Michihiro Horie/Japan/IBM at IBMJP, Kazunori Ogata/Japan/IBM at IBMJP,
>>> joserz at br.ibm.com
>>> Date: 2020/06/24 09:40
>>> Subject: RFR(S): [PATCH] Add HotSpotIntrinsicCandidate and API for
>>> Base64 decoding
>>> Currently in java.util.Base64, there is a HotSpotIntrinsicCandidate and
>>> API for encodeBlock, but none for decoding. This means that only
>>> encoding gets acceleration from the underlying CPU's vector hardware.
>>> I'd like to propose adding a new intrinsic for decodeBlock. The
>>> considerations I have for this new intrinsic's API:
>>> * Don't make any assumptions about the underlying capability of the
>>> hardware. For example, do not impose any specific block size
>>> * Don't assume the underlying intrinsic can handle isMIME or isURL
>>> modes, but also let them decide if they will process the data
>>> of the settings of the two booleans.
>>> * Any remaining data that is not processed by the intrinsic will be
>>> processed by the pure Java implementation. This allows the
>>> intrinsic to
>>> process whatever block sizes it's good at without the complexity of
>>> handling the end fragments.
>>> * If any illegal character is discovered in the decoding process, the
>>> intrinsic will simply return -1, instead of requiring it to throw a
>>> proper exception from the context of the intrinsic. In the event of
>>> getting a -1 returned from the intrinsic, the Java Base64 library code
>>> simply calls the pure Java implementation to have it find the error and
>>> properly throw an exception. This is a performance trade-off in the
>>> case of an error (which I expect to be very rare).
>>> * One thought I have for a further optimization (not implemented in
>>> the current patch), is that when the intrinsic decides not to process a
>>> block because of some combination of isURL and isMIME settings it
>>> doesn't handle, it could return extra bits in the return code, encoded
>>> as a negative number. For example:
>>> Illegal_Base64_char = 0b001;
>>> isMIME_unsupported = 0b010;
>>> isURL_unsupported = 0b100;
>>> These can be OR'd together as needed and then negated (flip the sign).
>>> The Base64 library code could then cache these flags, so it will know
>>> not to call the intrinsic again when another decodeBlock is requested
>>> but with an unsupported mode. This will save the performance hit of
>>> calling the intrinsic when it is guaranteed to fail.
>>> I've tested the attached patch with an actual intrinsic coded up for
>>> Power9/Power10, but those runtime intrinsics and arch-specific patches
>>> aren't attached today. I want to get some consensus on the
>>> library-level intrinsic API first.
>>> Also attached is a simple test case to test that the new intrinsic API
>>> doesn't break anything.
>>> I'm open to any comments about this.
>>> Thanks for your consideration,
>>> - Corey
>>> Corey Ashford
>>> IBM Systems, Linux Technology Center, OpenJDK team
>>> cjashfor at us dot ibm dot com
>>> [attachment "decodeBlock_api-20200623.patch" deleted by Michihiro
>>> Horie/Japan/IBM] [attachment "TestBase64.java" deleted by Michihiro
More information about the core-libs-dev