RFR JDK-8164278: java.util.Base64.EncOutputStream/DecInputStream is slower than corresponding version in javax.mail package

Roger Riggs Roger.Riggs at Oracle.com
Tue Feb 6 16:28:56 UTC 2018

Hi Sherman,

On 2/5/2018 9:00 PM, Xueming Shen wrote:
> Hi,
> Please help review the change for  JDK-8164278.
> issue: https://bugs.openjdk.java.net/browse/JDK-8164278
> webrev: http://cr.openjdk.java.net/~sherman/8164278/webrev
Are the reentrant locks necessary?  concurrent reads from streams are 
not usually
synchronized so its the caller that need to synchronize.
If locks are necessary, why no lock for the EncOutputStream buffer?

809: Can the buffer byte array be sized based on linemax?  The field 
declaration should
   be with the other fields at the top of the file.

848: checkNewline compares == with linemax; that works when each byte is 
counted separately
  It seems like it would safer if it was ">=".

957: can sbBuf be shared with lbuf?

More on the input buffering question below.
> jmh.src: http://cr.openjdk.java.net/~sherman/8164278/Base64BM.java
> jmh.result: http://cr.openjdk.java.net/~sherman/8164278/base64.bm
> Base64.Decoder.decode0:
>     Adopted the "similar" optimization approach we took in 
> Base64.Encoder.encode0()
>     to add a "fast path" to decode a block of 4-byte units together 
> (current implementation
>     decodes one single byte per while/loop. The jmh benchmark result 
> indicates a big speed
>     boost  (those decodeArray/Mime/Url results, from 30% to 2 times 
> faster, depends on
>     input size).
> Base64.Encoder.encode0()
>     It appears encode0() was fully optimized in 1.8. Can't get it 
> faster :-) Tried to use
>     Unsafe.getLong/putLong instead of byte by byte access. But it 
> appears the 8-byte
>     "vectorization" does not bring us enough speed up, the performance 
> is the same as the
>     current one. See encode00() at
>     http://cr.openjdk.java.net/~sherman/8164278/webrev.00
> Base64.Encoder.wrap(OutputStream)/EncOutputStream.write():
>     If my memory serves me right, the current implementation was under 
> the assumption that
>     the underlying output stream probably is buffered (if invoker 
> cares). It would be a redundant
>     if EncOutputStream buffers bytes again. It appears this is a wrong 
> assumption. It is much
>     slower to write 4 bytes separately, compared to bundle them 
> together in a byte[4] and write
>     into underlying, even the underlying output stream is a 
> ByteArrayOutputStream.
>     Again, the proposed change is to add a fast path loop, as we do in 
> encode0(), to decode/
>     write a block of 3-byte->4-byte unites. It appears this fast loop 
> can help the compiler to
>     optimize away some boundary checks, therefor is much faster.
>     The jmh result Base64BM.encodeOS suggests the new implementation 
> is almost 4 times faster
>     and is almost the same as java.mail's stream encoder.
> Base64.Decoder.wrap(InputStream)/DecInputStream.read():
>     Same as the approach we take for decode0(), to add a fast path 
> decode block of 4-byte unites
>     together.
>     The jmh result Base64BM.decodeOS (the name probably should be 
> decodeIS, for InputStream,
>     but anyway...) shows the proposed one is 4 times faster than the 
> existing impl and double
>     the  java.mail (Base64BM.decodeOS_javamail) implementation.
>     However, there is a side-effect of adding a buffering mechanism 
> into DecInputStream. The
>     current implementation read bytes from the underlying stream one 
> by one, it never reads
>     more bytes than it needs, which means it should/is supposed to 
> just stop at the last byte
>     that it needs to decode, when there is "=" present in the stream.
>     With buffering, it's possible more bytes (after "=", which 
> indicates "end of base64 stream") might
>     be read/consumed in and buffered.  A concern? if this is indeed a 
> concern, the only alternative
>     might be to add a separate method to support this 
> "faster-buffered-decoder"?
How much buffering is needed to speed it up?  Can the mark/reset 
functions of the underlying
stream be used to backup the stream if it overshoots?
If mark and reset are not supported then read 1 byte at a time.

Regards, Roger

> Thanks,
> Sherman

More information about the core-libs-dev mailing list