Optimizing UUID#fromString(String)

Roger Riggs Roger.Riggs at Oracle.com
Sat Jan 27 20:24:27 UTC 2018

Hi Jon,

This looks promising, based on the cited performance improvements.

Can you review a similar thread and set of improvements from 2013 to see
whether any secondary considerations have already been raised?

Please include the patch in the body of the email or attach it to the 
email (with a .txt extension)
to meet OpenJDK IP requirements.

Compatibility requirements would rule out tightening the acceptable 
format, so a fallback
to a fully compatible parse would be needed.
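
As a rough illustration of that shape (not a proposed patch), here is a minimal sketch, assuming a hypothetical wrapper class UuidParseSketch with a parse method and a hexValue helper. None of these names come from the JDK or from Jon's library. It takes the fixed-position path only for 36-character strings with dashes in the canonical positions and ASCII hex digits everywhere else. Anything else is handed unchanged to the existing UUID.fromString, so the set of accepted inputs stays the same.

```java
import java.util.UUID;

// Hypothetical sketch: fast path for canonical strings, existing parser for the rest.
final class UuidParseSketch {

    // Value of one ASCII hex digit, or -1 if the character is not a hex digit.
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    static UUID parse(String s) {
        if (s.length() == 36
                && s.charAt(8) == '-' && s.charAt(13) == '-'
                && s.charAt(18) == '-' && s.charAt(23) == '-') {
            long msb = 0L;
            long lsb = 0L;
            int digits = 0;
            boolean allHex = true;
            for (int i = 0; i < 36; i++) {
                if (i == 8 || i == 13 || i == 18 || i == 23) {
                    continue; // dash positions were checked above
                }
                int v = hexValue(s.charAt(i));
                if (v < 0) {
                    allHex = false; // let the existing parser decide what to do
                    break;
                }
                if (digits < 16) {
                    msb = (msb << 4) | v; // first 16 digits fill the high 64 bits
                } else {
                    lsb = (lsb << 4) | v; // last 16 digits fill the low 64 bits
                }
                digits++;
            }
            if (allHex) {
                return new UUID(msb, lsb);
            }
        }
        // Fallback: unchanged behavior for short blocks, bad input, and so on.
        return UUID.fromString(s);
    }
}
```

With this arrangement, a string such as "1-2-3-4-5", which the current parser accepts, still goes through the existing code path and produces the same UUID as before.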

Thanks, Roger

On 1/27/2018 11:05 AM, Jon Chambers wrote:
> Hello!
> I've recently had reason to take a look at performance around parsing and
> stringifying UUIDs. In exploring the space, I identified some opportunities
> to optimize the implementation of UUID#fromString as it currently exists (
> http://hg.openjdk.java.net/jdk/jdk/file/fd237da7a113/src/java.base/share/classes/java/util/UUID.java#l196
> ).
> Because UUID strings are of a known structure and length (32 hexadecimal
> digits and four dashes) and because UUIDs are exactly 128 bits in length,
> we know exactly how each character in a UUID string maps to bits in the
> parsed UUID. We always know, for example, that the first character in a
> UUID string maps to the four highest bits in the UUID, the second character
> maps to the four bits below that, and so on.
> With that knowledge, we can cut out a lot of the generality and
> bounds-checking we'd normally expect of a string-to-number parser. I've
> built an implementation with that in mind:
> https://github.com/jchambers/fast-uuid/blob/master/src/main/java/com/eatthepath/uuid/FastUUID.java#L108.
> In benchmarks (
> https://github.com/jchambers/fast-uuid/blob/master/benchmark/src/main/java/com/eatthepath/UUIDBenchmark.java#L55-L63),
> this implementation is about six times faster than the current JDK
> implementation (9.0.4+11) and 14 times faster than the implementation in
> 1.8.
> The experimental implementation is more strict about UUID format (the
> current JDK implementation allows for variable-length blocks of hex digits
> between dashes while the experimental one doesn't), and I'll defer to you
> folks as to whether its handling of technically-malformed UUID strings is
> acceptable. As discussed via Twitter (
> https://twitter.com/cl4es/status/956308599277486080), we might consider
> using the fixed-length parsing approach if we know the UUID string is
> exactly 36 characters long and fall back to the looser parser otherwise. I
> also recognize that this is partially reinventing the wheel when it comes
> to parsing hex strings, and the tradeoff between consistency and
> performance is certainly worthy of consideration.
> Regardless, I wanted to call this optimization opportunity to your
> attention, and would be happy to offer a proper patch if this seems like a
> worthwhile change.
> Cheers, and thank you for your consideration!
> -Jon
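
The fixed character-to-bit mapping described in the quoted message can be checked directly against the existing parser. The following is only a small sketch, assuming a hypothetical helper class NibbleMapping (the names nibbleAt and mappingHolds are illustrative, not part of any library). It reads the four bits at a given hex-digit position of a UUID and compares them with the digit at the same position in the canonical string.

```java
import java.util.UUID;

// Hypothetical sketch: verifies that hex digit i of a canonical UUID string
// corresponds to the i-th group of four bits, counting from the top.
final class NibbleMapping {

    // Four-bit value at hex-digit position index (0..31, 0 = most significant).
    static int nibbleAt(UUID uuid, int index) {
        long half = index < 16
                ? uuid.getMostSignificantBits()
                : uuid.getLeastSignificantBits();
        int shift = 60 - 4 * (index % 16);
        return (int) ((half >>> shift) & 0xF);
    }

    // Assumes a canonical 36-character string (32 hex digits, four dashes).
    static boolean mappingHolds(String canonical) {
        UUID uuid = UUID.fromString(canonical);
        String hex = canonical.replace("-", "");
        for (int i = 0; i < 32; i++) {
            if (Character.digit(hex.charAt(i), 16) != nibbleAt(uuid, i)) {
                return false;
            }
        }
        return true;
    }
}
```

For example, the first character of "f0000000-0000-0000-0000-000000000001" is 'f', and nibbleAt for position 0 of the parsed UUID yields 15, the four highest bits.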

More information about the core-libs-dev mailing list