review request for 6798511/6860431: Include functionality of Surrogate in Character
martinrb at google.com
Sun Mar 21 16:23:28 UTC 2010
On Sun, Mar 21, 2010 at 04:28, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> On Sat, Mar 20, 2010 at 17:13, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> I don't think it's a performance problem in the real world.
> Hm, if someone uses:
> if (Character.isBMPCodePoint(codePoint))
> else if (Character.isSupplementaryCodePoint(codePoint)) // instead
> he will lose up to 50% performance, as you can see on my benchmark on
Only if their data is full of supplementary characters.
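
To make the branch-order point concrete, here is a minimal sketch, not code from the patch under review. The class and method names are hypothetical, and the BMP test is written inline as a shift so the example does not depend on the final name of the proposed method. With mostly-BMP input the first branch is almost always taken, so the cost of the second test rarely matters.

```java
public class CodePointBranchOrder {
    // Hypothetical helper: counts BMP and supplementary code points in s,
    // testing the (usually far more common) BMP case first.
    static int[] countByKind(String s) {
        int bmp = 0;
        int supplementary = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp >>> 16 == 0) {
                // Plane 0: the Basic Multilingual Plane.
                bmp++;
            } else if (Character.isSupplementaryCodePoint(cp)) {
                // Planes 1..16: reached only for non-BMP input.
                supplementary++;
            }
            // Advance by 1 char for BMP, 2 chars for a surrogate pair.
            i += Character.charCount(cp);
        }
        return new int[] { bmp, supplementary };
    }
}
```
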
> We don't usually put such performance information in the javadoc.
> In class StringBuilder:
> "Where possible, it is recommended that this class be used in preference to
> StringBuffer as it will be faster under most implementations."
> Note that these operations may execute in time proportional to the index
> value for some implementations (the LinkedList class, for example).
> In other words, an invocation of this method of the form
> src.get(dst, off, len) has exactly the same effect as the loop
>     for (int i = off; i < off + len; i++)
>         dst[i] = src.get();
> except that it first checks that there are sufficient bytes in this buffer
> and it is potentially much more efficient.
In the examples above, performance is a raison d'être of the API,
something real users should consider when choosing an API.
> Anyway, even if isSupplementaryCodePoint() is used in isolation, my code will
> help the JIT use 2-byte shifted addressing and a shorter 2-byte immediate
> value for the compare, but yes, the JIT should be able to catch that without
> this help. But for that case, we could stay with the old implementations too
> for isBMPCodePoint and isValidCodePoint.
Again, performance with BMP characters is infinitely more important
than performance with supplementary characters.
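
For readers following the thread, the shift-based checks being discussed can be sketched roughly as below. This is an illustration under assumptions, not necessarily the exact code in the patch: the class and method names are hypothetical, and the only JDK members used are Character.MAX_CODE_POINT and the unsigned shift operator. Each test reduces to a comparison on the plane number (codePoint >>> 16), which also rejects negative values because the unsigned shift turns them into large positive planes.

```java
public class CodePointChecks {
    // Number of Unicode planes: (0x10FFFF + 1) >>> 16 == 0x11.
    private static final int PLANE_COUNT = (Character.MAX_CODE_POINT + 1) >>> 16;

    // True for U+0000..U+FFFF (plane 0).
    static boolean isBmp(int codePoint) {
        return codePoint >>> 16 == 0;
    }

    // True for U+0000..U+10FFFF (planes 0..16).
    static boolean isValid(int codePoint) {
        return (codePoint >>> 16) < PLANE_COUNT;
    }

    // True for U+10000..U+10FFFF (planes 1..16).
    static boolean isSupplementary(int codePoint) {
        int plane = codePoint >>> 16;
        return plane != 0 && plane < PLANE_COUNT;
    }
}
```

The sketch can be checked against the existing Character.isValidCodePoint and Character.isSupplementaryCodePoint methods at the boundary values (0xFFFF, 0x10000, 0x10FFFF, 0x110000) and for negative inputs.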