Unicode script support in Regex and Character class

Ulf Zibis Ulf.Zibis at gmx.de
Mon May 10 17:53:49 UTC 2010

On 10.05.2010 03:05, Xueming Shen wrote:
> Ulf,
> Can you be more specific? I'm not sure I understand your question. 
> What "buffering"
> are we talking here?

In http://cr.openjdk.java.net/~sherman/6945564_6948903/webrev,
I think the byte[] ba buffer in initNamePool() could be dropped, as you 
could read directly from dis.

In http://cr.openjdk.java.net/~sherman/script/webrev.00/:
         wordPool = new String(pool, "iso-8859-1").toCharArray();
This copies the data three times: first into pool[], second into the 
String's internal value[], and third into wordPool.
Instead you could read each char directly:
do {
     wordPool[i++] = (char)dis.read();
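
A minimal sketch of that direct read, assuming the pool length is known 
before reading. The class name WordPoolReader, the method readWordPool and 
the length parameter are hypothetical and not taken from the webrev. Each 
byte is simply widened to a char, which gives the same result as 
ISO-8859-1 decoding, so neither the intermediate byte[] nor the String is 
needed.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

final class WordPoolReader {
    // Hypothetical helper: reads `length` ISO-8859-1 bytes straight into a char[].
    static char[] readWordPool(DataInputStream dis, int length) throws IOException {
        char[] wordPool = new char[length];
        for (int i = 0; i < length; i++) {
            int b = dis.read();          // 0..255, or -1 at end of stream
            if (b < 0) {
                throw new EOFException("word pool truncated at index " + i);
            }
            wordPool[i] = (char) b;      // ISO-8859-1 maps each byte to U+0000..U+00FF
        }
        return wordPool;
    }
}
```

The explicit end-of-stream check is an addition of this sketch; without it 
a truncated resource would silently yield (char)-1 values.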

             startCP = dHead.readInt();
             numCP = dHead.readShort() & 0xffff;
                 prefixOff = dIndex.readShort() & 0xffff;
                 int len = dIndex.read() & 0xff;
                     wordOff[off++] = (char)dIndex.readShort();
If you folded index[] into head[], you could also read these values 
directly from dis here.
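
A rough sketch of what reading a folded record directly from the stream 
might look like. The record layout (head fields followed immediately by 
the index fields) and the names FoldedHeaderDemo and readRecord are 
assumptions for illustration only, not the layout used in the webrev.

```java
import java.io.DataInputStream;
import java.io.IOException;

final class FoldedHeaderDemo {
    // Hypothetical record layout: int startCP, u16 numCP, u16 prefixOff, u8 len.
    static int[] readRecord(DataInputStream dis) throws IOException {
        int startCP = dis.readInt();
        int numCP = dis.readUnsignedShort();     // same value as readShort() & 0xffff
        int prefixOff = dis.readUnsignedShort();
        int len = dis.readUnsignedByte();        // throws EOFException at end of stream
        return new int[] { startCP, numCP, prefixOff, len };
    }
}
```

Note that read() & 0xff turns the end-of-stream value -1 into 255, whereas 
readUnsignedByte() reports a truncated stream with an EOFException.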

             wordOff = new char[index.length];
If you initialized wordOff to its true final size, you could drop:
             wordOff = Arrays.copyOf(wordOff, off);
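
One way this could work, sketched under the assumption that the data file 
stores the number of offsets up front. That count field, and the names 
WordOffReader and readWordOff, are hypothetical; the webrev may have no 
such field.

```java
import java.io.DataInputStream;
import java.io.IOException;

final class WordOffReader {
    // Hypothetical layout: an unsigned 16-bit count, then that many 16-bit offsets.
    static char[] readWordOff(DataInputStream dis) throws IOException {
        int count = dis.readUnsignedShort();
        char[] wordOff = new char[count];        // exact size, so no Arrays.copyOf later
        for (int off = 0; off < count; off++) {
            wordOff[off] = dis.readChar();       // same bits as (char) dis.readShort()
        }
        return wordOff;
    }
}
```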

Additionally, I'm wondering about your fondness for while loops.
In most cases I would prefer for loops, with the relevant variables 
declared in the for statement.
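
To illustrate the style point with a generic example (not code from the 
webrev; LoopStyleDemo and both method names are made up here), the two 
methods below do the same thing:

```java
final class LoopStyleDemo {
    static int sumWithWhile(int[] values) {
        int sum = 0;
        int i = 0;                                  // index stays in scope after the loop
        while (i < values.length) {
            sum += values[i];
            i++;                                    // step is separated from the condition
        }
        return sum;
    }

    static int sumWithFor(int[] values) {
        int sum = 0;
        for (int i = 0; i < values.length; i++) {   // init, condition and step in one place
            sum += values[i];
        }
        return sum;
    }
}
```

In the for variant the index is scoped to the loop and cannot be reused by 
accident further down.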


> Ulf Zibis wrote:
>> Sherman, I don't understand why you use so much buffering.
>> The InputStream from getResourceAsStream, and I believe 
>> InflaterInputStream too, is already buffered.
>> My understanding until now was that access to buffered byte streams 
>> is as fast as access to bare byte arrays.
>> Am I wrong?
>> -Ulf

More information about the core-libs-dev mailing list