java-nio-charset-enhanced -- Milestone 4 is released

Martin Buchholz martinrb at
Sun Mar 29 18:27:26 UTC 2009

On Fri, Mar 27, 2009 at 15:44, Ulf Zibis <Ulf.Zibis at> wrote:
> On 27.03.2009 22:49, Martin Buchholz wrote:
>> Again, Ulf, I love the sort of stuff you're doing.
> Much thanks again for the flowers. :-)
>> I hope to be able to contribute some engineering
>> to your effort myself someday.
>> In the meantime, we need some infrastructure to guarantee that
>> the behavior of the charsets is completely unchanged as we optimize.
>> I have some code left behind at Sun to do that, i.e. compare different
>> JDKs w.r.t charset compatibility.
>> Hopefully Sun engineers can resurrect that code and perhaps put it
>> into a public mercurial repo somewhere.
>> Another approach is to take the code in tests like my
>> Find{En,De} tests which compare direct
>> vs. regular buffers, and retarget it to compare two different jdks.
> I have also coded such a test for full-scan comparison:
> See CharsetsTest + LegacyCharset (it retrieves the legacy charsets by
> reflection directly from rt.jar of the patched JDK) here:
> It cost me several nights to get all code points equal, given my special
> mixture of range-limited direct maps and a full-range indirect map.

It does look like you've written a lot of good tests.
It would be nice not to have an explicit list of charsets in
I guess it's a list of charsets subject to single-byte testing?
If so, better documentation would be good.
Charsets named ISO-8859-* are guaranteed to be single-byte, so
it might be good to include those programmatically,
by filtering Charset.availableCharsets().
Why include EUC-JP but not UTF-8?
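
To illustrate the filtering idea, here is a minimal sketch. The class and
method names (SingleByteCharsets, iso8859Charsets) are made up for this
example and are not taken from Ulf's tests; only
Charset.availableCharsets() is the real JDK call.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any existing test suite.
public class SingleByteCharsets {

    // Collect every installed charset whose canonical name starts with
    // "ISO-8859-". availableCharsets() is keyed by canonical name.
    static List<Charset> iso8859Charsets() {
        List<Charset> result = new ArrayList<>();
        for (Map.Entry<String, Charset> e : Charset.availableCharsets().entrySet()) {
            if (e.getKey().startsWith("ISO-8859-")) {
                result.add(e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Which charsets get printed depends on the installed JDK.
        for (Charset cs : iso8859Charsets()) {
            System.out.println(cs.name());
        }
    }
}
```

The resulting list could then be merged with whatever explicit list is
still needed for charsets that cannot be recognized by name.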

It's probably still a good idea to get inspiration from my
Find*Bugs tests which test many other things like
complete compatibility of exceptions in case of invalid input.
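
As a rough illustration of that kind of check (this is not the code from
those tests; the class DirectVsHeapDecode and its methods are invented
for this sketch), one can decode the same bytes from a heap buffer and
from a direct buffer and compare the outcome, including the exception
type thrown on invalid input:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

// Hypothetical sketch; compares only the decoded text or the exception class.
public class DirectVsHeapDecode {

    // Decode with REPORT so invalid input surfaces as an exception.
    // Returns the decoded text, or the name of the exception class thrown.
    static String outcome(Charset cs, ByteBuffer in) {
        try {
            CharBuffer out = cs.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(in);
            return "ok:" + out;
        } catch (CharacterCodingException e) {
            return "error:" + e.getClass().getName();
        }
    }

    // True if a heap buffer and a direct buffer with the same contents
    // produce the same outcome for the given charset.
    static boolean sameBehavior(Charset cs, byte[] bytes) {
        ByteBuffer heap = ByteBuffer.wrap(bytes);
        ByteBuffer direct = ByteBuffer.allocateDirect(bytes.length);
        direct.put(bytes);
        direct.flip();
        return outcome(cs, heap).equals(outcome(cs, direct));
    }

    public static void main(String[] args) {
        Charset utf8 = Charset.forName("UTF-8");
        byte[] invalid = { (byte) 0xC3, (byte) 0x28 }; // lead byte without continuation
        System.out.println(sameBehavior(utf8, invalid));
    }
}
```

Retargeting this to two JDKs would mean producing the outcome strings in
separate runs and diffing them, rather than comparing inside one process.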

>> It's too difficult to give credit to external contributors.
>> One problem is that the Contributed-by: line is a red flag to
>> lawyers and other folks that might cause the legality of the change
>> to be questioned without end.  Let's try to get Ulf a proper commit bit
>> and make sure the legal questions come to an end.
> Aren't "Contributed-by" and "author" comments usual practice in open source
> products?
> Even in Sun's JRL source the author was mentioned. I think the lawyers
> at Sun should rethink that subject.
> Ok, we will see ...

The problem is more human.  One would like to give credit for good ideas
or good analysis, but the only official way to give credit in a commit
message is via a simple
Contributed-by: email-address
which raises legal doubts even when there is no copyrighted material.
I guess one can abuse the Summary: field to squeeze in thank-yous,
but it's pretty obvious that you are circumventing the process.


