Rewrite of IBM doublebyte charsets

Ulf Zibis Ulf.Zibis at
Thu May 21 20:24:36 UTC 2009

On 21.05.2009 21:52, Xueming Shen wrote:
> Ulf Zibis wrote:
>> On 21.05.2009 01:48, Xueming Shen wrote:
>>> Thanks for the 5 minutes:-)
>>> Your FindXYZcoderBugs tests are indeed very helpful to catch most of 
>>> the "inconsistent" behaviors
>>> between different paths by feeding the "random" inputs.
>>> The is diffing the behaviors of old implementation 
>>> and new implementation
>>> with all "decode-able" bytes and "encode-able" it gives 
>>> us some of the guarantee.
>> Why do we *try* to stick to the old behaviour in case of malformed and/or 
>> unmappable input, if we don't diff new against old?
>> Then we could also *try* to treat malformed and/or unmappable input 
>> as accurately as possible.
>> As you mentioned, most users don't distinguish between those, so they 
>> won't be affected. On the other hand, users who did make this 
>> distinction would probably be happy to get more accurate results, 
>> even if they are not identical to the previous results.
> This is the approach/plan I decided to go with to achieve the goals I 
> listed last time. Sticking with the old behavior for
> now makes it easy, or rather possible, to push in such a big change.

I can happily agree with this. Thanks for your further detailed 
explanation. :-)
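
For readers following the thread, the "diff old against new" idea discussed above can be sketched roughly as below. This is only an illustration under stated assumptions: `DecoderDiff`, `outcome` and `countDifferences` are hypothetical names, not the actual test from the JDK workspace, and the two `Charset` arguments stand in for the old and the new implementation. Unlike the diff described above, which covers the decodable bytes, this sketch also compares the `CoderResult` and the consumed byte count, so differences in malformed/unmappable handling would show up too.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;

// Hypothetical helper, not part of the JDK or of the tests mentioned in this thread.
class DecoderDiff {

    // Decode one input and summarize what happened: the chars produced,
    // the final CoderResult, and how many bytes were consumed.
    static String outcome(Charset cs, byte[] input) {
        CharsetDecoder dec = cs.newDecoder();
        ByteBuffer in = ByteBuffer.wrap(input);
        CharBuffer out = CharBuffer.allocate(8);
        CoderResult cr = dec.decode(in, out, true);
        out.flip();
        return out.toString() + "|" + cr + "|" + in.position();
    }

    // Feed every 1-byte and 2-byte sequence to both charsets and count
    // the inputs on which their observable behavior differs.
    static int countDifferences(Charset oldCs, Charset newCs) {
        int diffs = 0;
        for (int b1 = 0; b1 < 256; b1++) {
            byte[] one = { (byte) b1 };
            if (!outcome(oldCs, one).equals(outcome(newCs, one))) {
                diffs++;
            }
            for (int b2 = 0; b2 < 256; b2++) {
                byte[] two = { (byte) b1, (byte) b2 };
                if (!outcome(oldCs, two).equals(outcome(newCs, two))) {
                    diffs++;
                }
            }
        }
        return diffs;
    }

    public static void main(String[] args) {
        // Stand-in charsets; in a real comparison these would be the old
        // and the new implementation of the same IBM double-byte charset.
        Charset oldCs = Charset.forName("ISO-8859-1");
        Charset newCs = Charset.forName("ISO-8859-1");
        System.out.println("differences: " + countDifferences(oldCs, newCs));
    }
}
```

Restricting the comparison to inputs that the old decoder accepts without error would give the narrower guarantee described in the quoted text.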

> You don't want to get stuck on this kind of "arguable"
> issue when it's not the main goal of the project, or to detour into 
> defending/arguing whether or not this is the "correct"
> change and, if it is correct, whether breaking compatibility is the 
> right thing to do and whether there are people who depend on it. If
> you are just starting a new implementation, you definitely should do all the 
> "right" things. It is a different story when you
> maintain an existing product. As I said last time, with this 
> change, the implementation and the data structures are
> now really open and ready for further optimization (instead of looking 
> at a big chunk of data without knowing where
> it comes from). You can now work on the issues, if any, one by one, 
> including starting the argument about which error
> should be "malformed" and which one should be "unmapped". We're (I'm) 60% 
> done after this:-)
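
As background for the "malformed" versus "unmapped" question, here is a minimal sketch of how the two error kinds surface through the public `java.nio.charset` API. It assumes US-ASCII as a stand-in charset (not one of the IBM double-byte charsets under discussion), and `CoderResultKinds` and `classifyEncode` are hypothetical names chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

// Hypothetical illustration class, not code from the changeset discussed here.
class CoderResultKinds {

    // Encode the input with US-ASCII and report which kind of result
    // the encoder returned for it.
    static String classifyEncode(String input) {
        CharsetEncoder enc = Charset.forName("US-ASCII").newEncoder();
        ByteBuffer out = ByteBuffer.allocate(16);
        // The three-argument encode() reports errors through its return
        // value (the default error action is REPORT), it does not throw.
        CoderResult cr = enc.encode(CharBuffer.wrap(input), out, true);
        if (cr.isMalformed()) {
            return "malformed";   // input is not a legal char sequence
        }
        if (cr.isUnmappable()) {
            return "unmappable";  // legal input, but no mapping in this charset
        }
        return "ok";
    }

    public static void main(String[] args) {
        System.out.println(classifyEncode("abc"));      // plain ASCII
        System.out.println(classifyEncode("\u00e9"));   // e with acute accent, not in US-ASCII
        System.out.println(classifyEncode("\uDC00"));   // lone low surrogate
    }
}
```

A caller that only checks `cr.isError()` sees no difference between the two cases, which matches the remark earlier in the thread that most users do not distinguish them.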


More information about the core-libs-dev mailing list