Rewrite of IBM doublebyte charsets
Ulf.Zibis at gmx.de
Thu May 21 13:24:36 PDT 2009
On 21.05.2009 21:52, Xueming Shen wrote:
> Ulf Zibis wrote:
>> On 21.05.2009 01:48, Xueming Shen wrote:
>>> Thanks for the 5 minutes:-)
>>> Your FindXYZcoderBugs tests are indeed very helpful for catching most of
>>> the "inconsistent" behaviors
>>> between different paths by feeding in "random" inputs.
>>> TestIBMDB.java diffs the behavior of the old implementation
>>> and the new implementation
>>> with all "decode-able" bytes and "encode-able" chars... so it gives
>>> us some guarantee.
>> Why do we *try* to stick to the old behaviour in the case of malformed and/or
>> unmappable input, if we don't diff new against old?
>> Then we could also *try* to treat malformed and/or unmappable input
>> as accurately as possible.
>> As you mentioned, most users don't distinguish between those, so they
>> won't be affected. On the other hand, users who did make this
>> distinction would probably be happy to get more accurate results,
>> even if not identical to the previous results.
> This is the approach/plan I decided to go with to achieve the goals I
> listed last time. Sticking with the old behavior for
> now makes it easy, or rather possible, to push in such a big change.
I can happily agree with this. Thanks for your further detailed
> You don't want to get stuck on this kind of "arguable"
> issue when it's not the main goal of the project, and detour to
> defend/argue whether or not this is the "correct"
> change, and, if it is correct, whether breaking compatibility is the
> right thing to do, and whether there are people who depend on it. If
> you are just starting a new implementation, you definitely should do all the
> "right" things. It is a different story when you
> maintain an existing product. As I said last time, with this
> change, the implementation and the data structures are
> now really open and ready for further optimization (instead of looking
> at a big chunk of data without knowing where
> it comes from). You can now work on the issues, if any, one by one,
> including starting the argument about which error
> should be "malformed" and which one should be "unmappable". We're (I'm) 60%
> done after this:-)
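
For readers following the "malformed" versus "unmappable" distinction argued over above, here is a minimal sketch of how java.nio.charset reports the two cases through CoderResult. It is only an illustration: the class name MalformedVsUnmappable and the helper classify are hypothetical, and US-ASCII stands in for the IBM double-byte charsets under discussion, which may not be present in every runtime.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK or of the patch discussed above.
final class MalformedVsUnmappable {

    // Encodes the input with US-ASCII (an assumed stand-in charset) and
    // reports which kind of error, if any, the encoder signals first.
    static String classify(String input) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();
        // US-ASCII needs at most one byte per char; leave some slack.
        ByteBuffer out = ByteBuffer.allocate(input.length() + 8);
        // Calling encode(in, out, endOfInput) directly returns the
        // CoderResult instead of acting on it, so it can be inspected.
        CoderResult result = encoder.encode(CharBuffer.wrap(input), out, true);
        if (result.isMalformed()) {
            return "malformed";   // input is not a legal UTF-16 sequence
        }
        if (result.isUnmappable()) {
            return "unmappable";  // legal character, but not in the charset
        }
        return "ok";
    }

    public static void main(String[] args) {
        System.out.println(classify("abc"));     // plain ASCII text
        System.out.println(classify("\u00e9"));  // e with acute accent
        System.out.println(classify("\uD800"));  // lone high surrogate
    }
}
```

A caller that only checks result.isError() sees both cases as the same failure, which matches the point above that most users do not distinguish between them; only code that branches on isMalformed() or isUnmappable() would notice a reclassification.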