LDAP URI (Re: Question about 6961765)
zhouyx at linux.vnet.ibm.com
Fri Mar 2 09:32:59 UTC 2012
Thank you very much for this detailed explanation.
Since this enhancement is a simple modification, and we ran into an
issue because of it, please re-open the bug.
On Fri, Mar 2, 2012 at 4:09 PM, Weijun Wang <weijun.wang at oracle.com> wrote:
> The LDAP URL spec (RFC 4516, Section 2.1) says that only <reserved>,
> <unreserved>, and <pct-encoded> chars can be used, which does not include
> general non-ASCII Unicode. UrlUtil deals with these chars correctly.
> The javadoc of URLDecoder also allows only these characters, and it
> says:
> There are two possible ways in which this decoder could deal with
> illegal strings. It could either leave illegal characters alone or
> it could throw an IllegalArgumentException. Which approach the
> decoder takes is left to the implementation.
> Now the Oracle implementation of the class "leaves illegal characters
> alone", so a Unicode char stays Unicode and you get the correct result.
> In this sense, UrlUtil is not as good as URLDecoder: it neither leaves them
> alone nor throws an exception. Therefore, maybe it's better to use
> URLDecoder here, but before any spec officially supports "other" characters
> (a category defined in the URI class, including non-ASCII non-control
> non-space Unicode chars), it's better to use 100% legal chars in an LDAP URL.
> If you have a strong request, I can re-open the bug.
> http://docs.oracle.com/javase/7/docs/api/java/net/URLDecoder.html
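The URLDecoder behaviour described above can be checked with a small sketch. The class and method names here are hypothetical, and the DN fragment is made up for illustration. It only exercises java.net.URLDecoder; UrlUtil is an internal JNDI class and is not called. Note also that URLDecoder turns '+' into a space, which is form-decoding behaviour rather than anything RFC 4516 asks for, so this is not a drop-in LDAP URL decoder.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class: shows how URLDecoder treats a raw non-ASCII char
// versus the percent-encoded UTF-8 form of the same char.
class UrlDecoderLeniency {

    // Wraps URLDecoder.decode so callers need not handle the checked exception.
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "cn=\u3070";        // unencoded non-ASCII char, not legal in an LDAP URL
        String encoded = "cn=%E3%81%B0"; // the same char as percent-encoded UTF-8 bytes

        // The Oracle/OpenJDK implementation leaves the raw char alone.
        System.out.println(decode(raw).equals("cn=\u3070"));
        // The percent-encoded form decodes to the same string.
        System.out.println(decode(encoded).equals("cn=\u3070"));
    }
}
```

Both lines print true on the Oracle/OpenJDK implementation. As the javadoc quoted above says, another implementation could throw IllegalArgumentException for the raw form instead.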
> On 03/02/2012 02:15 PM, Sean Chou wrote:
>> But UrlUtil.decode(DN, "UTF8") and URLDecoder.decode(DN, "UTF8") return
>> different strings. If the DN has an invalid encoding, why can
>> URLDecoder.decode(DN, "UTF8") decode it?
>> On Thu, Mar 1, 2012 at 4:21 PM, Weijun Wang <weijun.wang at oracle.com> wrote:
>> Added some evaluation. Copied here:
>> The URL in the testcase has an invalid encoding. Its Unicode characters
>> must be encoded in UTF-8. For example,
>> \u3070 -> \e3\81\b0 -> %5Ce3%5C81%5Cb0
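One way to read the mapping in the evaluation is as two steps: hex-escape each byte of the UTF-8 form with a backslash, then percent-encode the backslash as %5C. The sketch below reproduces that example under this reading. The class and method names are hypothetical, and only the backslash is percent-encoded here; a complete encoder would also have to handle the other characters that RFC 4516 requires to be encoded.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the two-step mapping from the evaluation above.
class DnEscapeSketch {

    // Step 1: write every non-ASCII byte of the UTF-8 form as a backslash
    // followed by two lowercase hex digits; ASCII bytes pass through unchanged.
    static String hexEscapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            if (v < 0x80) {
                sb.append((char) v);
            } else {
                sb.append('\\').append(String.format("%02x", v));
            }
        }
        return sb.toString();
    }

    // Step 2: percent-encode the backslash (assumption: nothing else in the
    // input needs encoding).
    static String percentEncodeBackslash(String s) {
        return s.replace("\\", "%5C");
    }

    public static void main(String[] args) {
        String escaped = hexEscapeNonAscii("\u3070");
        System.out.println(escaped);                         // prints \e3\81\b0
        System.out.println(percentEncodeBackslash(escaped)); // prints %5Ce3%5C81%5Cb0
    }
}
```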
>> On 03/01/2012 03:39 PM, Sean Chou wrote:
>> Hi all,
>> I just encountered this bug:
>> But it is closed as "NOT A BUG" without any comments.
>> Would anyone take a look and comment on it? Thanks.
>> Best Regards,
>> Sean Chou