LDAP URI (Re: Question about 6961765)
weijun.wang at oracle.com
Fri Mar 2 00:09:22 PST 2012
The LDAP URL spec (RFC 4516, section 2.1) says that only <reserved>,
<unreserved>, and <pct-encoded> chars can be used, which does not include
general non-ASCII Unicode. UrlUtil deals with these legal chars correctly.
The javadoc of URLDecoder also only allows these characters, and it says:

    There are two possible ways in which this decoder could deal with
    illegal strings. It could either leave illegal characters alone or
    it could throw an IllegalArgumentException. Which approach the
    decoder takes is left to the implementation.

Now the Oracle implementation of the class "leaves illegal characters
alone", so a Unicode char is still a Unicode char and you get the correct
result.
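
To make the "leave alone" behavior concrete, here is a minimal sketch. The class name DecoderDemo and the sample DN are made up for illustration, and since the javadoc leaves this behavior to the implementation, the result for the raw string is only what I would expect from the Oracle/OpenJDK decoder, not something the spec guarantees.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class DecoderDemo {
    // Thin wrapper around java.net.URLDecoder using UTF-8.
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Illustrative inputs only (not taken from the bug's testcase).
        String raw = "cn=\u3070";         // raw non-ASCII char: illegal in a URL
        String encoded = "cn=%E3%81%B0";  // same char as percent-encoded UTF-8

        // The raw char is expected to pass through untouched;
        // the percent-encoded bytes are decoded to the same char.
        System.out.println(decode(raw).equals("cn=\u3070"));
        System.out.println(decode(encoded).equals("cn=\u3070"));
    }
}
```

Both lines should print true on an implementation that leaves illegal characters alone; another implementation is allowed to throw an IllegalArgumentException for the raw string instead.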
In this sense, UrlUtil is not as good as URLDecoder: it neither leaves
them alone nor throws an exception. Therefore, maybe it's better to use
URLDecoder here. But until a spec officially supports "other"
characters (a category defined in the URI class, including non-ASCII,
non-control, non-space Unicode chars), it's better to use 100% legal
chars in an LDAP URI.
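
One way to get a URI with only legal chars is to percent-encode the UTF-8 bytes of every non-ASCII character before building the URI. The sketch below assumes exactly that; the class LegalLdapUri, the helper pctEncodeNonAscii, and the sample URI are hypothetical names for illustration. It does not escape ASCII characters that also need escaping in an LDAP URL (such as space or '?'), and it uses plain percent-encoding of the UTF-8 octets rather than the backslash-escaped form shown in the bug evaluation.

```java
import java.nio.charset.StandardCharsets;

public class LegalLdapUri {
    // Hypothetical helper: percent-encodes each UTF-8 byte >= 0x80 and
    // copies ASCII bytes unchanged. ASCII chars that are illegal in a
    // URL are NOT handled here.
    static String pctEncodeNonAscii(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;  // treat the byte as unsigned
            if (v < 0x80) {
                sb.append((char) v);
            } else {
                sb.append('%').append(String.format("%02X", v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+3070 is E3 81 B0 in UTF-8.
        System.out.println(pctEncodeNonAscii("ldap://localhost/cn=\u3070"));
        // prints ldap://localhost/cn=%E3%81%B0
    }
}
```

A URI built this way contains only ASCII, so whether the decoder leaves illegal characters alone or rejects them no longer matters.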
If you have a strong request, I can re-open the bug.
On 03/02/2012 02:15 PM, Sean Chou wrote:
> But UrlUtil.decode(DN, "UTF8") and URLDecoder.decode(DN, "UTF8") return
> different strings. If the DN has an invalid encoding, why can
> URLDecoder.decode(DN, "UTF8") decode it?
> On Thu, Mar 1, 2012 at 4:21 PM, Weijun Wang <weijun.wang at oracle.com> wrote:
> Added some evaluation. Copied here:
> The URL in the testcase has an invalid encoding. Its Unicode characters
> must be encoded in UTF-8. For example,
> \u3070 -> \e3\81\b0 -> %5Ce3%5C81%5Cb0
> On 03/01/2012 03:39 PM, Sean Chou wrote:
> Hi all,
> I just encountered this bug:
> <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6961765>.
> But it is closed as "NOT A BUG" without any comments.
> Would anyone take a look and comment on it? Thanks.
> Best Regards,
> Sean Chou