<div dir="ltr">The latest webrev looks great to me!</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Dec 17, 2013 at 10:34 PM, Eric Wang <span dir="ltr">&lt;<a href="mailto:yiming.wang@oracle.com" target="_blank">yiming.wang@oracle.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div text="#000000" bgcolor="#FFFFFF">
    <div>Hi Martin,<br>
      <br>
      I have updated the test to add another 2 surrogates, solitary high
      and solitary low, so 4 possibilities are covered.<br>
      The verbose code has been cleaned based on your suggestion, Can
      you please review again? <br>
      <a href="http://cr.openjdk.java.net/%7Eewang/JDK-8022879/webrev.01/" target="_blank">http://cr.openjdk.java.net/~ewang/JDK-8022879/webrev.01/</a><br>
      <br>
      Thanks,<br>
      Eric<div><div class="h5"><br>
      On 2013/12/18 3:03, Martin Buchholz wrote:<br>
    </div></div></div><div><div class="h5">
    <blockquote type="cite">
      <div dir="ltr">Thanks, Eric.
        <div><br>
        </div>
        <div>This looks good enough to check in.  Optionally:</div>
        <div><br>
        </div>
        <div>Regarding my comment on what surrogate to test, I would
          tend to check 4 possibilities: solitary high surrogate,
          solitary low, high + low, low + high</div>
        <div><br>
        </div>
        <div><br>
        </div>
        <div>The code below is a bit more verbose than necessary.  We
          can remove f by doing</div>
        <div><br>
        </div>
        <div>try {</div>
        <div>  ...</div>
        <div>  throw new ..(&quot;should throw ...&quot;);</div>
        <div>} catch (<span style="color:blue">MalformedInputException |
            UnmappableCharacterException success) {</span><span style="color:blue">}</span></div>
        <div><span style="color:blue"><br>
          </span></div>
        <div><span style="color:blue">en.reset();</span></div>
        <div>
          <pre style><span style="color:blue">  69         boolean f = false;</span>
<span style="color:blue">  70         try {</span>
<span style="color:blue">  71             en.encode(CharBuffer.wrap(surrogate));</span>
<span style="color:blue">  72         } catch (MalformedInputException | UnmappableCharacterException ex) {</span>
<span style="color:blue">  73             f = true;</span>
<span style="color:blue">  74         } finally {</span>
<span style="color:blue">  75             en.reset();</span>
<span style="color:blue">  76         }</span>
<span style="color:blue">  77         if (!f) {</span>
<span style="color:blue">  78             throw new RuntimeException(&quot;testMalformedSurrogate failed with charset &quot; + <a href="http://cs.name" target="_blank">cs.name</a>());</span>
<span style="color:blue">  79         }</span>
</pre>
        </div>
        <div><span style="color:blue"><br>
          </span></div>
        <div><br>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <br>
        <div class="gmail_quote">On Tue, Dec 17, 2013 at 1:18 AM, Eric
          Wang <span dir="ltr">&lt;<a href="mailto:yiming.wang@oracle.com" target="_blank">yiming.wang@oracle.com</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div text="#000000" bgcolor="#FFFFFF">
              <div>Hi Martin,<br>
                <br>
                I have updated the test based on your comments and split
                the old logic into different functions to test legal or
                illegal surrogate pairs. Can you please help to review
                it?<br>
                <a href="http://cr.openjdk.java.net/%7Eewang/JDK-8022879/webrev.01/" target="_blank">http://cr.openjdk.java.net/~ewang/JDK-8022879/webrev.01/</a><br>
                <br>
                For your last comment &quot;reversed surrogate&quot;, I&#39;m not sure
                whether my understanding is correct. Does that mean
                reverse legal surrogate e.g. \uD800\uDC00 to
                \uDC00\uD800 which is kind of malformed surrogate.<br>
                I have covered this case in the test.<br>
                <br>
                Thanks,<br>
                Eric
                <div>
                  <div><br>
                    On 2013/12/14 8:15, Martin Buchholz wrote:<br>
                  </div>
                </div>
              </div>
              <div>
                <div>
                  <blockquote type="cite">
                    <div dir="ltr">I do like this approach.  I think we
                      can have some cleanup to make it a little clearer
                      what is being tested.
                      <div><br>
                      </div>
                      <div>Below, it makes no sense to check canEncode
                        when you are creating a decoder.</div>
                      <div>
                        <pre><span style="color:blue;font-weight:bold">  60             try {</span>
<span style="color:blue;font-weight:bold">  61                 de = charset.newDecoder();</span>
<span style="color:blue;font-weight:bold">  62             } catch (UnsupportedOperationException ex) {</span>
<span style="color:blue;font-weight:bold">  63                 if (canEncode) {</span>
<span style="color:blue;font-weight:bold">  64                     throw ex;</span>
<span style="color:blue;font-weight:bold">  65                 }</span>
<span style="color:blue;font-weight:bold">  66             }</span>
</pre>
                      </div>
                      <div><span style="color:blue;font-weight:bold"><br>
                        </span></div>
                      <div><font color="#0000ff"><b>Once you do this:</b></font></div>
                      <div><span>
                          <pre style="font-weight:normal"><span style="color:blue">  68             if (!canEncode) {</span>
<span style="color:blue">  69                 continue;</span>
<span style="color:blue">  70             }</span>
</pre>
                          <div style="color:blue;font-weight:bold"><span style="color:blue">the loop should no
                              longer test canEncode, but does, obscuring
                              the logic.</span></div>
                          <div style="color:blue;font-weight:bold"><span style="color:blue"><br>
                            </span></div>
                          <div style="color:blue;font-weight:bold"><span style="color:blue">Use standard
                              whitespace:</span></div>
                          <div><span>
                              <pre style="font-weight:normal"><span style="color:blue"> 109                 }finally {</span>
</pre>
                              <div style="color:blue;font-weight:bold"><span style="color:blue">When encoding legal
                                  surrogate pair, should see </span><span style="font-weight:normal">UnmappableCharacterException,

                                  not </span><span style="font-weight:normal">MalformedInputException.</span></div>
                              <div style="color:blue;font-weight:bold"><span style="font-weight:normal">Reverse is
                                  true with illegal surrogate pair.</span></div>
                              <div style="color:blue;font-weight:bold"><span style="font-weight:normal"><br>
                                </span></div>
                              <div style="color:blue;font-weight:bold"><span style="font-weight:normal">should
                                  probably tests isolated, in addition
                                  to reversed, surrogates.</span></div>
                              <div style="color:blue;font-weight:bold"><span style="font-weight:normal"><br>
                                </span></div>
                              <div><span style="color:blue;font-weight:normal">Compare

                                  my old test </span><font color="#0000ff">FindCanEncodeBugs.</font></div>
                            </span></div>
                        </span></div>
                      <div><span style="color:blue;font-weight:bold"><br>
                        </span></div>
                    </div>
                    <div class="gmail_extra"><br>
                      <br>
                      <div class="gmail_quote">On Thu, Dec 12, 2013 at
                        4:51 AM, Eric Wang <span dir="ltr">&lt;<a href="mailto:yiming.wang@oracle.com" target="_blank">yiming.wang@oracle.com</a>&gt;</span>
                        wrote:<br>
                        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                          <div text="#000000" bgcolor="#FFFFFF">
                            <div>Hi Martin &amp; All,<br>
                              <br>
                              Please review the changes for
                              MalformedSurrogates.<br>
                              <a href="http://cr.openjdk.java.net/%7Eewang/JDK-8022879/webrev.00/" target="_blank">http://cr.openjdk.java.net/~ewang/JDK-8022879/webrev.00/</a><br>
                              <br>
                              The logic of changes are<br>
                              1. Iterate all charsets of system to check
                              whether the charset supports encode.<br>
                              2. If yes, check whether it supports
                              surrogate.<br>
                              3. If yes, check whether it rejects
                              malformed surrogate and can en(de)code
                              normal surrogate.<br>
                              <br>
                              Thanks,<br>
                              Eric
                              <div>
                                <div><br>
                                  On 2013/11/8 0:23, Martin Buchholz
                                  wrote:<br>
                                </div>
                              </div>
                            </div>
                            <div>
                              <div>
                                <blockquote type="cite">
                                  <div dir="ltr">I still like my old
                                    idea of iterating over all charsets
                                    and checking their reasonableness
                                    properties.
                                    <div><br>
                                    </div>
                                    <div>Probably all charsets bundled
                                      with the jdk should reject
                                      unpaired surrogates when encoding.
                                       Check it!  Failure to do so might
                                      be considered a security bug.</div>
                                    <div><br>
                                    </div>
                                    <div>I would
                                      use CodingErrorAction.REPORT
                                       instead of REPLACE and examine
                                      that the resulting exception
                                      occurs and has reasonable detail.</div>
                                    <div><br>
                                    </div>
                                    <div>For properly paired surrogates,
                                      check that the charset can encode
                                      the resulting codepoint using
                                      canEncode, and if so, check that
                                      encoding succeeds.</div>
                                  </div>
                                  <div class="gmail_extra"><br>
                                    <br>
                                    <div class="gmail_quote">On Thu, Nov
                                      7, 2013 at 7:22 AM, Eric Wang <span dir="ltr">&lt;<a href="mailto:yiming.wang@oracle.com" target="_blank">yiming.wang@oracle.com</a>&gt;</span>
                                      wrote:<br>
                                      <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                        <div text="#000000" bgcolor="#FFFFFF">
                                          <div lang="x-western">
                                            <div>
                                              <p class="MsoNormal"> H<span style="font-size:12.0pt">i Everyone</span></p>
                                              <p class="MsoNormal"><span style="font-size:12.0pt">I am working on bug </span><span style="font-size:12.0pt"><a href="https://bugs.openjdk.java.net/browse/JDK-8022879" target="_blank">https://bugs.openjdk.java.net/browse/JDK-8022879</a>. 



                                                  <br>
                                                  The test <a href="http://hg.openjdk.java.net/jdk8/tl/jdk/file/44fa6bf42846/test/sun/nio/cs/MalformedSurrogates.java" target="_blank">sun/nio/cs/MalformedSurrogates.java</a>
                                                  doesn&#39;t run if the
                                                  system default
                                                  encoding is UTF-8. But
                                                  unfortunately, UTF-8
                                                  is the default charset
                                                  of most test machines,
                                                  it means the test get
                                                  few chances to be
                                                  executed. <br>
                                                  Another defect is the
                                                  test would failed if
                                                  the default charset is
                                                  UTF-16 or UTF-32 as
                                                  the test doesn&#39;t take
                                                  the 2 charsets into
                                                  consideration.<br>
                                                </span></p>
                                              <p class="MsoNormal"><span style="font-size:12.0pt">The idea of fix  is no matter what system
                                                  charset it is, the
                                                  test should always be
                                                  executed. Here thanks
                                                  Martin&#39;s suggestion
                                                  that instead of
                                                  checking byte size, we
                                                  can use
                                                  CharsetEncoder.canEncode()
                                                  and </span><span style="font-size:12.0pt"><span style="font-size:12.0pt">CharsetEncoder.onMalformedInput(CodingErrorAction.REPLACE)



                                                  </span>to check and
                                                  replace malformed
                                                  chars.<br>
                                                </span></p>
                                              <p class="MsoNormal"><span style="font-size:12.0pt">So the test can be re-designed as below:<br>
                                                  <br>
                                                  1. To use
                                                  CharsetEncoder.canEncode()
                                                  to check whether the
                                                  string includes
                                                  malformed characters.
                                                  <br>
                                                  2. If a string
                                                  includes malformed
                                                  characters e.g.
                                                  &quot;abc\uD800\uDB00efgh&quot;,
                                                  then set
                                                  CharsetEncoder.onMalformedInput(CodingErrorAction.REPLACE)
                                                  to replace the
                                                  malformed characters
                                                  to the replacement &quot;?&quot;
                                                  when calling
                                                  CharsetEncoder.encode()
                                                  method. <br>
                                                  3. Verified by
                                                  decoding the encoded
                                                  ByteBuffer to
                                                  CharBuffer, check
                                                  whether it includes
                                                  replacement &quot;?&quot; and
                                                  compare it with old
                                                  string, if not equal,
                                                  then test passed.<br>
                                                  4. If a sting doesn&#39;t
                                                  include malformed
                                                  characters e.g.
                                                  &quot;abc\uD800\uDC00efgh&quot;,
                                                  the
                                                  CharsetEncoder.encode()
                                                  converts it to
                                                  ByteBuffer which
                                                  doesn&#39;t include
                                                  replacement &quot;?&quot; <br>
                                                  5. Verified by
                                                  decoding the encoded
                                                  ByteBuffer to
                                                  CharBuffer, confirm
                                                  that it doesn&#39;t
                                                  include replacment &quot;?&quot;
                                                  and compare it with
                                                  old string, if equal,
                                                  then test passed.<br>
                                                </span></p>
                                              <p class="MsoNormal"><span style="font-size:12.0pt">Please let me know if you have any comments or
                                                  suggestions.</span></p>
                                              <p class="MsoNormal"><span style="font-size:12.0pt">Thanks,<br>
                                                  Eric<br>
                                                </span></p>
                                            </div>
                                          </div>
                                        </div>
                                      </blockquote>
                                    </div>
                                    <br>
                                  </div>
                                </blockquote>
                                <br>
                              </div>
                            </div>
                          </div>
                        </blockquote>
                      </div>
                      <br>
                    </div>
                  </blockquote>
                  <br>
                </div>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </div></div></div>

</blockquote></div><br></div>