RFR (JAXP): 8035577: Xerces Update: impl/xpath/regex/RangeToken.java

Lance Andersen - Oracle Lance.Andersen at oracle.com
Thu Mar 20 15:40:10 UTC 2014

I think this is OK.

The comments with the o--o diagrams did not do much for me, though. I found them a bit confusing, but perhaps I need more coffee this morning?

Also, I am not sure we need the @author tag, but I think its usage varies in the workspace.

On Mar 19, 2014, at 7:10 PM, David Li wrote:

> Hi,
> This is an update from Xerces for file impl/xpath/regex/RangeToken.java. For details, please refer to: https://bugs.openjdk.java.net/browse/JDK-8035577.
> Webrevs: http://cr.openjdk.java.net/~joehw/jdk9/8035577/webrev/
> Existing tests: JAXP SQE and unit tests passed.
> Test cases added for the typo fix in RangeToken.intersectRanges. The code was also updated to fix a bug where regular expression intersection returns an incorrect value when the first range ends later than the second range (example below). Test cases have been added to cover any scenarios that the code changes affect.
> new RegularExpression("(?[b-d]&[a-r])"); -> returns [b-d] (Correct)
> new RegularExpression("(?[a-r]&[b-d])"); -> returns [b-de-r] (Incorrect)
> Thanks,
> David
> P.S. Notes on bug fixes.
> 1) Line 404 removal of while loop.
> This fixes a new bug where incorrect results are given when the first range ends later than the second range. In the old code we got
> (?[a-r]&[b-d]) -> returns [b-de-r]
> By removing the while loop, we get [b-d].
> This while loop looks like a copy-paste error from subtractRanges. In subtractRanges we need to keep the leftover portion from the first range, but this does not apply to intersection.
> 2) Line 388, addition of src2 += 2;
> This code change affects anything of the form (?[a-r]&[b-eg-j]).  The code execution is diagrammed below.
> o------------o  (src1)
>  o--o o--o     (src2)
> For the first match we get
> o------------o  (src1)
>  o--o          (src2)
> Next we want to run src2+=2 to get the second pair of endpoints (since the first two endpoints are already used).  Notice how src1begin has been updated to this.ranges[src1] = src2end+1, which is directly from the code.
>      o------o  (src1)
>       o--o     (src2)
> The src2+=2 statement was left out of the old code, and is added in this webrev. If we leave out the src2+=2 at line 388, on the next iteration of the large while loop we will reach the case "} else if (src2end < src1begin) {", which also executes "src2+=2". This means the correct final result is still generated, but one iteration later. We want to add the new code because it is better to have all associated variables updated in the same loop iteration. In addition, all the other conditions have similar src1 or src2 updates.
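
The two fixes described above can be illustrated with a minimal, self-contained sketch. This is not the Xerces code: the class `RangeIntersectSketch` and the methods `intersect` and `show` are hypothetical names, and the sketch assumes each range set is a flat int array of inclusive [begin, end] pairs, sorted and non-overlapping, as the walkthrough suggests. It keeps only the overlap (no leftover portion of the first range is emitted) and advances src2 in the same iteration in which src1's begin is trimmed.

```java
import java.util.Arrays;

final class RangeIntersectSketch {

    // Intersects two sorted, non-overlapping lists of inclusive ranges,
    // each stored as a flat array {begin0, end0, begin1, end1, ...}.
    static int[] intersect(int[] ranges1, int[] ranges2) {
        int[] r1 = ranges1.clone();          // copied because begins are trimmed below
        int[] result = new int[r1.length + ranges2.length];
        int wp = 0, src1 = 0, src2 = 0;
        while (src1 < r1.length && src2 < ranges2.length) {
            int src1begin = r1[src1], src1end = r1[src1 + 1];
            int src2begin = ranges2[src2], src2end = ranges2[src2 + 1];
            if (src1end < src2begin) {
                src1 += 2;                   // src1 range lies entirely before src2 range
            } else if (src2end < src1begin) {
                src2 += 2;                   // src2 range lies entirely before src1 range
            } else {
                // Overlap: emit only the common part, never the leftover of src1.
                result[wp++] = Math.max(src1begin, src2begin);
                result[wp++] = Math.min(src1end, src2end);
                if (src1end <= src2end) {
                    src1 += 2;               // src1 range is used up
                } else {
                    r1[src1] = src2end + 1;  // trim the consumed part of src1
                    src2 += 2;               // and move to the next src2 pair right away
                }
            }
        }
        return Arrays.copyOf(result, wp);
    }

    // Renders a range list in character-class style, e.g. [b-eg-j].
    static String show(int[] ranges) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < ranges.length; i += 2) {
            sb.append((char) ranges[i]).append('-').append((char) ranges[i + 1]);
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        System.out.println(show(intersect(new int[] {'a', 'r'}, new int[] {'b', 'd'})));
        System.out.println(show(intersect(new int[] {'a', 'r'}, new int[] {'b', 'e', 'g', 'j'})));
    }
}
```

With these inputs the sketch gives [b-d] for both orderings of [a-r] and [b-d], and [b-eg-j] for the (?[a-r]&[b-eg-j]) shape diagrammed above.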

Lance Andersen| Principal Member of Technical Staff | +1.781.442.2037
Oracle Java Engineering 
1 Network Drive 
Burlington, MA 01803
Lance.Andersen at oracle.com

More information about the core-libs-dev mailing list