RFR(m) 2: 8072722: add stream support to Scanner

Peter Levart peter.levart at gmail.com
Thu Sep 17 07:11:24 UTC 2015

As an alternative to additional boolean field, you could use one bit of 
expectedCount/modCount int field(s):

- let initial value of expectedCount be 1 (odd value)
- instead of (expectedCount >= 0) ==> (expectedCount != 1)
- let initial value of modCount be 0 (even value)
- instead of modCount++ ==> modCount += 2;

Regards, Peter

On 09/17/2015 01:08 AM, Stuart Marks wrote:
> On 9/16/15 8:43 AM, Xueming Shen wrote:
>> I'm talking about the check "immediately" prior to the call to 
>> accept(). It
>> will not function after the modCount tips over to the negative int 
>> value,
>> because the "expectedCount >=0" check.
>> Consider the use scenario that the Scanner is on top of an endless input
>> stream, you have a token stream on top of it. The check before the
>> "accept(token" will not be performed until the expectedCount/modCount 
>> tips
>> back to positive value again from the negative, then off, then on... 
>> During
>> the off period (it will take a while from negative back to positive), 
>> the
>> stream will just work fine to feed the accept() the "next" token even if
>> there is another thread keeps "stealing" tokens from the same 
>> scanner, if the
>> timing is right.  Looks like not really a "fail-fast" in this scenario.
> Right, after modCount wraps around to negative, the CME checking 
> becomes dysfunctional. It doesn't do any harm, but it ceases to 
> perform proper checking. This is kind of a corner case but ... well I 
> admit I did have to do a fair bit of puzzling to figure out what the 
> behavior would be and to prove to myself that it was benign.
>> This can be "easily" addressed, if you have a separate boolean field 
>> such as
>> "initialized". The code can look like below in tryAdvance(...)
> [edited per your subsequent message]
>>      if (!initialized) {
>>          expectedCount = modCount;
>>          initialized = true;
>>      }
>>      if (expectedCount != modCount) {
>>          throw new CME();
>>      }
>>      ...
>> Well, if you think this is an unlikely use scenario and the intention 
>> of the
>> check/guard here is mainly to prevent the wrong doing within the pipe
>> operation, then it might not worth the extra field, and I'm fine with 
>> the
>> latest webrev.
> The cost of the additional field is negligible. I haven't written out 
> the code but I suspect that an explicit "initialized" field will be 
> easier to reason about, certainly easier to understand than the 
> behavior that occurs if modCount wraps around to negative.
> Note that this also applies to Matcher, although it's less likely 
> since Matcher's input is a CharSequence instead of an indefinite-sized 
> source such as a file or an input stream. In talking to Paul Sandoz 
> about this (author of the streams stuff for Matcher) he felt it was 
> important to keep the behaviors of Matcher and Scanner consistent.
> But this has dragged out somewhat and I don't really want to add 
> Matcher changes to this changeset. How about I do the following:
> 1) I'll push the latest webrev as it stands.
> 2) I'll file a separate bug to clean up Scanner's and Matcher's 
> spliterators' modCount checking to avoid the overflow issue.
> 3) I'll fix all the spliterators at the same time.
> How does that sound?
> s'marks

More information about the core-libs-dev mailing list