RFR: 2178143: VM crashes if the number of bound CPUs changed during runtime

David Holmes david.holmes at oracle.com
Wed Mar 20 18:02:50 PDT 2013


On 21/03/2013 7:27 AM, Yumin Qi wrote:
> Hi, can I have your code review of a small change?

Not really small conceptually. :)

I don't think this form of the fix addresses the underlying issue as 
discussed in the bug report. If the variable was renamed 
MinimumNumberOfProcessors, or MinimumAssumedProcessors, then simply 
using it to turn on is_MP would be okay. Such a flag would suit the 
initial problem perfectly. Of course any value >= 2 would be semantically 
indistinct, so this really acts as a boolean flag - AssumeMP.
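
To illustrate why a minimum-processors flag collapses to a boolean, here is 
a minimal C++ sketch. It is not HotSpot code: MPConfig and both functions 
are made-up names, and the only assumption is that is_MP boils down to 
"more than one processor".

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; none of these are HotSpot declarations.
struct MPConfig {
    int  os_processor_count;  // processor count the OS reported at startup
    bool assume_mp;           // illustrative boolean flag (AssumeMP)
    int  min_assumed_procs;   // illustrative integer flag (MinimumAssumedProcessors)
};

// Boolean form: treat the machine as MP if the flag is set or the OS says so.
inline bool is_mp_with_bool_flag(const MPConfig& c) {
    assert(c.os_processor_count >= 1);
    return c.assume_mp || c.os_processor_count > 1;
}

// Integer form: the only thing that matters is whether the value is at least 2,
// so 2, 8 and 64 all produce the same answer.
inline bool is_mp_with_min_flag(const MPConfig& c) {
    assert(c.os_processor_count >= 1);
    return c.min_assumed_procs >= 2 || c.os_processor_count > 1;
}
```

With os_processor_count == 1, min_assumed_procs values of 2 and 8 give the 
same result as assume_mp == true, which is the sense in which the integer 
flag is really a boolean.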

The more general NumberOfProcessors approach, which I'm still unsure of, 
should to me control what is reported for available-processors. That way 
it would affect everything in the VM, libraries and application code 
that configures itself based on the number of available processors. The 
main use case for that, in my opinion, would be an app running on a large 
system that you want to constrain to using a subset of the physical 
CPUs (without having to configure processor sets). That is a different 
kind of problem and a different kind of flag. Using NumberOfProcessors 
but not having it control anything except is_MP just seems wrong - and 
using it to replace available_processors is not a complex change.
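
As a rough sketch of what I mean by having the flag control what is 
reported, assuming a hypothetical integer flag where 0 means "not set" 
(the function names below are invented for illustration and are not the 
VM's actual API):

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical: the single source of truth for "how many processors are there".
// os_count is whatever the OS reports; flag_value is the illustrative
// NumberOfProcessors setting, with 0 meaning the flag was not given.
inline int reported_processors(int os_count, int flag_value) {
    assert(flag_value >= 0);
    int base = std::max(os_count, 1);           // never report fewer than one
    return flag_value != 0 ? flag_value : base; // the flag replaces the OS value
}

// Everything else derives from the reported value rather than asking the OS.
inline bool is_mp(int reported) {
    return reported > 1;
}

// Illustrative sizing rule only: one worker per reported processor.
inline int worker_threads(int reported) {
    return reported;
}
```

The point of routing everything through one function is that is_MP, thread 
pool sizing, and anything libraries or applications query all see the same 
number, instead of the flag affecting is_MP alone.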

The VM is not designed for dynamic adaptation of threads/pools, so if the 
number of processors does change dynamically, neither of the above 
options is going to provide a solution to the potential performance 
problems that will be encountered (too many or too few threads). Any 
app that starts on a single core (as reported by the OS) is going to be 
under-provisioned.

David
-----

> 2178143:  VM crashes if the number of bound CPUs changed during runtime.
>
> Situation: The customer first configures only one CPU online and takes the
> others offline to run a Java application, then brings more CPUs back online
> after the Java program has started. Since the VM started on a single CPU,
> os::is_MP() will return false, but once more CPUs are available, the OS will
> schedule the app to run on multiple CPUs. This caused SEGVs in various places
> where data consistency was broken. The solution is to supply a flag that
> assumes the VM is running on MP, so the lock is forced to be used.
>
> http://cr.openjdk.java.net/~minqi/2178143/
>
> Thanks
> Yumin
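
The failure mode in the quoted report can be sketched roughly as follows. 
This is an illustration only, not the VM's code: Counter and its methods 
are invented names, and the "uniprocessor" path is modelled as a separate 
load and store, which is an assumption about the general shape of the 
problem rather than a description of the actual crash sites.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical counter whose update path depends on a startup-time MP decision.
class Counter {
public:
    void increment(bool is_mp) {
        if (is_mp) {
            // MP path: one indivisible read-modify-write.
            value_.fetch_add(1, std::memory_order_seq_cst);
        } else {
            // "Uniprocessor" path: separate load and store. This is fine while
            // only one CPU runs the code, but if a second CPU appears later,
            // an update made between the load and the store can be lost.
            long v = value_.load(std::memory_order_relaxed);
            value_.store(v + 1, std::memory_order_relaxed);
        }
    }

    long get() const { return value_.load(std::memory_order_seq_cst); }

private:
    std::atomic<long> value_{0};
};

// Single-threaded usage: both paths agree as long as one CPU runs the code.
inline long demo_single_threaded() {
    Counter c;
    for (int i = 0; i < 3; ++i) c.increment(false);
    for (int i = 0; i < 3; ++i) c.increment(true);
    assert(c.get() == 6);
    return c.get();
}
```

If is_mp is decided once at startup and the CPU count grows afterwards, the 
cheap path keeps being used under conditions it was never safe for, which 
is why forcing the MP path with a flag avoids the inconsistency.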

