RFR: 2178143: VM crashes if the number of bound CPUs changed during runtime
david.holmes at oracle.com
Wed Mar 20 18:02:50 PDT 2013
On 21/03/2013 7:27 AM, Yumin Qi wrote:
> Hi, can I have your code review of a small change?
Not really small conceptually. :)
I don't think this form of the fix addresses the underlying issue as
discussed in the bug report. If the variable was renamed
MinimumNumberOfProcessors, or MinimumAssumedProcessors, then simply
using it to turn on is_MP would be okay. Such a flag would suit the
initial problem perfectly. Of course any value >2 would be semantically
indistinct, so this really acts as a boolean flag - AssumeMP.
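
To make the boolean reading concrete, here is a minimal sketch, not HotSpot source. The function name and parameters are hypothetical; it only models how an AssumeMP-style flag would combine with the processor count the OS reported at startup.

```cpp
// Illustrative model only; names are hypothetical, not HotSpot code.
// os_processor_count: CPUs the OS reported when the VM started.
// assume_mp:          models a boolean AssumeMP-style flag.
inline bool is_mp_with_assume_flag(int os_processor_count, bool assume_mp) {
    // Treat the machine as multiprocessor if the flag says so,
    // or if the OS reported more than one CPU.
    return assume_mp || os_processor_count > 1;
}
```

With the flag off, a VM started on one CPU reports non-MP. With the flag on, the reported count no longer matters, which is why a numeric "minimum" collapses to a boolean.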
The more general NumberOfProcessors approach, which I'm still unsure of,
should to me control what is reported for available-processors. That way
it would affect everything in the VM, libraries and application code
that configures itself based on the number of available processors. The
main use case for that, in my opinion, would be an app running on a large
system that you want to constrain to a subset of the physical CPUs
(without having to configure processor sets). That is a different
kind of problem and a different kind of flag. Using NumberOfProcessors
but not having it control anything except is_MP just seems wrong - and
using it to replace available_processors is not a complex change.
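
A rough sketch of what I mean, under stated assumptions: the function names are hypothetical, a flag value of 0 stands for "not set", and the worker-thread sizing rule is invented purely to show a consumer of the value. None of this is the actual VM code.

```cpp
#include <algorithm>

// Hypothetical: number_of_processors_flag == 0 means the flag was not set.
inline int effective_processors(int os_reported, int number_of_processors_flag) {
    if (number_of_processors_flag > 0) {
        return number_of_processors_flag;   // the override wins everywhere
    }
    return std::max(1, os_reported);        // otherwise trust the OS
}

// is_MP is then derived from the same value instead of being special-cased.
inline bool is_mp_from_count(int os_reported, int number_of_processors_flag) {
    return effective_processors(os_reported, number_of_processors_flag) > 1;
}

// Invented sizing rule, standing in for any VM, library, or application code
// that configures itself from the available-processor count.
inline int worker_threads(int os_reported, int number_of_processors_flag) {
    return effective_processors(os_reported, number_of_processors_flag);
}
```

The point of the design is that everything reads one value, so the flag cannot change is_MP while leaving thread and pool sizing based on a different count.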
The VM is not designed for dynamic adaptation of threads/pools, so if the
number of processors does change dynamically, neither of the above
options is going to provide a solution to the potential performance
problems that will be encountered (too many or too few threads). Any
app that starts on a single core (as reported by the OS) is going to be
> 2178143: VM crashes if the number of bound CPUs changed during runtime.
> Situation: The customer first configures only one CPU online and takes the
> others offline to run a Java application; after the Java program has
> started, more CPUs are brought back online. Since the VM started on a
> single CPU, os::is_MP() returns false, but once more CPUs are available,
> the OS will schedule the app on multiple CPUs. This caused SEGVs in
> various places where data consistency was broken. The solution is to
> supply a flag that makes the VM assume it is running on MP, so the lock
> is forced to be used.
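
The failure mode described in the quoted report can be modeled with a small sketch. This is an illustration under assumptions, not the VM's code: the function name is hypothetical, and std::atomic stands in for whatever synchronization the VM skips when it believes it is on a uniprocessor.

```cpp
#include <atomic>

// Hypothetical model of an update whose synchronization depends on is_MP.
inline void increment(std::atomic<int>& counter, bool is_mp) {
    if (is_mp) {
        // Multiprocessor path: a single atomic read-modify-write.
        counter.fetch_add(1);
    } else {
        // Uniprocessor shortcut: a separate load and store. Each access is
        // atomic, but the pair is not, so two threads running truly in
        // parallel can both read the same value and lose an update.
        int v = counter.load(std::memory_order_relaxed);
        counter.store(v + 1, std::memory_order_relaxed);
    }
}
```

Both paths give the same result when only one thread runs at a time. If is_mp was computed as false at startup and more CPUs come online later, the shortcut path is still taken while threads now run in parallel, which is the broken consistency described in the report.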