RFR (S): 8161993: 8161993: G1 crashes if active_processor_count changes during startup
david.holmes at oracle.com
Thu Jul 21 12:31:29 UTC 2016
On 21/07/2016 9:28 PM, Thomas Schatzl wrote:
> Hi all,
> can I have reviews for this small change that fixes crashes with G1
> when changing the number of available processors while running?
> The problem is that some data structures are allocated using the number
> of available processors at startup, and if you later change that number
> (i.e. increase it), g1 uses indices to access these data structures
> beyond their capacity.
> The problem is in DirtyCardQueueSet::num_par_ids() that re-queries the
> number of active processors, and then is used e.g. in G1FromCardCache
> to index an array.
To be clear num_par_ids() queries the number of configured processors
which is constant across the life of the VM. But the number of parallel
GC threads is determined by the number of active processors available to
the VM - that value may be less than the number of configured
processors. (The fact it can change over time doesn't seem to be a
factor here - only that it may be smaller.)
> The proposed solution is to introduce an
> initial_active_processor_count() in the runtime that stays fixed after
> initialization. This corresponds to the suggestion in JDK-8147910, but
> only fixing the immediate crash here.
I'm not really seeing what else JDK-8147910 would want to add other than
reporting the initial count in the error reporting, so I'd rather see
this split into two parts - and have both issues closed out. All that
would involve is committing the os.hpp and os.cpp and any change to
error reporting under 8147910, and the rest under 8161993. A single
review thread and combined push is still fine.
That said I would not hide this inside os::set_processor_count the way
you have. I can see you have done it to avoid changing all the
os_xxx.cpp files, but the name no longer really fits as it is not a
simple setter operation any more, and it is odd to pass in one value but
then internally pull the other one. Albeit more work I would have added
set_initial_active_processor_count(int initial) and
initial_active_processor_count(), then have the OS initialization
methods call both set_processor_count and
set_initial_active_processor_count. Or it may be there is somewhere in
the shared os initialization code we could add that, once, before it
needs to be used for the worker thread calculation. I'd have to look at
that in more detail tomorrow.
Also if we log the initial_active_count we should log the
> I would like to have reviewers from both the gc and runtime team for
> this change; I cc'ed both mailing lists.
> manual testing, jprt, running vm.gc right now
More information about the hotspot-gc-dev