RFR: 2178143: VM crashes if the number of bound CPUs changed during runtime

David Holmes david.holmes at oracle.com
Tue Mar 26 21:54:18 PDT 2013


On 27/03/2013 2:06 PM, Yumin Qi wrote:
> David,
>
> On 3/26/2013 7:49 PM, David Holmes wrote:
>> Well now you have confused me. I thought checking
>> FLAG_IS_DEFAULT(ParallelGCThreads) was the right thing to do (the user
>> has not attempted manage the number of GC threads even though they
>> recognize the number of processors may start at 1 and increase later).
>> I do not understand the new <2 test - maybe I really only want 1, if I
>> asked for that why should I get a warning ???
>>
> The previous code is issuing the warning before gc set flags (includes
> ParallelGCThreads), we don't want with AssumeMP turned on, running on a
> MP issues the warning since ParallelGCThreads will be not 1 but the
> number of calculation based on ncpus.  So I moved the check code after
> GC set flags.  After GC set flags, ParallelGCThreads is not the default
> value (0) so I check if it is < 2.

I was overlooking the case where you use +AssumeMP but have multiple 
processors anyway. But if I explicitly set ParallelGCThreads=1 then I 
don't expect to get a warning. Not sure we can distinguish these cases 
though.

Again I leave this to you and Jon to finalize.

Thanks,
David

> If users really want to run with single CPU with -XX:+AssumeMP, giving a
> warning is reasonable (this is our intention here) --- there is
> possibility they will increase number of cpus even they don't, that is OK.
>
> Thanks
> Yumin
>> But, this is a GC warning (and I think it is misplaced anyway) so if
>> Jon is happy with this behaviour so be it.
>>
>> Cheers,
>> David
>>
>> On 27/03/2013 5:44 AM, Yumin Qi wrote:
>>> Sorry, copied wrong link. It is at same link.
>>>
>>> http://cr.openjdk.java.net/~minqi/2178143/
>>>
>>> Thanks
>>> Yumin
>>>
>>> On 3/26/2013 12:42 PM, Daniel D. Daugherty wrote:
>>>>
>>>> On 3/26/13 1:17 PM, Yumin Qi wrote:
>>>>> Hi, Jon
>>>>>
>>>>>   I made a change to the warning condition and move it after GC flags
>>>>> are set. I think at this time, ParallelGCThreads has a value != 0.
>>>>> Please take a look if it is right.  The This is the new webrev:
>>>>>
>>>>>   http://javaweb.us.oracle.com/~yqi/webrev/2178143/
>>>>
>>>> This webrev isn't accessible outside Oracle...
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>>   Tested with platform of ncpus > 1 and no ParallelGCThreads set with
>>>>> turning on AssumeMP.  The warning is gone.
>>>>>    Also tested with ncpus = 1 and the warning is on too.
>>>>>
>>>>> Thanks
>>>>> Yumin
>>>>>
>>>>> On 3/26/2013 8:03 AM, Jon Masamitsu wrote:
>>>>>> If I set AssumeMP on the command line and nothing else
>>>>>> and am running on a platform where I get ParallelGCThreads > 1,
>>>>>> am I going to see the warning?    If yes, that seems too intrusive.
>>>>>> I can see users hearing about the flag and deciding to include
>>>>>> it always just to be safe.
>>>>>>
>>>>>> Jon
>>>>>>
>>>>>> On 3/25/13 11:21 PM, Yumin Qi wrote:
>>>>>>> You are right, thanks!  I misread the INCLUDE.
>>>>>>> new web in the same link.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Yumin
>>>>>>>
>>>>>>> On 3/25/2013 11:00 PM, David Holmes wrote:
>>>>>>>> On 26/03/2013 3:56 PM, Yumin Qi wrote:
>>>>>>>>> David,
>>>>>>>>>
>>>>>>>>> On 3/25/2013 10:41 PM, David Holmes wrote:
>>>>>>>>>> On 26/03/2013 3:30 PM, Yumin Qi wrote:
>>>>>>>>>>> David and All,
>>>>>>>>>>>
>>>>>>>>>>> Changed as suggested by David.
>>>>>>>>>>>
>>>>>>>>>>> David, I did not move the warning into set_parallel_gc_flags()
>>>>>>>>>>> since
>>>>>>>>>>> except for serial gc, all other GCs will check this flag, so I
>>>>>>>>>>> moved it
>>>>>>>>>>> into
>>>>>>>>>>> #if INCLUDE_ALL_GCS
>>>>>>>>>>>
>>>>>>>>>>> before call set GC flags.
>>>>>>>>>>
>>>>>>>>>> I didn't realize all the other GCs would utilize
>>>>>>>>>> ParallelGCThreads.
>>>>>>>>>> Can we at least exclude it for serial GC case?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Serial GC is set here:
>>>>>>>>
>>>>>>>> INCLUDE_ALL_GCS allows all GCs including Serial GC. Otherwise we
>>>>>>>> only have SerialGC. So you have excluded it for the case where
>>>>>>>> SerialGC is the only GC built into the VM, but not for the case
>>>>>>>> where all GCs are available and the user requests UseSerialGC.
>>>>>>>>
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>
>>>>>>>>> 3253 #if !INCLUDE_ALL_GCS
>>>>>>>>> 3254   force_serial_gc();
>>>>>>>>> 3255 #endif // INCLUDE_ALL_GCS
>>>>>>>>>
>>>>>>>>> So after it moved into
>>>>>>>>> 3283 #if INCLUDE_ALL_GCS
>>>>>>>>> 3284   if (AssumeMP && FLAG_IS_DEFAULT(ParallelGCThreads)) {
>>>>>>>>> 3285     warning("If the number of processors is expected to
>>>>>>>>> increase
>>>>>>>>> from one, then"
>>>>>>>>> 3286             " you should configure the number of parallel GC
>>>>>>>>> threads appropriately"
>>>>>>>>> 3287             " using -XX:ParallelGCThreads=N");
>>>>>>>>> 3288   }
>>>>>>>>> 3289   // Set per-collector flags
>>>>>>>>> 3290   if (UseParallelGC || UseParallelOldGC) {
>>>>>>>>> 3291     set_parallel_gc_flags();
>>>>>>>>> 3292   } else if (UseConcMarkSweepGC) { // should be done before
>>>>>>>>> ParNew
>>>>>>>>> check below
>>>>>>>>> 3293     set_cms_and_parnew_gc_flags();
>>>>>>>>> 3294   } else if (UseParNewGC) {  // skipped if CMS is set above
>>>>>>>>> 3295     set_parnew_gc_flags();
>>>>>>>>> 3296   } else if (UseG1GC) {
>>>>>>>>> 3297     set_g1_gc_flags();
>>>>>>>>> 3298   }
>>>>>>>>> 3299   check_deprecated_gcs();
>>>>>>>>> 3300 #else // INCLUDE_ALL_GCS
>>>>>>>>> 3301   assert(verify_serial_gc_flags(), "SerialGC unset");
>>>>>>>>> 3302 #endif // INCLUDE_ALL_GCS
>>>>>>>>>
>>>>>>>>> It is excluded for serial GC.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Yumin
>>>>>>>>>
>>>>>>>>>> Otherwise looks good.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> new webrev: http://cr.openjdk.java.net/~minqi/2178143/
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Yumin
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 3/25/2013 5:44 PM, David Holmes wrote:
>>>>>>>>>>>> Hi Yumin,
>>>>>>>>>>>>
>>>>>>>>>>>> I have a few suggested changes.
>>>>>>>>>>>>
>>>>>>>>>>>> First in os.hpp I suggest
>>>>>>>>>>>>
>>>>>>>>>>>>  return _processor_count > 1 || AssumeMP ;
>>>>>>>>>>>>
>>>>>>>>>>>> That way we don't bypass the assertion and don't encounter
>>>>>>>>>>>> unnecessary
>>>>>>>>>>>> overhead on the common path.
>>>>>>>>>>>>
>>>>>>>>>>>> In arguments.cpp:
>>>>>>>>>>>>
>>>>>>>>>>>> +   if (AssumeMP && FLAG_IS_DEFAULT(ParallelGCThreads)) {
>>>>>>>>>>>> +     warning("With AssumeMP, recommend run with
>>>>>>>>>>>> -XX:ParallelGCThreads=<num_of_gc_threads>, where"
>>>>>>>>>>>> +             "  num_of_gc_threads can be set to number of
>>>>>>>>>>>> cores");
>>>>>>>>>>>> +   }
>>>>>>>>>>>>
>>>>>>>>>>>> I'm not convinced issuing a warning is necessary or desirable
>>>>>>>>>>>> here.
>>>>>>>>>>>> With a uniprocessor startup will we even default to using
>>>>>>>>>>>> parallel GC?
>>>>>>>>>>>> The above should only be issued if using parallel GC - so
>>>>>>>>>>>> better to
>>>>>>>>>>>> move it into set_parallel_gc_flags().
>>>>>>>>>>>>
>>>>>>>>>>>> In any case it isn't clear what the user should supply here.
>>>>>>>>>>>> If they
>>>>>>>>>>>> use the maximum expected number of cores initially then while
>>>>>>>>>>>> they
>>>>>>>>>>>> only have 1 core their VM may not even function adequately due
>>>>>>>>>>>> to GC
>>>>>>>>>>>> overhead. I think all you can say here is something like:
>>>>>>>>>>>>
>>>>>>>>>>>> warning("If the number of processors is expected to increase
>>>>>>>>>>>> from one,
>>>>>>>>>>>> then you should configure the number of parallel GC threads
>>>>>>>>>>>> appropriately using -XX:ParallelGCThreads=N");
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> In globals.hpp:
>>>>>>>>>>>>
>>>>>>>>>>>> +   product(bool, AssumeMP, false,       \
>>>>>>>>>>>> +           "Assume run on multiple processors always") \
>>>>>>>>>>>>
>>>>>>>>>>>> Was there any discussion on making this default to true
>>>>>>>>>>>> instead?
>>>>>>>>>>>>
>>>>>>>>>>>> Suggested re-wording:
>>>>>>>>>>>>
>>>>>>>>>>>>  "Instruct the VM to assume multiple processors are available"
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>> On 26/03/2013 9:15 AM, Yumin Qi wrote:
>>>>>>>>>>>>> It should be "AssumeMP".
>>>>>>>>>>>>>
>>>>>>>>>>>>> /Yumin
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/25/2013 3:32 PM, Yumin Qi wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   New webrev to use "-XX:+AssumMP" flag. Also add warning to
>>>>>>>>>>>>>> recommend
>>>>>>>>>>>>>> use this flag with "-XX:ParallelGCThreads" to remind user to
>>>>>>>>>>>>>> avoid
>>>>>>>>>>>>>> running with only one GC thread. This is verified by
>>>>>>>>>>>>>> Oleksandr with
>>>>>>>>>>>>>> the test case running on Linux which is not Zone configured.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   Same link.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>> Yumin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 3/20/2013 2:27 PM, Yumin Qi wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2178143:  VM crashes if the number of bound CPUs changed
>>>>>>>>>>>>>>> during
>>>>>>>>>>>>>>> runtime.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Situation: Customer first configure only one CPU online and
>>>>>>>>>>>>>>> turn
>>>>>>>>>>>>>>> others offline to run java application, after java program
>>>>>>>>>>>>>>> started,
>>>>>>>>>>>>>>> bring more CPUs back online. Since VM started on a single
>>>>>>>>>>>>>>> CPU,
>>>>>>>>>>>>>>> os::is_MP() will return false, but after more CPUs
>>>>>>>>>>>>>>> available, OS
>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>> schedule the app run on multiple CPUs, this caused SEGV in
>>>>>>>>>>>>>>> various
>>>>>>>>>>>>>>> places where data consistency was broken. The solution is
>>>>>>>>>>>>>>> supply a
>>>>>>>>>>>>>>> flag to assume it is running on MP, so lock is forced to be
>>>>>>>>>>>>>>> called.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~minqi/2178143/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>> Yumin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>


More information about the hotspot-gc-dev mailing list