RFR: 2178143: VM crashes if the number of bound CPUs changed during runtime

Yumin Qi yumin.qi at oracle.com
Tue Mar 26 09:11:40 PDT 2013


Jon,

On 3/26/2013 8:03 AM, Jon Masamitsu wrote:
> If I set AssumeMP on the command line and nothing else
> and am running on a platform where I get ParallelGCThreads > 1,
> am I going to see the warning?    If yes, that seems too intrusive.
> I can see users hearing about the flag and deciding to include
> it always just to be safe.
>
Good question, let me check that.

Thanks
Yumin

>
> On 3/25/13 11:21 PM, Yumin Qi wrote:
>> You are right, thanks!  I misread the INCLUDE.
>> new web in the same link.
>>
>> Thanks
>> Yumin
>>
>> On 3/25/2013 11:00 PM, David Holmes wrote:
>>> On 26/03/2013 3:56 PM, Yumin Qi wrote:
>>>> David,
>>>>
>>>> On 3/25/2013 10:41 PM, David Holmes wrote:
>>>>> On 26/03/2013 3:30 PM, Yumin Qi wrote:
>>>>>> David and All,
>>>>>>
>>>>>> Changed as suggested by David.
>>>>>>
>>>>>> David, I did not move the warning into set_parallel_gc_flags() since
>>>>>> except for serial gc, all other GCs will check this flag, so I 
>>>>>> moved it
>>>>>> into
>>>>>> #if INCLUDE_ALL_GCS
>>>>>>
>>>>>> before call set GC flags.
>>>>>
>>>>> I didn't realize all the other GCs would utilize ParallelGCThreads.
>>>>> Can we at least exclude it for serial GC case?
>>>>>
>>>>
>>>> Serial GC is set here:
>>>
>>> INCLUDE_ALL_GCS allows all GCs including Serial GC. Otherwise we 
>>> only have SerialGC. So you have excluded it for the case where 
>>> SerialGC is the only GC built into the VM, but not for the case 
>>> where all GCs are available and the user requests UseSerialGC.
>>>
>>> David
>>> -----
>>>
>>>
>>>> 3253 #if !INCLUDE_ALL_GCS
>>>> 3254   force_serial_gc();
>>>> 3255 #endif // INCLUDE_ALL_GCS
>>>>
>>>> So after it moved into
>>>> 3283 #if INCLUDE_ALL_GCS
>>>> 3284   if (AssumeMP && FLAG_IS_DEFAULT(ParallelGCThreads)) {
>>>> 3285     warning("If the number of processors is expected to increase
>>>> from one, then"
>>>> 3286             " you should configure the number of parallel GC
>>>> threads appropriately"
>>>> 3287             " using -XX:ParallelGCThreads=N");
>>>> 3288   }
>>>> 3289   // Set per-collector flags
>>>> 3290   if (UseParallelGC || UseParallelOldGC) {
>>>> 3291     set_parallel_gc_flags();
>>>> 3292   } else if (UseConcMarkSweepGC) { // should be done before 
>>>> ParNew
>>>> check below
>>>> 3293     set_cms_and_parnew_gc_flags();
>>>> 3294   } else if (UseParNewGC) {  // skipped if CMS is set above
>>>> 3295     set_parnew_gc_flags();
>>>> 3296   } else if (UseG1GC) {
>>>> 3297     set_g1_gc_flags();
>>>> 3298   }
>>>> 3299   check_deprecated_gcs();
>>>> 3300 #else // INCLUDE_ALL_GCS
>>>> 3301   assert(verify_serial_gc_flags(), "SerialGC unset");
>>>> 3302 #endif // INCLUDE_ALL_GCS
>>>>
>>>> It is excluded for serial GC.
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>>> Otherwise looks good.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> new webrev: http://cr.openjdk.java.net/~minqi/2178143/
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 3/25/2013 5:44 PM, David Holmes wrote:
>>>>>>> Hi Yumin,
>>>>>>>
>>>>>>> I have a few suggested changes.
>>>>>>>
>>>>>>> First in os.hpp I suggest
>>>>>>>
>>>>>>>  return _processor_count > 1 || AssumeMP ;
>>>>>>>
>>>>>>> That way we don't bypass the assertion and don't encounter 
>>>>>>> unnecessary
>>>>>>> overhead on the common path.
>>>>>>>
>>>>>>> In arguments.cpp:
>>>>>>>
>>>>>>> +   if (AssumeMP && FLAG_IS_DEFAULT(ParallelGCThreads)) {
>>>>>>> +     warning("With AssumeMP, recommend run with
>>>>>>> -XX:ParallelGCThreads=<num_of_gc_threads>, where"
>>>>>>> +             "  num_of_gc_threads can be set to number of cores");
>>>>>>> +   }
>>>>>>>
>>>>>>> I'm not convinced issuing a warning is necessary or desirable here.
>>>>>>> With a uniprocessor startup will we even default to using 
>>>>>>> parallel GC?
>>>>>>> The above should only be issued if using parallel GC - so better to
>>>>>>> move it into set_parallel_gc_flags().
>>>>>>>
>>>>>>> In any case it isn't clear what the user should supply here. If 
>>>>>>> they
>>>>>>> use the maximum expected number of cores initially then while they
>>>>>>> only have 1 core their VM may not even function adequately due 
>>>>>>> to GC
>>>>>>> overhead. I think all you can say here is something like:
>>>>>>>
>>>>>>> warning("If the number of processors is expected to increase 
>>>>>>> from one,
>>>>>>> then you should configure the number of parallel GC threads
>>>>>>> appropriately using -XX:ParallelGCThreads=N");
>>>>>>>
>>>>>>>
>>>>>>> In globals.hpp:
>>>>>>>
>>>>>>> +   product(bool, AssumeMP, false,       \
>>>>>>> +           "Assume run on multiple processors always") \
>>>>>>>
>>>>>>> Was there any discussion on making this default to true instead?
>>>>>>>
>>>>>>> Suggested re-wording:
>>>>>>>
>>>>>>>  "Instruct the VM to assume multiple processors are available"
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>> On 26/03/2013 9:15 AM, Yumin Qi wrote:
>>>>>>>> It should be "AssumeMP".
>>>>>>>>
>>>>>>>> /Yumin
>>>>>>>>
>>>>>>>> On 3/25/2013 3:32 PM, Yumin Qi wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>>   New webrev to use "-XX:+AssumMP" flag. Also add warning to
>>>>>>>>> recommend
>>>>>>>>> use this flag with "-XX:ParallelGCThreads" to remind user to 
>>>>>>>>> avoid
>>>>>>>>> running with only one GC thread. This is verified by Oleksandr 
>>>>>>>>> with
>>>>>>>>> the test case running on Linux which is not Zone configured.
>>>>>>>>>
>>>>>>>>>   Same link.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Yumin
>>>>>>>>>
>>>>>>>>> On 3/20/2013 2:27 PM, Yumin Qi wrote:
>>>>>>>>>>
>>>>>>>>>> 2178143:  VM crashes if the number of bound CPUs changed during
>>>>>>>>>> runtime.
>>>>>>>>>>
>>>>>>>>>> Situation: Customer first configure only one CPU online and turn
>>>>>>>>>> others offline to run java application, after java program 
>>>>>>>>>> started,
>>>>>>>>>> bring more CPUs back online. Since VM started on a single CPU,
>>>>>>>>>> os::is_MP() will return false, but after more CPUs available, OS
>>>>>>>>>> will
>>>>>>>>>> schedule the app run on multiple CPUs, this caused SEGV in 
>>>>>>>>>> various
>>>>>>>>>> places where data consistency was broken. The solution is 
>>>>>>>>>> supply a
>>>>>>>>>> flag to assume it is running on MP, so lock is forced to be 
>>>>>>>>>> called.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~minqi/2178143/
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Yumin
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>
>



More information about the hotspot-gc-dev mailing list