RFR: 2178143: VM crashes if the number of bound CPUs changed during runtime

Karen Kinnear karen.kinnear at oracle.com
Tue Mar 26 07:57:29 PDT 2013


Yumin,

Looks good. Thank you for simplifying the approach.

thanks,
Karen

On Mar 26, 2013, at 8:50 AM, Daniel D. Daugherty wrote:

> > new webrev: http://cr.openjdk.java.net/~minqi/2178143/
> 
> Thumbs up.
> 
> src/share/vm/runtime/arguments.cpp
>    No comments.
> 
> src/share/vm/runtime/globals.hpp
>    If I followed the discussion over the default setting correctly,
>    the default is false because we don't want to change the behavior
>    for the single CPU case. I'm guessing that embedded is the primary
>    use case for single CPU configs. We can (and should) revisit this
>    in the future.
> 
> src/share/vm/runtime/os.hpp
>    No comments.
> 
> Dan
> 
> 
> On 3/26/13 12:21 AM, Yumin Qi wrote:
>> You are right, thanks!  I misread the INCLUDE.
>> new web in the same link.
>> 
>> Thanks
>> Yumin
>> 
>> On 3/25/2013 11:00 PM, David Holmes wrote:
>>> On 26/03/2013 3:56 PM, Yumin Qi wrote:
>>>> David,
>>>> 
>>>> On 3/25/2013 10:41 PM, David Holmes wrote:
>>>>> On 26/03/2013 3:30 PM, Yumin Qi wrote:
>>>>>> David and All,
>>>>>> 
>>>>>> Changed as suggested by David.
>>>>>> 
>>>>>> David, I did not move the warning into set_parallel_gc_flags() since
>>>>>> except for serial gc, all other GCs will check this flag, so I moved it
>>>>>> into
>>>>>> #if INCLUDE_ALL_GCS
>>>>>> 
>>>>>> before call set GC flags.
>>>>> 
>>>>> I didn't realize all the other GCs would utilize ParallelGCThreads.
>>>>> Can we at least exclude it for serial GC case?
>>>>> 
>>>> 
>>>> Serial GC is set here:
>>> 
>>> INCLUDE_ALL_GCS allows all GCs including Serial GC. Otherwise we only have SerialGC. So you have excluded it for the case where SerialGC is the only GC built into the VM, but not for the case where all GCs are available and the user requests UseSerialGC.
>>> 
>>> David
>>> -----
>>> 
>>> 
>>>> 3253 #if !INCLUDE_ALL_GCS
>>>> 3254   force_serial_gc();
>>>> 3255 #endif // INCLUDE_ALL_GCS
>>>> 
>>>> So after it moved into
>>>> 3283 #if INCLUDE_ALL_GCS
>>>> 3284   if (AssumeMP && FLAG_IS_DEFAULT(ParallelGCThreads)) {
>>>> 3285     warning("If the number of processors is expected to increase
>>>> from one, then"
>>>> 3286             " you should configure the number of parallel GC
>>>> threads appropriately"
>>>> 3287             " using -XX:ParallelGCThreads=N");
>>>> 3288   }
>>>> 3289   // Set per-collector flags
>>>> 3290   if (UseParallelGC || UseParallelOldGC) {
>>>> 3291     set_parallel_gc_flags();
>>>> 3292   } else if (UseConcMarkSweepGC) { // should be done before ParNew
>>>> check below
>>>> 3293     set_cms_and_parnew_gc_flags();
>>>> 3294   } else if (UseParNewGC) {  // skipped if CMS is set above
>>>> 3295     set_parnew_gc_flags();
>>>> 3296   } else if (UseG1GC) {
>>>> 3297     set_g1_gc_flags();
>>>> 3298   }
>>>> 3299   check_deprecated_gcs();
>>>> 3300 #else // INCLUDE_ALL_GCS
>>>> 3301   assert(verify_serial_gc_flags(), "SerialGC unset");
>>>> 3302 #endif // INCLUDE_ALL_GCS
>>>> 
>>>> It is excluded for serial GC.
>>>> 
>>>> Thanks
>>>> Yumin
>>>> 
>>>>> Otherwise looks good.
>>>>> 
>>>>> Thanks,
>>>>> David
>>>>> 
>>>>>> new webrev: http://cr.openjdk.java.net/~minqi/2178143/
>>>>>> 
>>>>>> Thanks
>>>>>> Yumin
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On 3/25/2013 5:44 PM, David Holmes wrote:
>>>>>>> Hi Yumin,
>>>>>>> 
>>>>>>> I have a few suggested changes.
>>>>>>> 
>>>>>>> First in os.hpp I suggest
>>>>>>> 
>>>>>>> return _processor_count > 1 || AssumeMP ;
>>>>>>> 
>>>>>>> That way we don't bypass the assertion and don't encounter unnecessary
>>>>>>> overhead on the common path.
>>>>>>> 
>>>>>>> In arguments.cpp:
>>>>>>> 
>>>>>>> +   if (AssumeMP && FLAG_IS_DEFAULT(ParallelGCThreads)) {
>>>>>>> +     warning("With AssumeMP, recommend run with
>>>>>>> -XX:ParallelGCThreads=<num_of_gc_threads>, where"
>>>>>>> +             "  num_of_gc_threads can be set to number of cores");
>>>>>>> +   }
>>>>>>> 
>>>>>>> I'm not convinced issuing a warning is necessary or desirable here.
>>>>>>> With a uniprocessor startup will we even default to using parallel GC?
>>>>>>> The above should only be issued if using parallel GC - so better to
>>>>>>> move it into set_parallel_gc_flags().
>>>>>>> 
>>>>>>> In any case it isn't clear what the user should supply here. If they
>>>>>>> use the maximum expected number of cores initially then while they
>>>>>>> only have 1 core their VM may not even function adequately due to GC
>>>>>>> overhead. I think all you can say here is something like:
>>>>>>> 
>>>>>>> warning("If the number of processors is expected to increase from one,
>>>>>>> then you should configure the number of parallel GC threads
>>>>>>> appropriately using -XX:ParallelGCThreads=N");
>>>>>>> 
>>>>>>> 
>>>>>>> In globals.hpp:
>>>>>>> 
>>>>>>> +   product(bool, AssumeMP, false,       \
>>>>>>> +           "Assume run on multiple processors always") \
>>>>>>> 
>>>>>>> Was there any discussion on making this default to true instead?
>>>>>>> 
>>>>>>> Suggested re-wording:
>>>>>>> 
>>>>>>> "Instruct the VM to assume multiple processors are available"
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> David
>>>>>>> 
>>>>>>> On 26/03/2013 9:15 AM, Yumin Qi wrote:
>>>>>>>> It should be "AssumeMP".
>>>>>>>> 
>>>>>>>> /Yumin
>>>>>>>> 
>>>>>>>> On 3/25/2013 3:32 PM, Yumin Qi wrote:
>>>>>>>>> Hi,
>>>>>>>>> 
>>>>>>>>>  New webrev to use "-XX:+AssumMP" flag. Also add warning to
>>>>>>>>> recommend
>>>>>>>>> use this flag with "-XX:ParallelGCThreads" to remind user to avoid
>>>>>>>>> running with only one GC thread. This is verified by Oleksandr with
>>>>>>>>> the test case running on Linux which is not Zone configured.
>>>>>>>>> 
>>>>>>>>>  Same link.
>>>>>>>>> 
>>>>>>>>> Thanks
>>>>>>>>> Yumin
>>>>>>>>> 
>>>>>>>>> On 3/20/2013 2:27 PM, Yumin Qi wrote:
>>>>>>>>>> 
>>>>>>>>>> 2178143:  VM crashes if the number of bound CPUs changed during
>>>>>>>>>> runtime.
>>>>>>>>>> 
>>>>>>>>>> Situation: Customer first configure only one CPU online and turn
>>>>>>>>>> others offline to run java application, after java program started,
>>>>>>>>>> bring more CPUs back online. Since VM started on a single CPU,
>>>>>>>>>> os::is_MP() will return false, but after more CPUs available, OS
>>>>>>>>>> will
>>>>>>>>>> schedule the app run on multiple CPUs, this caused SEGV in various
>>>>>>>>>> places where data consistency was broken. The solution is supply a
>>>>>>>>>> flag to assume it is running on MP, so lock is forced to be called.
>>>>>>>>>> 
>>>>>>>>>> http://cr.openjdk.java.net/~minqi/2178143/
>>>>>>>>>> 
>>>>>>>>>> Thanks
>>>>>>>>>> Yumin
>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>> 
>> 
> 



More information about the hotspot-gc-dev mailing list