RFR: Parallelize safepoint cleanup

David Holmes david.holmes at oracle.com
Fri Jun 2 01:54:36 UTC 2017

Hi Roman,

I am about to disappear on an extended vacation so will let others 
pursue this. IIUC this is longer an opt-in by the user at runtime, but 
an opt-in by the particular GC developers. Okay. My only concern with 
that is if Shenandoah is the only GC that currently opts in then this 
code is not going to get much testing and will be more prone to 
incidental breakage.


On 2/06/2017 2:21 AM, Roman Kennke wrote:
> Am 01.06.2017 um 17:50 schrieb Roman Kennke:
>> Am 01.06.2017 um 14:18 schrieb Robbin Ehn:
>>> Hi Roman,
>>> On 06/01/2017 11:29 AM, Roman Kennke wrote:
>>>> Am 31.05.2017 um 22:06 schrieb Robbin Ehn:
>>>>> Hi Roman, I agree that is really needed but:
>>>>> On 05/31/2017 10:27 AM, Roman Kennke wrote:
>>>>>> I realized that sharing workers with GC is not so easy.
>>>>>> We need to be able to use the workers at a safepoint during concurrent
>>>>>> GC work (which also uses the same workers). This does not only require
>>>>>> that those workers be suspended, like e.g.
>>>>>> SuspendibleThreadSet::yield(), but they need to be idle, i.e. have
>>>>>> finished their tasks. This needs some careful handling to work without
>>>>>> races: it requires a SuspendibleThreadSetJoiner around the
>>>>>> corresponding
>>>>>> run_task() call and also the tasks themselves need to join the STS and
>>>>>> handle requests for safepoints not by yielding, but by leaving the
>>>>>> task.
>>>>>> This is far too peculiar for me to make the call to hook up GC workers
>>>>>> for safepoint cleanup, and I thus removed those parts. I left the
>>>>>> API in
>>>>>> CollectedHeap in place. I think GC devs who know better about G1
>>>>>> and CMS
>>>>>> should make that call, or else just use a separate thread pool.
>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.05/
>>>>>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.05/>
>>>>>> Is it ok now?
>>>>> I still think you should put the "Parallel Safepoint Cleanup" workers
>>>>> inside Shenandoah,
>>>>> so the SafepointSynchronizer only calls get_safepoint_workers, e.g.:
>>>>> _cleanup_workers = heap->get_safepoint_workers();
>>>>> _num_cleanup_workers = _cleanup_workers != NULL ?
>>>>> _cleanup_workers->total_workers() : 1;
>>>>> ParallelSPCleanupTask cleanup(_cleanup_subtasks);
>>>>> StrongRootsScope srs(_num_cleanup_workers);
>>>>> if (_cleanup_workers != NULL) {
>>>>>     _cleanup_workers->run_task(&cleanup, _num_cleanup_workers);
>>>>> } else {
>>>>>     cleanup.work(0);
>>>>> }
>>>>> That way you don't even need your new flags, but it will be up to the
>>>>> other GCs to make their worker available
>>>>> or cheat with a separate workgang.
>>>> I can do that, I don't mind. The question is, do we want that?
>>> The problem is that we do not want to haste such decision, we believe
>>> there is a better solution.
>>> I think you also would want another solution.
>>> But it's seems like such solution with 1 'global' thread pool either
>>> own by GC or the VM it self is quite the undertaking.
>>> Since this probably will not be done any time soon my suggestion is,
>>> to not hold you back (we also want this), just to make
>>> the code parallel and as an intermediate step ask the GC if it minds
>>> sharing it's thread.
>>> Now when Shenandoah is merged it's possible that e.g. G1 will share
>>> the code for a separate thread pool, do something of it's own or
>>> wait until the bigger question about thread pool(s) have been resolved.
>>> By adding a thread pool directly to the SafepointSynchronizer and
>>> flags for it we might limit our future options.
>>>> I wouldn't call it 'cheating with a separate workgang' though. I see
>>>> that both G1 and CMS suspend their worker threads at a safepoint.
>>>> However:
>>> Yes it's not cheating but I want decent heuristics between e.g. number
>>> of concurrent marking threads and parallel safepoint threads since
>>> they compete for cpu time.
>>> As the code looks now, I think that decisions must be made by the GC.
>> Ok, I see your point. I updated the proposed patch accordingly:
>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.06/
>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.06/>
> Oops. Minor mistake there. Correction:
> http://cr.openjdk.java.net/~rkennke/8180932/webrev.07/
> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.07/>
> (Removed 'class WorkGang' from safepoint.hpp, and forgot to add it into
> collectedHeap.hpp, resulting in build failure...)
> Roman

More information about the hotspot-gc-dev mailing list