RFR(XS): JDK-8212122: Allow ReferenceProcessor to always be MT processing
rkennke at redhat.com
Tue Oct 16 10:15:37 UTC 2018
The more I think about it, the more messy this occurs to me.
Again, to begin with, I believe we already have what we want: GCs can
tell the RP if it wants mt_processing or not via the mt_processing flag.
It used to be the case that RP always respected that. GCs set this to
false if num-gc-treads==1 or if it didn't want mt_processing for other
reasons (serial GC), otherwise to true, and RP would always use the
executor if mt_processing==true, otherwise go the single-threaded path.
With the change of using ergonomically determined num-workers, this has
changed to dynamically use one or the other path, which is not good for
GCs that actually always want mt_processing. I find this surprising
behaviour. I would argue that when GC tells RP to to mt_processing, then
yes, please always go the executor/mt-processing route. The executor is
free to decide to not spin up workers for 1-threaded-execution as
optimization. Other than that, I believe that 1-threaded and N-threaded
execution should be equivalent.
For this reason, I still think that simply removing this surprising
adjustment of _mt_processing is the way to go, as proposed here:
If we really need to preserve the old (actually, rather new) behaviour
to execute single-threaded using the specific single-threaded setup, we
can do as proposed earlier:
and GC can override this by passing a suitable RP subclass.
As much as I'd like to fix this mess, and care about CMS and ParallelGC,
I currently don't have time for that (maybe later). What I need is a way
to avoid the single-threaded-path altogether, because it breaks Shenandoah.
>> On Oct 15, 2018, at 9:15 AM, Roman Kennke <rkennke at redhat.com> wrote:
>>> The problem is the (rather new) ergonomics code in RP that scales number
>>> of threads to spin up based on the amount of work to do. And if it ends
>>> up being only 1 thread, then it goes down the single-threaded-path.
>>> Otherwise I wouldn't actually need this change... (that might be another
>>> approach to this problem: don't actually change _mt_processing flag at
>>> all, and call the executor even with 1 thread. In Shenandoah, we do this
>>> and it works fine, and we special-case the 1-thread-path by not spinning
>>> up the workers and execute it in the calling thread (which is the VMThread).
>> In other words, this is a fix that actually removes 'cruft', and seems
>> less surprising API-wise: i.e. if caller asks for mt_processing, then
>> don't change it, and leave it to caller to execute in VMThread if it wishes.
>> Updated webrev:
>> An example how G1 can be made to strictly do the current behaviour (i.e.
>> execute in VMThread when ergo_workers==1):
>> ... altough I haven't tested it at all. It works well with Shenandoah,
>> but I don't know if you'd actually do it/want it/ask me to do it ;-)
>> What do you think? Which approach do you like best (except the huge
>> parallel/CMS/WorkGang/ReferenceProcessor reworking)?
> Sorry I haven't had time to give this much attention yet; I was
> planning to look at it today or tomorrow. But your latest comments
> (which I might not fully understand yet, so keep that in mind) are
> leading me to question this change.
> Putting it on callers of run_task to do the 1-thread optimization
> doesn't seem appropriate. I recently had a change that included that
> sort of thing, and Thomas convinced me that decision ought to be in
> run_task, rather than in callers of run_task. I think that hasn't
> happened yet because of a combination of nobody getting around to it
> and wondering whether there might be any uses where running the task
> directly from the VMThread might be a problem.
> Maybe the selection of whether to run the 1-thread case directly or
> via the executor should be a property of the executor? That seems to
> be kind of what you are proposing, except I think you are suggesting
> the executor's execute function be responsible for that. I think that
> didn't work for CMS and Parallel? Sangheon probably remembers better
> than exactly what the issues were there. The choice could instead be
> a predicate on the executor, called with the number of threads.
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 833 bytes
Desc: OpenPGP digital signature
More information about the hotspot-gc-dev