8152312: ParNew: Restore preserved marks in parallel
thomas.schatzl at oracle.com
Fri Mar 25 12:56:49 UTC 2016
On Thu, 2016-03-24 at 11:36 -0400, Tony Printezis wrote:
> Thanks for the feedback. Updated webrev here:
> - Your main comment was whether to make the serial vs. parallel
> decision in DefNew or in restore(). I liked the suggestion of either
> passing NULL or the workers to restore(WorkGang*) and deciding in
> restore(…) what to do. The only problem is that ParallelGC doesn’t
> use a WorkGang, it uses a GCTaskManager instead.
> And anything we do in restore(...) would be nice to also take
> advantage of in ParallelGC. So, I came up with a cunning scheme to do
> that. Let me know what you think. :-) Yes, it works fine with
> ParallelGC too (I already implemented/tested the changes, I’ll open
> them for code review once this is done.) I left a restore() method
> there temporarily to be used by ParallelGC but it will go away.
Some specialized implementation of the method based on the type of E I
Note that we now basically duplicate the restore() method, it gets
close to just have two (parallel gc, the rest) implementations of the
class. Let's see your solution.
Since I do not know the actual implementation of your scheme, but it
would be nice if the code between lines 65 and 68 in
preservedMarks.inline.hpp were moved into a method too, particularly if
the future implementation also has a single-threaded path.
I would prefer to have the implementation(s) of restore() in the cpp
file(s) though (and possibly hide the one for parallel somewhere very
far away ;) ). The additional call in this case does not seem to be
problematic vs. performance.
> - Yes, if each stack has a small number of entries we’re better off
> doing the restoration serially instead of firing up the workers. This
> will be a small localized change to what I have in the webrev (we’ll
> only have to change restore(E*);
One comment here: the comment at
70 // Right now, if the executor is not NULL we do the work in
71 // parallel. In the future we might want to do the restoration
72 // serially, if there's only a small number of marks per
Imo is better placed before line 64, cut a lot to something like:
// Decide on the amount of parallelism.
Please add an RFE in the bug tracker explaining the situation about "in
the future" ideas. That would be much better than dumping your thoughts
in some random place in the code.
> the main question will be what policy to use!). I'll do a follow-up
> to this too.
The main goal is not to be perfect imo, but to determine a reasonable
somewhat conservative upper bound on the amount of threads used. Too
many people use too large machines on small problems.
One could argue, that work units < 1ms (or another low value) are not
worth distributing. From that one could determine a "reasonable"
minimum number of references to process per thread, from which you
could then deduct a range of PreservedMarks buffers that each thread
could use (assuming that the actual references are evenly distributed
across these sets of references).
At least that's what I would start with.
> - Ah, yes, I’m glad there’s an atomic add for size_t in JDK 9. :-)
> - Re: ParRestoreTask vs. RestoreTask : I tend to add “Par” as the
> tasks should be safe to use in parallel (even if in practice they are
> not). Hope that’s OK.
Another reason is that nowadays practically everything done during a
STW pause must somehow be implemented to run in parallel to be actually
useful (except for serial gc obviously). So this marker is kind of
misleading and redundant.
Not really insisting on this, but it may be that some future general cleanup will modify this (nothing actually planned from me).
More information about the hotspot-gc-dev