Fwd: Better default for ParallelGCThreads and ConcGCThreads by using number of physical cores and CPU mask.
david.holmes at oracle.com
Sun Nov 24 18:19:19 PST 2013
On 23/11/2013 3:24 AM, Jon Masamitsu wrote:
> This is a contribution regarding the number of GC worker threads to
> use. Part of the change queries /proc on Linux to get the number of
> active cores on the platform. The changes are in
> Can someone familiar with this code take a look to see
> if it is reasonable and done in a way that is consistent
> with other /proc queries?
I can't comment on that specifically, but I do have reservations about
this proposed patch.
First, we have a general problem that "active processor count" doesn't
take into account the various resource management mechanisms that can
limit the actual "processors" available to the VM while it is running. I
would prefer to see that general problem solved. It also isn't clear to
me that the sched_getaffinity usage will correctly reflect the use of
tasksets/cpusets. (Note: on Solaris we try to handle some of these
mechanisms, e.g. pbind and psrsets, but still don't handle resource pools.)
Second, this feeds into future work on NUMA-awareness that will likely
need a more sophisticated set of APIs.
Third, I dislike that this only really addresses linux-x86, leaving the
other platforms to default to cores == processors. That just causes
unnecessary divergence in platform functionality.
This is too late for JDK 8, and I think we will be doing more complete
work in this area during JDK 9 development.
> -------- Original Message --------
> Subject: Better default for ParallelGCThreads and ConcGCThreads by
> using number of physical cores and CPU mask.
> Date: Tue, 19 Nov 2013 15:35:22 -0800
> From: Jungwoo Ha <jwha at google.com>
> To: hotspot-gc-dev at openjdk.java.net
> I am sending this webrev for review.
> (It is uploaded here on behalf of Jon Masamitsu.)
> The feature is a new heuristic to calculate the default
> ParallelGCThreads and ConcGCThreads.
> On x86, hyper-threading is generally bad for GC because of cache
> contention. Hence, using all the hyper-threaded cores will slow down
> the overall GC. Current HotSpot reads the number of processors that
> Linux reports, which treats all hyper-threaded cores equally.
> The second problem is that when a CPU mask is set, not all the cores
> are available for the GC.
> The patch improves the heuristic by evaluating the actual available
> physical cores from the proc filesystem and the CPU mask, and uses
> that as the basis for calculating ParallelGCThreads and
> ConcGCThreads.
> The improvement in GC pause time is significant. We evaluated on
> Nehalem, Westmere, and Sandy Bridge, as well as several AMD
> processors. We also evaluated various CPU mask configurations on
> single- and dual-socket machines. In almost all cases, GC pause time
> improved by 10-50%. We primarily used the CMS collector for the
> evaluation, but we tested the other GCs as well.
> Please take a look and let me know if this patch can be accepted.
> Jungwoo Ha
More information about the hotspot-runtime-dev mailing list