linux os processor optimizations for OpenJDK GC performance enhancement
ramkri123 at gmail.com
Tue Apr 18 14:34:09 UTC 2017
Please find the detailed proposal below; I look forward to your comments.
"Minimize application tail latency using cache-partitioning-aware G1GC" --
On Thu, Apr 13, 2017 at 11:04 PM, Bernd Eckenfels <ecki at zusammenkunft.net> wrote:
> Maybe it would be better to concentrate the processor optimizations on
> accessors and barriers without introducing a completely new GC
> architecture. I can imagine that especially in the area of NUMA, TLAB, huge
> pages, cache consistency and possibly MMX extensions there is some
> Abandoning the global STW - while it seems like a pretty powerful change -
> is I guess not a good starter exercise. Especially since it is not only a
> question of mutator threads.
> *From:* hotspot-gc-dev <hotspot-gc-dev-bounces at openjdk.java.net> on
> behalf of Ram Krishnan <ramkri123 at gmail.com>
> *Sent:* Friday, April 14, 2017 6:36:27 AM
> *To:* Asif Qamar; Andrew Haley; hotspot-gc-dev at openjdk.java.net
> *Subject:* Re: linux os processor optimizations for OpenJDK GC
> performance enhancement
> Thanks Andrew.
> >>Surely there is: a thread could have its TLAB allocated from a region
> >>local to that socket (or core), and the GC thread for that region
> >>could run on the same socket. It only works for young gen, but that's
> >>a lot of the problem.
> A clarification -- does the TLAB allocation apply to tenured space also?
> If not, the above would work only for young gen cases where there is no
> promotion to tenured, right?
> On Thu, Apr 13, 2017 at 12:55 PM, Ram Krishnan <ramkri123 at gmail.com> wrote:
>> ---------- Forwarded message ----------
>> From: Andrew Haley <aph at redhat.com>
>> Date: Thu, Apr 13, 2017 at 9:52 AM
>> Subject: Re: linux os processor optimizations for OpenJDK GC performance
>> enhancement
>> To: hotspot-gc-dev at openjdk.java.net
>> On 13/04/17 16:33, Kim Barrett wrote:
>> > An application thread may touch memory in any region; there is no
>> > notion of a thread being "scoped" to a specific set of regions. While
>> > it might happen that a thread would only touch regions not being
>> > worked on by the collector, there is no a priori way to know that.
>> Surely there is: a thread could have its TLAB allocated from a region
>> local to that socket (or core), and the GC thread for that region
>> could run on the same socket. It only works for young gen, but that's
>> a lot of the problem.
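
For readers following the thread, the idea Andrew describes above can be
sketched as a toy model. This is only an illustration under stated
assumptions: the class SocketLocalRegions, its Region type, and the methods
tlabRegionFor and gcWorkerSocketFor are hypothetical names made up for this
sketch and do not exist in HotSpot. The model tags each young-gen region
with a socket, hands a mutator a TLAB region from its own socket, and
assigns the GC worker for a region to that same socket.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types or methods exist in HotSpot.
final class SocketLocalRegions {

    /** A heap region tagged with the socket whose memory backs it (hypothetical). */
    static final class Region {
        final int id;
        final int socket;
        boolean young = true; // the scheme in the thread only covers young gen

        Region(int id, int socket) {
            this.id = id;
            this.socket = socket;
        }
    }

    private final List<Region> regions = new ArrayList<>();

    /** Lay out regionsPerSocket regions on each socket, numbered consecutively. */
    SocketLocalRegions(int sockets, int regionsPerSocket) {
        int id = 0;
        for (int s = 0; s < sockets; s++) {
            for (int r = 0; r < regionsPerSocket; r++) {
                regions.add(new Region(id++, s));
            }
        }
    }

    /** Pick a young region on the mutator's socket for its TLAB; null if none is left. */
    Region tlabRegionFor(int mutatorSocket) {
        for (Region r : regions) {
            if (r.young && r.socket == mutatorSocket) {
                return r;
            }
        }
        return null;
    }

    /** The GC worker that processes a region runs on the region's own socket. */
    int gcWorkerSocketFor(Region region) {
        return region.socket;
    }
}
```

With two sockets and three regions each, new SocketLocalRegions(2, 3)
.tlabRegionFor(1) returns the first region on socket 1, and
gcWorkerSocketFor on that region returns 1. The sketch deliberately says
nothing about objects that are promoted out of young gen, which is the open
question raised earlier in this thread.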