[11u] RFR: 8244214: Add paddings for TaskQueueSuper to reduce false-sharing cache contention

Lindenmaier, Goetz goetz.lindenmaier at sap.com
Tue Jun 30 07:45:47 UTC 2020


Hi Patrick,

This looks good now, and also our builds all passed.
I'll sponsor it.

Best regards,
Goetz

PS: Sorry for the mail flood yesterday; the messages finally all
showed up in the archive.

From: Patrick Zhang OS <patrick at os.amperecomputing.com>
Sent: Tuesday, June 30, 2020 5:41 AM
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; jdk-updates-dev at openjdk.java.net
Cc: hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
Subject: RE: [11u] RFR: 8244214: Add paddings for TaskQueueSuper to reduce false-sharing cache contention

Thanks for finding that out.

I updated the patch: http://cr.openjdk.java.net/~qpzhang/8248214/webrev.04/jdk11u-dev.changeset.

Yes, jdk/jdk uses sizeof(uint), and placing the two declarations side by side serves as a reminder in case the type changes in the future. I don't have a Mac system, so thanks a lot again for running it in your CI.

Regards
Patrick

From: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
Sent: Tuesday, June 30, 2020 12:26 AM
To: Patrick Zhang OS <patrick at os.amperecomputing.com>; jdk-updates-dev at openjdk.java.net
Cc: hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
Subject: RE: [11u] RFR: 8244214: Add paddings for TaskQueueSuper to reduce false-sharing cache contention

Hi Patrick,

If you use sizeof(uint), it works on Mac. uint is also the
type jdk/jdk uses here.
I put it into our CI again to make sure all platforms build.
I'll update you tomorrow (or ping me if I forget).

Also I think we should move the line below the
declaration of _bottom, as it now depends on the
type used there.

Best regards,
Goetz.

diff --git a/src/hotspot/share/gc/shared/taskqueue.hpp b/src/hotspot/share/gc/shared/taskqueue.hpp
--- a/src/hotspot/share/gc/shared/taskqueue.hpp
+++ b/src/hotspot/share/gc/shared/taskqueue.hpp
@@ -113,6 +113,8 @@

   // The first free element after the last one pushed (mod N).
   volatile uint _bottom;
+  // Add paddings to reduce false-sharing cache contention between _bottom and _age
+  DEFINE_PAD_MINUS_SIZE(0, DEFAULT_CACHE_LINE_SIZE, sizeof(uint));

   enum { MOD_N_MASK = N - 1 };
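
The effect of the added line can be sketched in standalone C++. This is only an illustration, not the HotSpot code: `PAD_MINUS_SIZE` is a hypothetical stand-in for the padding macro, the 64-byte line size is an assumption (HotSpot picks DEFAULT_CACHE_LINE_SIZE per platform), and `age` is simplified to a plain unsigned int.

```cpp
#include <cassert>
#include <cstddef>  // offsetof, std::size_t

// Assumption: 64-byte cache lines.
constexpr std::size_t kAssumedCacheLineSize = 64;

// Hypothetical stand-in for a "pad minus size" macro: a char array that
// fills the rest of a cache line after a field of `size` bytes.
#define PAD_MINUS_SIZE(id, alignment, size) char _pad_buf##id[(alignment) - (size)]

// Layout without padding: both fields can share one cache line.
struct UnpaddedQueueFields {
  volatile unsigned int bottom;
  volatile unsigned int age;  // simplified stand-in for _age
};

// Layout with padding sized from the type of the preceding field.
struct PaddedQueueFields {
  volatile unsigned int bottom;
  PAD_MINUS_SIZE(0, kAssumedCacheLineSize, sizeof(unsigned int));
  volatile unsigned int age;
};

// Distance in bytes between the start of `bottom` and the start of `age`.
constexpr std::size_t unpadded_gap() {
  return offsetof(UnpaddedQueueFields, age) - offsetof(UnpaddedQueueFields, bottom);
}
constexpr std::size_t padded_gap() {
  return offsetof(PaddedQueueFields, age) - offsetof(PaddedQueueFields, bottom);
}

// With a gap of one full line, the two fields cannot land on the same line.
static_assert(padded_gap() == kAssumedCacheLineSize, "fields are one cache line apart");

// Runtime version of the same checks, for use in a quick test.
inline void verify_layout() {
  assert(unpadded_gap() < kAssumedCacheLineSize);
  assert(padded_gap() == kAssumedCacheLineSize);
}
```

Because the pad size is derived from sizeof of the preceding field's type, keeping the pad directly below that field makes it harder to change one without the other.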


From: Patrick Zhang OS <patrick at os.amperecomputing.com>
Sent: Saturday, June 27, 2020 11:33 AM
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; jdk-updates-dev at openjdk.java.net
Cc: hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
Subject: RE: [11u] RFR: 8244214: Add paddings for TaskQueueSuper to reduce false-sharing cache contention

Thanks Goetz

I updated the list of reviewers (http://cr.openjdk.java.net/~qpzhang/8248214/webrev.02/jdk11u-dev.changeset). Regarding performance, I ran tests on Linux with a couple of x86_64 and aarch64 servers. I am not sure whether mentioning specjbb here is appropriate, but so far most results of this benchmark are positive, especially for the metrics sensitive to GC stability (G1 or ParallelGC), with no obvious change on the others, probably due to microarchitecture-level differences in handling exclusive load/store. This is similar to the original patch [1].

I updated the "Fix request (11u)" comment with a risk estimate for this downport; please see JBS [1].

I am not familiar with the jdk-updates process. Is it OK to push this downport now, or should I still wait for the maintainer's approval in JBS (jdk11u-fix-yes)?

[1] https://bugs.openjdk.java.net/browse/JDK-8248214?focusedCommentId=14349531&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14349531


Regards

Patrick



-----Original Message-----
From: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
Sent: Friday, June 26, 2020 3:17 PM
To: Patrick Zhang OS <patrick at os.amperecomputing.com>; jdk-updates-dev at openjdk.java.net
Cc: hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
Subject: RE: [11u] RFR: 8244214: Add paddings for TaskQueueSuper to reduce false-sharing cache contention



Hi Patrick,

I had a look at your change.
I think it makes sense to bring this to 11, if there actually is the performance gain you mention.
Reviewed.

Please add the risk of downporting this to the "Fix request" comment in the JBS.  And I think it should be "Fix request (11u)"
because different people will review your fix request for 11 and 8.

Best regards,
  Goetz.



> -----Original Message-----
> From: jdk-updates-dev <jdk-updates-dev-bounces at openjdk.java.net> On
> Behalf Of Patrick Zhang OS
> Sent: Wednesday, June 24, 2020 11:55 AM
> To: jdk-updates-dev at openjdk.java.net
> Cc: hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
> Subject: [DMARC FAILURE] [11u] RFR: 8244214: Add paddings for
> TaskQueueSuper to reduce false-sharing cache contention
>
> Hi
>
> Could I ask for a review of this simple patch which takes a tiny part
> from the original ticket JDK-8243326 [1]. The reason that I do not
> want a full backport is, the majority of the patch at jdk/jdk [2] is
> to clean up the volatile use and may be not very meaningful to 11u,
> furthermore the context (dependencies on atomic.hpp refactor) is too
> complicated to generate a clear backport (I tried, ~81 files need to be changed).
>
> The purpose of having this one-line change to 11u is, the two volatile
> variables in TaskQueueSuper: _bottom, _age and corresponding atomic
> operations upon, may cause severe cache contention inside GC with
> larger number of threads, i.e., specified by -XX:ParallelGCThreads=##,
> adding paddings (up to DEFAULT_CACHE_LINE_SIZE) in-between can reduce
> the possibility of false-sharing cache contention. I do not need the
> paddings before _bottom and after _age from the original patch [2],
> because the instances of TaskQueueSuper are usually (always) allocated
> in a set of queues, in which they are naturally separated. Please review, thanks.
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8248214
> Webrev: http://cr.openjdk.java.net/~qpzhang/8248214/webrev.01/
> Testing: tier1-2 pass with the patch, commercial benchmarks and small
> C++ test cases (to simulate the data struct and work-stealing
> algorithm atomics) validated the performance, no regression.
>
> By the way, I am going to request for 8u backport as well once 11u
> would have it.
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8243326 Cleanup use of volatile in taskqueue code
> [2] https://hg.openjdk.java.net/jdk/jdk/rev/252a1602b4c6
>
> Regards
> Patrick
>
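
The kind of small C++ test case mentioned in the quoted message could look roughly like the sketch below. It is not the test that was actually used: the struct names, the `hammer` helper, and the 64-byte line size are all assumptions made for illustration, and plain relaxed fetch_add stands in for the queue's real atomic operations. Two threads update two neighbouring counters, once with the counters adjacent and once with padding in between, and the elapsed time is reported for comparison.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>

// Assumption: 64-byte cache lines.
constexpr std::size_t kLineBytes = 64;

// The two counters sit next to each other and may share a cache line.
struct AdjacentCounters {
  std::atomic<unsigned int> bottom{0};
  std::atomic<unsigned int> age{0};
};

// The same counters, separated so that they start one line apart.
struct PaddedCounters {
  std::atomic<unsigned int> bottom{0};
  char pad[kLineBytes - sizeof(std::atomic<unsigned int>)];
  std::atomic<unsigned int> age{0};
};

// One thread updates `bottom` while another updates `age`.
// Returns the wall-clock time taken by both threads together.
template <typename Counters>
std::chrono::nanoseconds hammer(Counters& c, unsigned int iterations) {
  const auto start = std::chrono::steady_clock::now();
  std::thread owner([&c, iterations] {
    for (unsigned int i = 0; i < iterations; ++i)
      c.bottom.fetch_add(1, std::memory_order_relaxed);
  });
  std::thread thief([&c, iterations] {
    for (unsigned int i = 0; i < iterations; ++i)
      c.age.fetch_add(1, std::memory_order_relaxed);
  });
  owner.join();
  thief.join();
  return std::chrono::duration_cast<std::chrono::nanoseconds>(
      std::chrono::steady_clock::now() - start);
}

// Runs both layouts and reports whether the padded one finished sooner.
// The answer depends on the machine and load, so treat it as a hint only.
inline bool padded_was_faster(unsigned int iterations) {
  AdjacentCounters adjacent;
  PaddedCounters padded;
  const auto t_adjacent = hammer(adjacent, iterations);
  const auto t_padded = hammer(padded, iterations);
  assert(adjacent.bottom.load() == iterations && adjacent.age.load() == iterations);
  assert(padded.bottom.load() == iterations && padded.age.load() == iterations);
  return t_padded < t_adjacent;
}
```

Timing differences from a run like this vary by microarchitecture and thread placement, so a single result should not be read as proof either way.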



