RFR: 8232782: Shenandoah: streamline post-LRB CAS barrier (aarch64)
kdnilsen at amazon.com
Fri Jun 26 20:21:10 UTC 2020
Is there consensus that we should use the CAS instruction instead of ldxr/stxr?
Presumably there are some platforms where ldxr/stxr performs better than CAS, or at least such platforms could exist.
Perhaps the JIT and run-time should adjust their behavior depending on the host platform.
Perhaps the whole issue of which synchronization primitives to use should be addressed in a different ticket.
I am willing to rework this patch; I just need clear guidance on which direction to take it.
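To make the per-platform idea above concrete, here is a minimal sketch of choosing the CAS strategy once from a description of the host CPU. Everything in it (CasStrategy, HostCpu, llsc_is_faster, choose_cas_strategy) is a hypothetical name invented for illustration and not a HotSpot identifier; as far as I know, the AArch64 port's existing UseLSE flag already plays a similar role.

```cpp
// Hypothetical strategy tags; these are not HotSpot identifiers.
enum class CasStrategy { SingleInstructionCas, LoadStoreExclusiveLoop };

// Hypothetical description of the host, assumed to be filled in at start-up.
struct HostCpu {
    bool has_lse;         // CPU implements the ARMv8.1 LSE atomics (CAS etc.)
    bool llsc_is_faster;  // assumed tuning knob for cores where ldxr/stxr wins
};

// Prefer the single CAS instruction when it is available and not known to be
// slower on this core; otherwise fall back to an ldxr/stxr loop.
constexpr CasStrategy choose_cas_strategy(const HostCpu& cpu) {
    return (cpu.has_lse && !cpu.llsc_is_faster)
               ? CasStrategy::SingleInstructionCas
               : CasStrategy::LoadStoreExclusiveLoop;
}

// A core with LSE and no special tuning gets the CAS instruction.
static_assert(choose_cas_strategy(HostCpu{true, false}) ==
                  CasStrategy::SingleInstructionCas,
              "LSE-capable core should use CAS");
// A core without LSE has to use the exclusive-access loop.
static_assert(choose_cas_strategy(HostCpu{false, false}) ==
                  CasStrategy::LoadStoreExclusiveLoop,
              "pre-LSE core should use ldxr/stxr");
```

The JIT and the stubs would consult the chosen strategy when emitting code, so the decision is made once rather than on every CAS.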
On 6/24/20, 8:28 AM, "Roman Kennke" <rkennke at redhat.com> wrote:
On Wed, 2020-06-24 at 16:22 +0100, Andrew Haley wrote:
> On 24/06/2020 15:48, Roman Kennke wrote:
> > On Wed, 2020-06-24 at 15:29 +0100, Andrew Haley wrote:
> > > On 24/06/2020 14:54, Nilsen, Kelvin wrote:
> > > > Is this ok to merge?
> > >
> > > One thing:
> > >
> > > Some CPUs, in particular those based on Neoverse N1, can perform very
> > > badly when using ldxr/stxr. For that reason, all code doing CAS
> > >
> > > I can't see any reason why your code needs to use ldxr/stxr. Is there
> > > any?
> > As far as I know, Shenandoah's AArch64-CAS-implementation always did it
> > that way (don't remember why). If regular CAS is generally better, then
> > we should go for it.
> Does this algorithm need a full barrier even when CAS fails?
We need to do extra work *only* when CAS fails. We need to catch false
negatives -- when the compare-value is a to-space reference (that's
guaranteed) and the value in memory is the from-space copy of the same
object.
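
The false-negative case described above can be sketched in portable C++. This is an illustration under simplifying assumptions, not the code in the patch: Obj, its forwardee field, resolve() and cas_ref() are hypothetical stand-ins, and the real barrier operates on heap words in generated assembly.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a heap object. `forwardee` points at the
// to-space copy once the object has been evacuated, otherwise at itself.
struct Obj {
    Obj* forwardee;
    int payload;
    explicit Obj(int p) : forwardee(this), payload(p) {}
};

// Hypothetical helper: map any reference to its to-space copy.
inline Obj* resolve(Obj* o) { return o ? o->forwardee : nullptr; }

// CAS on a reference field. `expected` must already be a to-space reference.
bool cas_ref(std::atomic<Obj*>& field, Obj* expected, Obj* desired) {
    assert(resolve(expected) == expected);
    Obj* observed = expected;
    // Fast path: a successful CAS needs no extra work at all.
    while (!field.compare_exchange_strong(observed, desired)) {
        // On failure `observed` holds the value that was found in memory.
        if (resolve(observed) != expected) {
            return false;  // genuine failure: a different object is stored
        }
        // False negative: memory holds the from-space copy of `expected`.
        // Loop again, this time comparing against the pointer we observed.
    }
    return true;
}
```

The success path is a single CAS, and only the failure path pays for resolving the observed value and retrying.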