[PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request

B. Blaser bsrbnd at gmail.com
Mon Feb 18 16:09:45 UTC 2019

On Mon, 18 Feb 2019 at 16:37, Andrew Haley <aph at redhat.com> wrote:
> On 2/18/19 1:26 PM, B. Blaser wrote:
> >
> > Intrinsic instruction sequences are definitely fast and other
> > optimizations can benefit from their mathematical properties.
> Yes, they can be.
> > Of course, statistical optimizations could be even faster but making
> > assumptions about predictability to exclude intrinsics is rather
> > dangerous.
> I'm not convinced that it is at all dangerous. The pattern I
> illustrated is not uncommon, and might well be considerably more common
> than the pattern in the benchmark presented by Jatin. But we should
> not choose our benchmarks so that they make our code look
> good. Instead, we should use benchmarks to help us decide what to do.
> > The JVM should be able to decide dynamically whether to use intrinsics
> > or not depending on the reliability of its statistics?!
> Perhaps so, yes. So before we decide to commit changes that may well make the
> JVM worse on many (most?) workloads, we should find a way to do that.

Yes and no. Simply try your example with unfavourable data:

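import java.util.Random;

public class FpMinMaxIntrinsics {
    private static final int COUNT = 1000;

    private float[] floats = new float[COUNT];

    private Random r = new Random();

    // Fill the array with unfavourable data: random values in [0, 1) at
    // even indices, negative zero at odd indices.
    public void init() {
        for (int i=0; i<COUNT; i++) {
            if (i % 2 == 0)
                floats[i] = r.nextFloat();
            else
                floats[i] = -0.0f;
        }
    }

    // Reduce the whole array to its minimum with Math.min.
    public float fMinReduce() {
        float result = Float.MAX_VALUE;

        for (int i=0; i<COUNT; i++)
            result = Math.min(result, floats[i]);

        return result;
    }
}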

With the intrinsic:

Benchmark                      Mode  Cnt     Score   Error  Units
FpMinMaxIntrinsics.fMinReduce  avgt       2386.708          ns/op

Without the intrinsic:

Benchmark                      Mode  Cnt      Score   Error  Units
FpMinMaxIntrinsics.fMinReduce  avgt       14042.155          ns/op

The execution time of the intrinsic will always be stable and you'll
never see such a performance drop.
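
To make the data pattern concrete, here is a minimal sketch of why alternating
random values with -0.0f may be unfavourable for a branchy min. It assumes the
slowdown comes from the data-dependent checks needed to order -0.0f below
0.0f, which is an assumption and was not measured here. The class
SignedZeroMinSketch and the helpers naiveMin, countZeroPairs and
alternatingData are hypothetical names used only for illustration; only
Math.min and Float.floatToRawIntBits are real JDK calls.

```java
import java.util.Random;

public class SignedZeroMinSketch {
    // Hypothetical helper: a plain comparison that ignores signed zeros and NaN.
    static float naiveMin(float a, float b) {
        return (a <= b) ? a : b;
    }

    // Hypothetical helper: same layout as init() in the benchmark, i.e.
    // random values in [0, 1) at even indices and -0.0f at odd indices.
    static float[] alternatingData(int count, long seed) {
        Random r = new Random(seed);
        float[] data = new float[count];
        for (int i = 0; i < count; i++)
            data[i] = (i % 2 == 0) ? r.nextFloat() : -0.0f;
        return data;
    }

    // Hypothetical helper: counts the iterations of the reduction in which
    // both operands compare equal to zero, so that a branchy min would have
    // to look at the sign bit to pick the right result.
    static int countZeroPairs(float[] data) {
        float result = Float.MAX_VALUE;
        int zeroPairs = 0;
        for (float f : data) {
            if (result == 0.0f && f == 0.0f)
                zeroPairs++;
            result = Math.min(result, f);
        }
        return zeroPairs;
    }

    public static void main(String[] args) {
        // 0.0f and -0.0f compare equal, so only the raw bits tell them apart.
        System.out.println(Integer.toHexString(
                Float.floatToRawIntBits(Math.min(0.0f, -0.0f))));
        System.out.println(Integer.toHexString(
                Float.floatToRawIntBits(naiveMin(0.0f, -0.0f))));
        System.out.println(countZeroPairs(alternatingData(1000, 42L)));
    }
}
```

Once the running minimum has become -0.0f, every odd index pairs it with
another -0.0f and every even index pairs it with a random value, so a
sign-bit check in a branchy min would alternate between taken and not taken
on this data.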

