[9] RFR(S): 8064940: JMH javac performance regressions on solaris-sparcv9 in 9-b34

Tobias Hartmann tobias.hartmann at oracle.com
Thu Jan 15 08:58:22 UTC 2015


Please review the following patch.


Promotion testing revealed a performance regression for the JMH-Javac benchmarks
on Solaris Sparc, introduced in b34 by JDK-8015774. While investigating, I
noticed that the number of iTLB misses increases greatly with code cache
segmentation enabled (from 40190 to 129806235), which causes the regression.
This is due to large page support (-XX:+UseLargePages) being enabled on Sparc.

Without code cache segmentation the single code heap uses only large (4M) pages:

Address              Kbytes        RSS       Anon     Locked Pgsz Mode
FFFFFFFF69000000      32768      32768      32768          -   4M rwx--

iTLB misses: 40190 (one run)

With code cache segmentation the code heaps do not use large pages and, due to
JDK-8066875, not even the middle region of the underlying virtual space uses
large pages:

Address              Kbytes        RSS       Anon     Locked Pgsz Mode
FFFFFFFF69000000       4544       4544       4544          -  64K rwx--

FFFFFFFF697CE000          8          8          8          -   8K rwx--
FFFFFFFF697D0000       1984       1984       1984          -  64K rwx--
FFFFFFFF699C0000      16456      16456      16456          -   8K rwx--
FFFFFFFF6A9D2000         48          -          -          -    - rwx--

FFFFFFFF70BE8000         32         32         32          -   8K rwx--
FFFFFFFF70BF0000       1984       1984       1984          -  64K rwx--
FFFFFFFF70DE0000      10040      10040      10040          -   8K rwx--
FFFFFFFF717AE000         40          -          -          -    - rwx--

iTLB misses: 129806235 (one run)

As a result, a high number of 8K and 64K pages is used to cover the code cache,
which increases the number of iTLB misses and degrades performance.
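
To make the difference concrete, the page counts can be derived from the Kbytes
and Pgsz columns of the two pmap listings above. The following is only a rough
sketch: it assumes every mapped row is fully backed at the page size pmap
reports, it skips the unmapped rows (Pgsz "-"), and it does not model the iTLB
itself. The helper name page_count is made up for this example.

```python
# Page counts needed to map each pmap listing, using the Kbytes and Pgsz
# columns quoted in this mail. Unmapped rows (Pgsz "-") are left out.

PAGE_KB = {"8K": 8, "64K": 64, "4M": 4096}

# (Kbytes, Pgsz) rows without code cache segmentation
unsegmented = [(32768, "4M")]

# (Kbytes, Pgsz) rows with code cache segmentation, before the fix
segmented = [
    (4544, "64K"),
    (8, "8K"), (1984, "64K"), (16456, "8K"),
    (32, "8K"), (1984, "64K"), (10040, "8K"),
]

def page_count(rows):
    """Number of pages needed to map all rows at their reported page size."""
    return sum(kbytes // PAGE_KB[pgsz] for kbytes, pgsz in rows)

print(page_count(unsegmented))  # 8
print(page_count(segmented))    # 3450
```

Under these assumptions the single heap is covered by 8 large pages, while the
segmented layout needs 3450 small pages for roughly the same amount of memory.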

By aligning the code heap sizes to the large page size we make sure that each
code heap can be covered by large pages:

Address              Kbytes        RSS       Anon     Locked Pgsz Mode
FFFFFFFF69000000       8192       8192       8192          -   4M rwx--
FFFFFFFF69800000      16384      16384      16384          -   4M rwx--
FFFFFFFF70C00000       4096       4096       4096          -   4M rwx--

iTLB misses: 40054 (one run)
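
The alignment idea can be sketched as follows. This is not the code from the
patch: the heap names and requested sizes are hypothetical, and rounding up is
an assumption made for the example (the actual fix may round individual heaps
differently to stay within the reserved code cache size). Only the 4M large
page size is taken from the pmap output above.

```python
LARGE_PAGE_SIZE = 4 * 1024 * 1024  # 4M, the large page size in the pmap output

def align_up(size, alignment):
    """Round size up to the next multiple of alignment (a power of two)."""
    return (size + alignment - 1) & ~(alignment - 1)

def is_aligned(size, alignment):
    """True if size is a multiple of alignment (a power of two)."""
    return size & (alignment - 1) == 0

# Hypothetical requested code heap sizes in bytes, for illustration only
MB = 1024 * 1024
requested = {"heap_a": 5 * MB, "heap_b": 14 * MB + 512 * 1024, "heap_c": 4 * MB}

for name, size in requested.items():
    aligned = align_up(size, LARGE_PAGE_SIZE)
    # After alignment every heap is a whole number of large pages
    assert is_aligned(aligned, LARGE_PAGE_SIZE)
    print(name, size // 1024, "K ->", aligned // 1024, "K")
```

A size that is already a multiple of the large page size is left unchanged, so
only heaps with an unaligned size grow.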

I also had to adapt the 'print code cache' test because it assumes that the code
heap sizes set on the command line are equal to the runtime sizes. This is not
true if we align them to large pages. There is an existing RFE for additional
alignment tests [1] that will cover this case.

Note: The fix depends on [2].

- Performance testing (see separate email)
- Manually tested on Windows with large pages enabled


[1] https://bugs.openjdk.java.net/browse/JDK-8067135
[2] https://bugs.openjdk.java.net/browse/JDK-8066875
