RFR (L): 8058354: SPECjvm2008-Derby -2.7% performance regression on Solaris-X64 starting with 9-b29

Thomas Schatzl thomas.schatzl at oracle.com
Thu Jan 29 10:30:39 UTC 2015

Hi all,

  can I have reviews for the following change that fixes the use of
large pages for auxiliary data on G1?

JDK-8038423 made a large change to how G1 handles virtual memory, but
overlooked that auxiliary data may use large pages. This caused some
performance regressions starting with the build that introduced it.

This change fixes the problem by making G1 more flexible in its use of
large pages. Auxiliary data in particular, which is often not sized to a
multiple of the (large) page size, was hit hardest by the previous
behavior. By allowing the virtual space implementation to use small
pages on the tail (upper) end of the virtual space, everything else can
use large pages.
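
As a rough illustration of the tail handling described above (a sketch
only, not the code in the webrev): assuming the start of the virtual
space is aligned to the large page size, the space splits into a
large-page prefix and a small-page tail. The names PageSplit and
split_for_large_pages are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration, not HotSpot code.
// Assumes the start address of the space is aligned to large_page_size.
struct PageSplit {
  std::uint64_t large_page_bytes;  // prefix that can be backed by large pages
  std::uint64_t small_page_bytes;  // tail (upper end) backed by small pages
};

inline PageSplit split_for_large_pages(std::uint64_t size,
                                       std::uint64_t large_page_size) {
  // Round the size down to a multiple of the large page size; whatever is
  // left over at the upper end falls back to small pages.
  const std::uint64_t large = (size / large_page_size) * large_page_size;
  return PageSplit{large, size - large};
}
```

For example, a 5M space with 2M large pages would get 4M of large pages
and a 1M small-page tail; a space smaller than one large page would use
small pages only.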

There is one limitation: the start address of the used virtual spaces
must be aligned to the large page size for auxiliary data to use large
pages. This simplifies commit and uncommit within the regions, since
very small areas in the auxiliary data can map to large areas in the
heap (e.g. for the BOT, one 4k page maps to 2M of heap, and one 2M page
maps to 1G of heap).
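
The BOT figures above can be checked with a little arithmetic. This is a
sketch that assumes one byte of BOT covers 512 bytes of heap, which is
the ratio implied by the numbers quoted (4k -> 2M); the constant and
function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration, not HotSpot code.
constexpr std::uint64_t K = 1024;
constexpr std::uint64_t M = K * K;
constexpr std::uint64_t G = K * M;

// Assumed ratio: one byte of BOT per 512 bytes of heap.
constexpr std::uint64_t kHeapBytesPerBotByte = 512;

// Amount of heap covered by a single page of BOT data.
constexpr std::uint64_t heap_covered_by_bot_page(std::uint64_t page_size) {
  return page_size * kHeapBytesPerBotByte;
}

static_assert(heap_covered_by_bot_page(4 * K) == 2 * M, "4k page covers 2M");
static_assert(heap_covered_by_bot_page(2 * M) == 1 * G, "2M page covers 1G");
```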

The problem is that if the start address of such auxiliary data were not
aligned to the requested page size, we would potentially need to split
neighbouring large pages when uncommitting a region.

Some ASCII art showing the problem:

AAAAAA AAAAAA AAAAAA  // heap area, each AAAAAA is a single region
   |      |      |    // area covered by auxiliary pages
      1       2       // auxiliary data pages

If the auxiliary data pages were unaligned, so that their boundaries did
not coincide with region boundaries in the heap, then uncommitting e.g.
the second region (second set of AAAAAA) would require splitting
auxiliary data pages 1 and 2 into smaller ones.
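
The situation in the picture can be expressed as a small check. This is
a sketch under simplifying assumptions (each region's auxiliary data is
a contiguous, equally sized range, as in the picture); the function name
and parameters are hypothetical and not part of the change.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration, not HotSpot code.
// The auxiliary data for heap region `region_index` is assumed to occupy
// [aux_base + region_index * aux_bytes_per_region,
//  aux_base + (region_index + 1) * aux_bytes_per_region).
// Returns true if either end of that range lies inside a page, meaning the
// page is shared with a neighbouring region and would have to be split
// before this region's auxiliary data could be uncommitted.
inline bool uncommit_would_split_page(std::uint64_t aux_base,
                                      std::uint64_t aux_bytes_per_region,
                                      std::uint64_t page_size,
                                      std::uint64_t region_index) {
  const std::uint64_t start = aux_base + region_index * aux_bytes_per_region;
  const std::uint64_t end = start + aux_bytes_per_region;
  return (start % page_size != 0) || (end % page_size != 0);
}
```

With an aligned base and one page of auxiliary data per region, no page
is shared; shifting the base by half a page reproduces the case drawn
above, where pages 1 and 2 both straddle the second region.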

That does not seem to be a good tradeoff in complexity, given that the
waste is at most one large page in reserved space (and unfortunately,
due to the Linux large page implementation also in actually used space).

Changes in detail, including some additional fixes:
 - page selection corresponds to other collectors, i.e. if some
auxiliary data covers at least one large page, try to use large pages if
 - fix CMBitMap::compute_size() to align to alignment granularity (this
has not been a real problem because the actual size is always a multiple
of that)
 - allow (very restricted) mixed use of small and large pages in the G1
 - pass on alignment hints to os::commit()
 - some refactoring extracting out the code to reserve auxiliary memory

With these changes, performance when using large pages is at least as
good as before 9-b29.



jprt, specjbb*, specjvm*, vm.quick.testlist, some large benchmarks, test


