RFR(xs): 8202772: NMT erroneously assumes thread stack boundaries to be page aligned
thomas.stuefe at gmail.com
Tue Jun 12 16:21:16 UTC 2018
On Tue, Jun 12, 2018 at 3:11 PM, Zhengyu Gu <zgu at redhat.com> wrote:
> Hi Thomas,
> Looks fine as a temporary solution.
> On 06/12/2018 12:59 AM, Thomas Stüfe wrote:
>> Dear all,
>> may I please have reviews for this fix, which - for now - disables
>> thread stack tracking for NMT on AIX.
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8202772
>> On AIX, we have two problems:
>> - NMT assumes stack boundaries to be page aligned. This is wrong, but
>> on all platforms other than AIX it does not matter.
> Because of this :-)
> The address specified in stackaddr should be suitably aligned: for
> full portability, align it on a page boundary
> (sysconf(_SC_PAGESIZE)). posix_memalign(3) may be useful for
> allocation. Probably, stacksize should also be a multiple of the
> system page size.
Ah yes. But this is the Linux manpage. POSIX is more vague: it does not
require page alignment, but allows the platform to have some
unspecified alignment requirements.
>> - the way mincore() is used to read residency of pages needs to be
>> adapted since on AIX, os::vm_page_size() is not necessarily the page
>> size used by mincore() - which is quite dangerous.
> I did some digging; it seems that AIX has two different types of pages.
> So which one will sysconf(_SC_PAGESIZE) return?
Always 4K. But that is not always the actual page size. AIX is
annoying that way. The short version of this long story:
AIX has four page sizes: 4K, 64K, 16M and 16G.
16M and 16G are of no practical importance - using them follows
similar rules as huge non-transparent pages on Linux: you need a
special memory pool, they are pinned, you need special user
permissions. We tried that and pretty much every customer got it
wrong, so we gave up.
AIX allows for different memory types (data, thread stack, shared
memory) to have different page sizes, and unfortunately this is
tweakable by adventurous sysadmins. So we have to live with a number
of possible combinations (e.g. data segment with 4K, shared memory aka
java heap 64K). If we can we try to have real 64K pages everywhere.
But the VM still has to come up if the admin globally tweaked some of
these page settings (some combinations we disallow - you have to draw
a line somewhere).
In the beginning our AIX port only handled 4K pages. About ten years
ago I wanted to add 64K page support. Then I had the problem that I
had a multi-page-sized OS on one side and a shared code base which
assumed "we have only one page size and it is os::vm_page_size()" on
the other side.
The result is a weird compromise: Toward the hotspot I act as if the
underlying OS had only one page size, and since in reality I have to
deal with a mix of 4K and 64K paged memory regions, sometimes I lie
and act as if everything were 64K paged. On most Unices this would not
work but on AIX it does for a number of idiosyncratic reasons (e.g.
automatic commit of memory - you do not have to commit explicitly).
Only from time to time there are small cracks where the truth shines
through, and one of these cracks is the mincore() call.
If you are interested in more sordid details:
I wasted so much of my life with this stuff :)
> If one uses mincore() to walk a memory space, could it potentially
> report the bit vector in different page sizes? That sounds
> contradictory to the AIX mincore() API documentation.
I did ask myself this, and I ran tests: mincore() always acts as if
the underlying memory were 4K paged. Which is also according to the
standard, which says mincore() uses sysconf(_SC_PAGESIZE), and that
returns 4K on AIX.
Thanks for the review,
>> Since JDK-8204552 was added to deal with the first point, it makes
>> sense to wait until that issue is finished. Until then, I'd like to
>> disable thread stack recognition in NMT for AIX. We can revisit this
>> topic again once we can spare time for it and JDK-8204552 has been
>> fixed.