Request for review (s): 7173959 : Jvm crashed during coherence exabus (tmb) testing

Bengt Rutisson bengt.rutisson at
Thu Dec 13 02:02:21 PST 2012

Hi David,

Thanks for looking at this!

On 12/13/12 8:33 AM, David Holmes wrote:
> Hi Bengt,
> This has to be one of the absolute best review requests I've ever read 
> :) Thank you.

Wow! Thanks! :)

> Just a couple of queries.
> os_posix.cpp:
> Love the diagram :)

It was Mikael's way of making sure we got it right.

> I'm assuming that "alignment" has to be >= the underlying page size, 
> and in fact must be a multiple of the underlying page size ? (As I 
> assume you can only unmap whole numbers of pages). If so does that 
> need to be checked somewhere?

Good point. I'll add an assert to check that.

> In virtualSpace.cpp:
>         // Reserve size large enough to do manual alignment and
>         // increase size to a multiple of the desired alignment
>         size = align_size_up(size, alignment);
> !       base = os::reserve_memory_aligned(size, alignment);
> The comment doesn't seem necessary now, and the align_size_up seems 
> redundant.

You are right. I'll remove the comment and the alignment.


> Thanks,
> David
> On 13/12/2012 4:42 PM, Bengt Rutisson wrote:
>> Hi all,
>> Could I have a couple of reviews for this change?
>> This is for a bug originally reported by the Coherence team:
>> 7173959 : Jvm crashed during coherence exabus (tmb) testing
>> The original analysis and proposed fix was done by Mikael Gerdin and me
>> together. I'll handle the webrev and push since Mikael is on vacation
>> starting today. But Mikael did a great job tracking down this very
>> difficult bug, so he should have a large part of the credit for this bug
>> fix,
>> Description from the CR:
>> The reason that we crash is due to how we re-map memory when we want to
>> align it for large pages in ReservedSpace::initialize().
>> Here is what happens:
>> The reservation of memory is split up to a few steps. When we want a
>> chunk of memory with large pages we first just reserve some memory large
>> enough for what we need. Then we realize that we want large pages, so we
>> want to re-map the reserved memory to use large pages. But since this
>> requires that we have a large-page-aligned memory chunk we may have to
>> fix the recently reserved memory chunk up a bit.
>> We do this in ReservedSpace::initialize() by first releasing the memory
>> we just reserved. Then requesting more memory than we actually need to
>> make sure that we have enough space to do manual large page alignment.
>> After we have gotten this memory we figure out a nicely aligned start
>> address. We then release the memory again and then reserve just enough
>> memory but using the aligned start address as a fixed address to make
>> sure that we get the memory we wanted in an aligned way.
>> This is done in a loop to make sure that we eventually get some memory.
>> The interesting code looks like this:
>> do {
>> char* extra_base = os::reserve_memory(extra_size, NULL, alignment);
>> if (extra_base == NULL) return;
>> // Do manual alignement
>> base = (char*) align_size_up((uintptr_t) extra_base, alignment);
>> assert(base >= extra_base, "just checking");
>> // Re-reserve the region at the aligned base address.
>> os::release_memory(extra_base, extra_size); // (1)
>> base = os::reserve_memory(size, base); // (2)
>> } while (base == NULL);
>> There is a race here between releasing the memory in (1) and
>> re-reserving it in (2). But the loop is supposed to handle this race.
>> The problem is that on posix platforms you can remap the same memory
>> area several times. The call in (2) will use mmap with MAP_FIXED. This
>> means that the OS will think that you know exactly what you are doing.
>> So, if part of the memory has been mapped already by the process it will
>> just go ahead and reuse that memory.
>> This means that if we are having multiple threads that do mmap. We can
>> end up with a situation where we release our mapping in (1). Then
>> another thread comes in and maps part of the memory that we used to
>> have. Then we remap over that memory again in (2) with MAP_FIXED. Now we
>> have a situation where two threads in our process have mapped the same
>> memory. If both threads try to use it or if one of the threads unmap
>> part or all of the memory we will crash.
>> On posix it is possible to unmap any part of a mapped chunk. So, our
>> proposed solution to the race described above is to not unmap all memory
>> in (1) but rather just unmap the section at the start and at the end of
>> the chunk that we mapped to get alignment. This also removes the need
>> for the loop.
>> However, on Windows you can only unmap _all_ of the memory that you have
>> mapped. On the other hand Windows also will not allow you to map over
>> other mappings, so the original code is actually safe. If we keep the 
>> loop.
>> So, our solution is to treat this differently on Windows and posix
>> platforms.
>> Thanks,
>> Bengt

More information about the hotspot-runtime-dev mailing list