Request for review (s): 7173959 : Jvm crashed during coherence exabus (tmb) testing

Bengt Rutisson bengt.rutisson at
Thu Dec 13 04:58:25 PST 2012

Hi David,

Updated webrev:

I moved the alignment of "size" back into ReservedSpace::initialize() 
since we actually store the size in an instance variable (_size), so I 
think it is a bit risky to just do the align_up in 
os::reserve_memory_aligned(). We probably want the instance variables 
_size and _alignment in ReservedSpace to be consistent.

I added an assert in os::reserve_memory_aligned() to verify that the 
size is correctly aligned. I also added the assert you suggested to 
check that alignment is page size aligned.


On 12/13/12 11:50 AM, David Holmes wrote:
> On 13/12/2012 8:37 PM, Bengt Rutisson wrote:
>> Hi again,
>> Updated webrev:
>> I removed the comment and the alignment.
>> But I did not add an assert just yet.
>> At the top of ReservedSpace::initialize() we have this code:
>> const size_t granularity = os::vm_allocation_granularity();
>> assert((size & (granularity - 1)) == 0,
>> "size not aligned to os::vm_allocation_granularity()");
>> assert((alignment & (granularity - 1)) == 0,
>> "alignment not aligned to os::vm_allocation_granularity()");
>> Where os::vm_allocation_granularity() is implemented as page size on all
>> platforms except Windows. There we look it up from the Windows API. I
>> assume that is a page size too, but since the Windows code in our patch
>> does not unmap based on the alignment it should be safe either way.
>> Is this assert enough or would you like me to add an assert closer to
>> the code block were I did the changes?
> As this is a separate method now I think an assert in the method 
> itself would not go astray.
> David
> -----
>> Thanks,
>> Bengt
>> On 12/13/12 11:02 AM, Bengt Rutisson wrote:
>>> Hi David,
>>> Thanks for looking at this!
>>> On 12/13/12 8:33 AM, David Holmes wrote:
>>>> Hi Bengt,
>>>> This has to be one of the absolute best review requests I've ever
>>>> read :) Thank you.
>>> Wow! Thanks! :)
>>>> Just a couple of queries.
>>>> os_posix.cpp:
>>>> Love the diagram :)
>>> It was Mikael's way of making sure we got it right.
>>>> I'm assuming that "alignment" has to be >= the underlying page size,
>>>> and in fact must be a multiple of the underlying page size ? (As I
>>>> assume you can only unmap whole numbers of pages). If so does that
>>>> need to be checked somewhere?
>>> Good point. I'll add an assert to check that.
>>>> In virtualSpace.cpp:
>>>> // Reserve size large enough to do manual alignment and
>>>> // increase size to a multiple of the desired alignment
>>>> size = align_size_up(size, alignment);
>>>> ! base = os::reserve_memory_aligned(size, alignment);
>>>> The comment doesn't seem necessary now, and the align_size_up seems
>>>> redundant.
>>> You are right. I'll remove the comment and the alignment.
>>> Thanks,
>>> Bengt
>>>> Thanks,
>>>> David
>>>> On 13/12/2012 4:42 PM, Bengt Rutisson wrote:
>>>>> Hi all,
>>>>> Could I have a couple of reviews for this change?
>>>>> This is for a bug originally reported by the Coherence team:
>>>>> 7173959 : Jvm crashed during coherence exabus (tmb) testing
>>>>> The original analysis and proposed fix was done by Mikael Gerdin 
>>>>> and me
>>>>> together. I'll handle the webrev and push since Mikael is on vacation
>>>>> starting today. But Mikael did a great job tracking down this very
>>>>> difficult bug, so he should have a large part of the credit for this
>>>>> bug
>>>>> fix,
>>>>> Description from the CR:
>>>>> The reason that we crash is due to how we re-map memory when we 
>>>>> want to
>>>>> align it for large pages in ReservedSpace::initialize().
>>>>> Here is what happens:
>>>>> The reservation of memory is split up to a few steps. When we want a
>>>>> chunk of memory with large pages we first just reserve some memory
>>>>> large
>>>>> enough for what we need. Then we realize that we want large pages,
>>>>> so we
>>>>> want to re-map the reserved memory to use large pages. But since this
>>>>> requires that we have a large-page-aligned memory chunk we may 
>>>>> have to
>>>>> fix the recently reserved memory chunk up a bit.
>>>>> We do this in ReservedSpace::initialize() by first releasing the 
>>>>> memory
>>>>> we just reserved. Then requesting more memory than we actually 
>>>>> need to
>>>>> make sure that we have enough space to do manual large page 
>>>>> alignment.
>>>>> After we have gotten this memory we figure out a nicely aligned start
>>>>> address. We then release the memory again and then reserve just 
>>>>> enough
>>>>> memory but using the aligned start address as a fixed address to make
>>>>> sure that we get the memory we wanted in an aligned way.
>>>>> This is done in a loop to make sure that we eventually get some 
>>>>> memory.
>>>>> The interesting code looks like this:
>>>>> do {
>>>>> char* extra_base = os::reserve_memory(extra_size, NULL, alignment);
>>>>> if (extra_base == NULL) return;
>>>>> // Do manual alignement
>>>>> base = (char*) align_size_up((uintptr_t) extra_base, alignment);
>>>>> assert(base >= extra_base, "just checking");
>>>>> // Re-reserve the region at the aligned base address.
>>>>> os::release_memory(extra_base, extra_size); // (1)
>>>>> base = os::reserve_memory(size, base); // (2)
>>>>> } while (base == NULL);
>>>>> There is a race here between releasing the memory in (1) and
>>>>> re-reserving it in (2). But the loop is supposed to handle this race.
>>>>> The problem is that on posix platforms you can remap the same memory
>>>>> area several times. The call in (2) will use mmap with MAP_FIXED. 
>>>>> This
>>>>> means that the OS will think that you know exactly what you are 
>>>>> doing.
>>>>> So, if part of the memory has been mapped already by the process it
>>>>> will
>>>>> just go ahead and reuse that memory.
>>>>> This means that if we are having multiple threads that do mmap. We 
>>>>> can
>>>>> end up with a situation where we release our mapping in (1). Then
>>>>> another thread comes in and maps part of the memory that we used to
>>>>> have. Then we remap over that memory again in (2) with MAP_FIXED.
>>>>> Now we
>>>>> have a situation where two threads in our process have mapped the 
>>>>> same
>>>>> memory. If both threads try to use it or if one of the threads unmap
>>>>> part or all of the memory we will crash.
>>>>> On posix it is possible to unmap any part of a mapped chunk. So, our
>>>>> proposed solution to the race described above is to not unmap all
>>>>> memory
>>>>> in (1) but rather just unmap the section at the start and at the 
>>>>> end of
>>>>> the chunk that we mapped to get alignment. This also removes the need
>>>>> for the loop.
>>>>> However, on Windows you can only unmap _all_ of the memory that you
>>>>> have
>>>>> mapped. On the other hand Windows also will not allow you to map over
>>>>> other mappings, so the original code is actually safe. If we keep
>>>>> the loop.
>>>>> So, our solution is to treat this differently on Windows and posix
>>>>> platforms.
>>>>> Thanks,
>>>>> Bengt

More information about the hotspot-runtime-dev mailing list