[8u] RFR: 8144483: One long Safepoint pause directly after each GC log rotation

Rainer Jung rainer.jung at kippdata.de
Thu Jan 7 03:43:04 UTC 2016

Hi all,

I'm the original reporter of the problem.

Am 06.01.2016 um 17:18 schrieb Yumin Qi:
> Cheleswer,
>    If the call to check_addr0() is unnecessary, remove it is a better
> choice I think. In fatal error, means VM will exit, the error handler
> will print memory info, so it is redundant here. I don't know why it is
> added to log rotation code in the first place.

When I first investigated the problem, I noticed that the same function 
check_addr0() is also used in os::run_periodic_checks(). The whole code 
goes back to


I also wondered why it was called during GC log rotation but didn't 
investigate further.

Note that in addition to the problem of the long safepoint pause during 
GC log file rotation, this bug has a second aspect, namely that reading 
the maps file is done in a very inefficient and slow way. So even after 
removing the call to check_addr0() during gc log file rotation - which 
would fix this bug here - whatever code remains to call check_addr0(), 
will still notice that the call takes a very long time (e.g. more than 
30 seconds if your process has lots of threads and open jar files).

This aspect was triggered by an older fix of the code which reads the 
map file. Before the fix, the file was closed after reading the first 
record from it. That was a bug but it terminated the read loop early and 
the call returned quickly. After the old fix, the close was moved behind 
the read loop and now the full read loop executed and took too long. As 
an example for the old fix see


I think the correct (and fast) way of reading the maps file can be seen 
by looking at the code of the Solaris "pmap" command. It reads the file 
in one go using the pread() call instead of reading lots of structs one 
by one. Therefore pmap executes very fast, much faster than 
check_addr0() for the same maps file. I did check that on the same map 
file, that showed the slowness bug in check_addr0().

Since the pmap code isn't available in the open, I looked at the Illumos 
code based on older OpenSolaris. There one can see:


- calls Pmapping_iter_resolved() to iterate over the maps


- Pmapping_iter_resolved() calls i_Pmapping_iter()
- i_Pmapping_iter() calls Prd_agent()
- Prd_agent() calls Pupdate_maps()
- Pupdate_maps() calls Preadmaps()
- Preadmaps() calls P->ops.pop_read_maps()


- P->ops.pop_read_maps is Pread_maps_live()
- Pread_maps_live() (starting at line 100) uses pread() to read the 
whole file in one go into a buffer, that was sized using the st_size 
field in the stat returned from fstat() called on the map file.

So I suggest you also fix the slowness in check_addr0() by adjusting the 
code that reads the map file in a similar way. If you think that's to 
dangerous for a huge file, then you might consider at least reading the 
file in big enough chunks (e.g. 100 structs at a time).



> On 1/6/2016 3:37 AM, cheleswer sahu wrote:
>> Hi,
>> Please review the code changes for
>> "https://bugs.openjdk.java.net/browse/JDK-8144483".
>> webrev link: http://cr.openjdk.java.net/~kevinw/8144483/webrev.00/
>> JBS link: https://bugs.openjdk.java.net/browse/JDK-8144483
>> Bug brief:
>> A long pause is observed after each gc log rotation in Solaris.
>> Problem Identified:
>> In each GC log rotation "print_memory_info()" is getting called
>> through dump_loggc_header().
>> "print_memory_info()" of solaris version calls check_addr0(st);
>> File:
>> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/80959a760b85/src/os/solaris/vm/os_solaris.cpp
>> void os::print_memory_info(outputStream* st) {
>>   st->print("Memory:");
>>   st->print(" %dk page", os::vm_page_size()>>10);
>>   st->print(", physical " UINT64_FORMAT "k", os::physical_memory()>>10);
>>   st->print("(" UINT64_FORMAT "k free)", os::available_memory() >> 10);
>>   st->cr();
>>  (void) check_addr0(st);
>> }
>> Now check_addr0(st) function do a lot of read call to read the data
>> from /proc/self/map.
>> and check if virtual address is mapped to 0x0. These read calls take
>> lot of time which results in GC rotation pause.
>> Here calling check_addr0() seems unnecessary for every log rotation.
>> It will be more logical if this function gets called only
>> when an error is reported.
>> Solution proposed:
>> Before GC log rotation print_memory_info() is ever getting called from
>> Vm_error.cpp during error reporting. And in case of error reporting
>> checking for address mapping to 0x0 looks fine. So the proposed
>> solution is to do an extra check inside print_memory_info().
>> -  (void) check_addr0(st);
>> +  if (VMError::fatal_error_in_progress()){
>> +     (void) check_addr0(st);
>> +  }
>>  }
>>  This check doesn't fit well inside printing function, but at this
>> point I don't see the need to create a new os:: method and change all
>> the OS classes just for that check.
>> Regards,
>> Cheleswer

More information about the hotspot-runtime-dev mailing list