RFR: JDK-8148992: VM can hang on exit if root region scanning is initiated but not executed

Jesper Wilhelmsson jesper.wilhelmsson at oracle.com
Mon Feb 8 14:50:11 UTC 2016


Looks good!
/Jesper

On 8/2/16 at 15:14, Bengt Rutisson wrote:
>
> Hi again,
>
> Based on some internal feedback from Thomas I have an updated webrev:
> http://cr.openjdk.java.net/~brutisso/8148992/webrev.02/
>
> Changes are:
>
> - Changed the assert I added to only apply if we would actually do any root
> region scanning
> - Moved the notification on the root region lock into a private helper method
> that can be used in both places where this code was duplicated.
> - Removed an obsolete comment in g1CollectedHeap.cpp
>
> Here's a diff compared to the first version:
> http://cr.openjdk.java.net/~brutisso/8148992/webrev.00-02.diff/
>
> (The comment change is only in the full webrev; I didn't get it into the diff.)
>
> Thanks,
> Bengt
>
>
> On 2016-02-08 11:15, Bengt Rutisson wrote:
>>
>> Hi all,
>>
>> Could I have a couple of reviews for this change?
>>
>> http://cr.openjdk.java.net/~brutisso/8148992/webrev.00
>> https://bugs.openjdk.java.net/browse/JDK-8148992
>>
>> There are some more details in the bug report, but here's the most relevant text:
>>
>> The reason for the hang is that during shutdown we don't check whether root
>> region scanning is still pending.
>>
>> The ConcurrentMark loop starts like this:
>>
>>   while (!_should_terminate) {
>>     // wait until started is set.
>>     sleepBeforeNextCycle();
>>     if (_should_terminate) {
>>       break;
>>     }
>>
>> If _should_terminate is true we just exit without notifying any waiters on the
>> root region lock. If a GC happens during shutdown the GC will hang waiting for
>> the root region scanning to finish but the ConcurrentMark thread has just
>> exited and will not do any root region scanning.
>>
>> I can trigger this behavior by adding a sleep in the above code:
>>
>>   while (!_should_terminate) {
>>     // wait until started is set.
>>     sleepBeforeNextCycle();
>>     if (_should_terminate) {
>>       for (int i = 0; i < 10; i++) {
>>         os::naked_short_sleep(999);
>>       }
>>       break;
>>     }
>>
>> and running this small java program:
>>
>> import java.util.LinkedList;
>>
>> public class Repro2 {
>>
>>     public static LinkedList<byte[]> dummyStore = new LinkedList<>();
>>
>>     public static void main(String[] args) throws Exception {
>>         System.out.println("Started");
>>         for (int i = 0; i < 1024*16; i++) {
>>             dummyStore.add(new byte[1024]);
>>         }
>>         System.out.println("Triggered one YC");
>>
>>         Thread thread = new Thread(()->System.exit(0));
>>         thread.start();
>>         Thread.sleep(100);
>>
>>         for (int i = 0; i < 1024*16; i++) {
>>             dummyStore.add(new byte[1024]);
>>         }
>>         System.out.println("Triggered Initial mark");
>>
>>         System.gc(); // Full GC
>>
>>         System.out.println("Done.");
>>     }
>> }
>>
>>
>> Running with the sleep added and the following command line:
>>
>> java -Xmx16m -Xmx64m -XX:InitiatingHeapOccupancyPercent=0 Repro2
>>
>> makes the VM hang every time on my workstation.
>>
>> If I add a "cancel_scan()" method and call it before the ConcurrentMark thread
>> gives up, the VM no longer hangs. That is, running with this code
>> makes the VM sleep a while during shutdown but it does not hang:
>>
>>
>>   while (!_should_terminate) {
>>     // wait until started is set.
>>     sleepBeforeNextCycle();
>>     if (_should_terminate) {
>>       for (int i = 0; i < 10; i++) {
>>         os::naked_short_sleep(999);
>>       }
>>       _cm->root_regions()->cancel_scan();
>>       break;
>>     }
>
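
For readers who want to see the mechanism outside the VM, here is a minimal
Java sketch of the pattern described in the quoted mail. It is only a model
under stated assumptions, not the HotSpot code: the names ShutdownHangSketch,
RootRegionsModel, prepareForScan, notifyScanDone, and waitUntilScanFinished
are hypothetical, and the waiter uses a timeout so the demo can report a
would-be hang instead of actually hanging.

```java
import java.util.concurrent.TimeUnit;

public class ShutdownHangSketch {

    /** Hypothetical stand-in for the root region scanning state (not HotSpot code). */
    static class RootRegionsModel {
        private final Object lock = new Object(); // plays the role of the root region lock
        private boolean scanInProgress = false;

        /** Models the point where a root region scan is initiated. */
        void prepareForScan() {
            synchronized (lock) {
                scanInProgress = true;
            }
        }

        /** Shared private helper: clear the flag and wake every waiter. */
        private void notifyScanDone() {
            synchronized (lock) {
                scanInProgress = false;
                lock.notifyAll();
            }
        }

        /** Normal path: the scan ran to completion. */
        void scanFinished() { notifyScanDone(); }

        /** Shutdown path: the scan will never run, so release the waiters. */
        void cancelScan() { notifyScanDone(); }

        /** Returns true if the scan finished or was cancelled before the timeout. */
        boolean waitUntilScanFinished(long timeoutMillis) throws InterruptedException {
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            synchronized (lock) {
                while (scanInProgress) {
                    long remaining = deadline - System.nanoTime();
                    if (remaining <= 0) {
                        return false; // the real VM has no timeout here: it would hang
                    }
                    TimeUnit.NANOSECONDS.timedWait(lock, remaining);
                }
                return true;
            }
        }
    }

    /** Models a GC that waits for a scan after the marking thread has exited. */
    static boolean gcCompletes(boolean cancelOnTerminate) {
        RootRegionsModel rootRegions = new RootRegionsModel();
        rootRegions.prepareForScan();
        // The marking thread sees the terminate request and exits without scanning.
        Thread marker = new Thread(() -> {
            if (cancelOnTerminate) {
                rootRegions.cancelScan();
            }
        });
        marker.start();
        try {
            marker.join();
            return rootRegions.waitUntilScanFinished(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("without cancel: " + gcCompletes(false));
        System.out.println("with cancel: " + gcCompletes(true));
    }
}
```

Running main prints "without cancel: false" followed by "with cancel: true".
The first line corresponds to the hang (nobody ever clears the flag or
notifies), the second to the behavior with the cancel call on the terminate
path. Routing both scanFinished and cancelScan through one private helper
mirrors the refactoring mentioned in the webrev.02 change list.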

