RFR: JDK-8148992: VM can hang on exit if root region scanning is initiated but not executed

Bengt Rutisson bengt.rutisson at oracle.com
Mon Feb 8 14:14:13 UTC 2016


Hi again,

Based on some internal feedback from Thomas I have an updated webrev:
http://cr.openjdk.java.net/~brutisso/8148992/webrev.02/

Changes are:

- Changed the assert I added so that it only applies if we would
actually do any root region scanning.
- Moved the notification on the root region lock into a private helper
method that is now used in both places where this code was duplicated.
- Removed an obsolete comment in g1CollectedHeap.cpp
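
As a rough illustration of the idea behind the shared helper and
cancel_scan() (this is not the HotSpot code: the class, the helper name
and the demo function are made up for the sketch, and std::mutex and
std::condition_variable stand in for the VM's own lock):

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the root region state; not the real class.
class RootRegionsSketch {
  std::mutex _lock;
  std::condition_variable _cv;
  bool _scan_in_progress = false;

  // Shared private helper: clear the flag and wake up every waiter.
  void notify_scan_done() {
    {
      std::lock_guard<std::mutex> guard(_lock);
      _scan_in_progress = false;
    }
    _cv.notify_all();
  }

public:
  // Called when a scan has been requested (initiated).
  void prepare_for_scan() {
    std::lock_guard<std::mutex> guard(_lock);
    _scan_in_progress = true;
  }

  // Normal path: the scan was actually executed.
  void scan_finished() { notify_scan_done(); }

  // Shutdown path: the scan will never run, so release the waiters anyway.
  void cancel_scan() { notify_scan_done(); }

  // Called by a GC that must not start while a scan is pending.
  void wait_until_scan_finished() {
    std::unique_lock<std::mutex> guard(_lock);
    _cv.wait(guard, [this] { return !_scan_in_progress; });
  }

  bool scan_in_progress() {
    std::lock_guard<std::mutex> guard(_lock);
    return _scan_in_progress;
  }
};

// Simulates shutdown: a "GC" thread waits for the scan while the "marking"
// thread gives up and cancels the scan instead of executing it.
bool shutdown_with_cancel_terminates() {
  RootRegionsSketch roots;
  roots.prepare_for_scan();  // a scan was initiated
  std::thread gc([&roots] { roots.wait_until_scan_finished(); });
  std::thread marker([&roots] { roots.cancel_scan(); });  // early-exit path
  marker.join();
  gc.join();  // without cancel_scan() this join would block forever
  assert(!roots.scan_in_progress());
  return true;
}
```

If the marker thread in the sketch exits without calling cancel_scan(),
the waiting thread never wakes up, which corresponds to the hang
described in the bug.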

Here's a diff compared to the first version:
http://cr.openjdk.java.net/~brutisso/8148992/webrev.00-02.diff/

(The comment change is only in the full webrev; I didn't get it into the
diff.)

Thanks,
Bengt


On 2016-02-08 11:15, Bengt Rutisson wrote:
>
> Hi all,
>
> Could I have a couple of reviews for this change?
>
> http://cr.openjdk.java.net/~brutisso/8148992/webrev.00
> https://bugs.openjdk.java.net/browse/JDK-8148992
>
> There are some more details in the bug report, but here's the most 
> relevant text:
>
> The reason for the hang is that during shutdown we don't check whether
> root region scanning has been initiated but not yet executed.
>
> The ConcurrentMark loop starts like this:
>
>   while (!_should_terminate) {
>     // wait until started is set.
>     sleepBeforeNextCycle();
>     if (_should_terminate) {
>       break;
>     }
>
> If _should_terminate is true, we just exit without notifying any
> waiters on the root region lock. If a GC happens during shutdown, it
> will hang waiting for the root region scanning to finish, but the
> ConcurrentMark thread has already exited and will never do any root
> region scanning.
>
> I can trigger this behavior by adding a sleep in the above code:
>
>   while (!_should_terminate) {
>     // wait until started is set.
>     sleepBeforeNextCycle();
>     if (_should_terminate) {
>       for (int i = 0; i < 10; i++) {
>         os::naked_short_sleep(999);
>       }
>       break;
>     }
>
> and running this small Java program:
>
> import java.util.LinkedList;
>
> public class Repro2 {
>
>     public static LinkedList<byte[]> dummyStore = new LinkedList<>();
>
>     public static void main(String[] args) throws Exception {
>         System.out.println("Started");
>         for (int i = 0; i < 1024*16; i++) {
>             dummyStore.add(new byte[1024]);
>         }
>         System.out.println("Triggered one YC");
>
>         Thread thread = new Thread(()->System.exit(0));
>         thread.start();
>         Thread.sleep(100);
>
>         for (int i = 0; i < 1024*16; i++) {
>             dummyStore.add(new byte[1024]);
>         }
>         System.out.println("Triggered Initial mark");
>
>         System.gc(); // Full GC
>
>         System.out.println("Done.");
>     }
> }
>
>
> Running with the sleep added and the following command line:
>
> java -Xmx16m -Xmx64m -XX:InitiatingHeapOccupancyPercent=0 Repro2
>
> makes the VM hang every time on my workstation.
>
> If I add a "cancel_scan()" method and call it before the
> ConcurrentMark thread gives up, the VM does not hang anymore. That
> is, running with this code makes the VM sleep for a while during
> shutdown, but it does not hang:
>
>
>   while (!_should_terminate) {
>     // wait until started is set.
>     sleepBeforeNextCycle();
>     if (_should_terminate) {
>       for (int i = 0; i < 10; i++) {
>         os::naked_short_sleep(999);
>       }
>       _cm->root_regions()->cancel_scan();
>       break;
>     }


