RFR (S) 8181143: Introduce diagnostic flag to abort VM on too long VM operations

Aleksey Shipilev shade at redhat.com
Fri Nov 16 16:30:34 UTC 2018



SafepointTimeout is useful for discovering long or stuck safepoint syncs, but it is just as
important to discover long or stuck VM operations. This patch complements the timeout machinery by
tracking the VM operations themselves. Among other things, this allows terminating the VM when a
very long VM operation is blocking progress. High-availability users would appreciate a fail-fast
JVM; in fact, the original prototype was done at the request of Apache Ignite developers.

Example with -XX:+VMOperationTimeout -XX:VMOperationTimeoutDelay=100 -XX:+AbortVMOnVMOperationTimeout:

[3.117s][info][gc,start] GC(2) Pause Young (Normal) (G1 Evacuation Pause)
[3.224s][warning][vmthread] VM Operation G1CollectForAllocation took longer than 100 ms
# A fatal error has been detected by the Java Runtime Environment:
#  Internal Error (/home/sh/jdk-jdk/src/hotspot/share/runtime/vmThread.cpp:218), pid=2536, tid=2554
#  fatal error: VM Operation G1CollectForAllocation took longer than 100 ms
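
For readers who want the idea without opening the webrev, here is a minimal Java sketch of the
check, fed with the timestamps from the log above. It is not the HotSpot code: the class
OperationWatchdog and its begin/end/check methods are hypothetical names made up for illustration,
and the clock is passed in explicitly so the behavior is easy to follow.

```java
// Illustrative sketch only; hypothetical names, not the actual vmThread.cpp logic.
final class OperationWatchdog {
    private final long timeoutMillis;
    private String opName;       // null while no operation is running
    private long opStartMillis;

    OperationWatchdog(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called when an operation starts executing.
    synchronized void begin(String name, long nowMillis) {
        opName = name;
        opStartMillis = nowMillis;
    }

    // Called when the operation completes; disarms the watchdog.
    synchronized void end() {
        opName = null;
    }

    // Called periodically by a watcher. Returns a message if the running
    // operation has exceeded the timeout, or null otherwise.
    synchronized String check(long nowMillis) {
        if (opName == null) {
            return null;
        }
        long elapsed = nowMillis - opStartMillis;
        if (elapsed <= timeoutMillis) {
            return null;
        }
        return "VM Operation " + opName + " took longer than " + timeoutMillis + " ms";
    }
}
```

A caller that wants fail-fast behavior would treat a non-null result from check() as fatal, while
a caller that only wants diagnostics would log it and continue.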

Testing: hotspot/tier1, ad-hoc tests, jdk-submit (pending)

