JVM hangs beyond recovery
Y. Srinivas Ramakrishna
y.s.ramakrishna at oracle.com
Fri Jun 18 18:19:22 PDT 2010
Hi David, Stas --
One would expect to see some other thread inside a libdl.so method
that is not waiting on the lock and which is calling back into the
JVM (and suspended), right? Such a thread (as David said) appears not
to be present. I'd suggest that someone look at the dll mutex to
see who the holder is (or whether, as Paul conjectured, a memory stomp
is blowing out its state, making it look like it's locked).
On Solaris, there's the LD_DEBUG option which might have helped
shed some light. One might try truss (or Linux's strace?) to
trace calls into libdl interfaces to see who might be taking
out the lock (or set a watchpoint on the lock?).
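As a rough illustration of the tracing idea above, here is a minimal C sketch that logs entry to and exit from the libdl interfaces together with the calling thread's id. The wrapper names traced_dlopen and traced_dlsym are hypothetical (they are not part of libdl or the JVM), the cast of pthread_t to unsigned long assumes Linux/glibc, and the sketch only covers call sites that can be recompiled to go through the wrappers.

```c
#include <dlfcn.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical wrapper: log which thread enters and leaves dlopen().
 * An "enter" line with no matching "leave" line points at a thread that
 * is still inside libdl and may be holding its internal lock. */
void *traced_dlopen(const char *path, int flags)
{
    /* Assumption: pthread_t is an integer type (true on Linux/glibc). */
    unsigned long tid = (unsigned long)pthread_self();

    fprintf(stderr, "[%#lx] enter dlopen(%s)\n",
            tid, path ? path : "(main program)");
    void *handle = dlopen(path, flags);
    fprintf(stderr, "[%#lx] leave dlopen -> %p\n", tid, handle);
    return handle;
}

/* Hypothetical wrapper: the same entry/exit logging for dlsym(). */
void *traced_dlsym(void *handle, const char *name)
{
    unsigned long tid = (unsigned long)pthread_self();

    fprintf(stderr, "[%#lx] enter dlsym(%s)\n", tid, name);
    void *sym = dlsym(handle, name);
    fprintf(stderr, "[%#lx] leave dlsym -> %p\n", tid, sym);
    return sym;
}
```

The thread id printed here is the pthread id, which is the value gdb shows as "Thread 0x..." in its thread list, so a stuck "enter" line can be matched against a stack dump of the hung process. On older glibc versions the program may need to be linked with -ldl.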
Are you running a standard version of the kernel or a non-standard
one? (not that we can help with that; but if it's some non-standard
kernel, you might want to talk with the appropriate linux-kernel
alias to help find the right tools; strace should begin to shed
some light, I am thinking.)
Stas Oskin wrote:
>> I can't say if this fixes it or not as I don't know how all the code hooks
>> together. But I'm somewhat surprised that this is related to JNI_OnLoad as
>> from what I saw the problem is caused by a hook executed by the dlopen while
>> the internal dl lock is held - I would not think that JNI_OnLoad could be
>> executed while inside dlopen.
>> David Holmes
> Thanks for the analysis.
> The fix indeed didn't prevent the deadlock, but while examining the file, I
> didn't see any dlopen calls there, only dlsym.
> Is this still consistent with your current thinking about the cause of the
> deadlock, or do you see another possible reason in the log file?
> Thanks again.
More information about the hotspot-runtime-dev mailing list