SMP JNI issue, UseMembar workaround resolves it
David.Holmes at oracle.com
Sat Jun 11 00:04:16 PDT 2011
I'll try and take a deeper look at this but note that if a safepoint is
pending the thread is supposed to "crash" in
write_memory_serialize_page. The SEGV so generated should be handled by
the VM and take the thread to the safepoint. It seems the signal is not
being handled correctly. UseMembar will work around this by not using the
If you observed this SEGV under gdb then it may be a red herring, as gdb
is stopping the VM from handling the SEGV when it is actually an
When the real crash occurs what exactly gets reported?
Scott Valentine said the following on 06/11/11 15:58:
> We ran into an issue where our application would consistently crash with a
> segmentation violation after roughly 15 minutes to 90 minutes of runtime.
> It's not exactly a bug, but I thought it would be helpful to post the
> information here for other folks, and to hopefully support the great work
> of OpenJDK developers down the road.
> The quick details are that we consistently die without much error detail
> (just a simple segmentation violation printout) when our code enters JNI,
> does some stuff, and then calls back into the VM. The JNI_ENTRY fails when
> calling transition_from_native.
> The client application is running on an Asus Aspire-One netbook (Atom
> N270, dual core @800MHz) with OpenJDK-1.6.0-20-1.9.7. A gdb stack trace
> and jstack dump is attached for details on what is happening. More details
> on the system structure are included below for those interested, but
> basically it is a moderately threaded, JNI-intensive application running
> under the Equinox OSGi runtime.
> It was a little tough to debug, as the clients are remote and I have to go
> through multiple ssh back-doors. We initially suspected our JNI
> middleware, but after getting the necessary debugging symbols, tools, and
> builds in place, we found that it was always crashing on the
> write_memory_serialize_page call when attempting JNI_ENTRY after spending
> some time in the native code. It never even got to the point of referencing
> values like the VM env, jobject, etc. Anyhow, the source for the
> transition_from_native call led us to try the -XX:+UseMembar option which
> seems to have resolved the issue.
> Anyhow, I hope the trace info is helpful, and please let me know if I can
> provide more info. I can't spare a ton of cycles, but I would be happy to
> contribute as time permits.
> Here are the application details:
> As mentioned previously, the application is running in the Equinox OSGi
> framework, and it relies heavily on two JNI libraries: the RXTX library
> (2.1-7r2), and a middleware called opensplice DDS (5.4.1). Opensplice is a
> shared memory model runtime that runs as three separate processes, and has
> a JNI interface into the framework. The application has two serial devices
> (two RXTX threads), and we have a thread for each (two more threads) that
> does blocking reads on those ports. These threads put data into a
> BlockingQueue, which has another thread that takes data from the queue and
> processes it (two more threads). These threads process the data, make JNI
> calls into the DDS middleware (this is where the failures have, at least
> so far, always occurred), and put some information into another
> BlockingQueue. There are two other application threads (total of eight now).
> The first periodically writes to one of the serial ports. The other thread
> handles the second BlockingQueue and also makes JNI calls into the DDS
> middleware. Overall, there are three threads calling into that middleware.
> I think there are something like 20 threads total, but three are the JVM
> threads, and 7 or so are related to Equinox and our launcher that don't
> really do anything unless the system is starting or stopping or doing
> something in the OSGi world.
> Thanks, and again, I hope this info can be helpful to others.
> Scott Valentine
> Concentris Systems LLC
> Manoa Innovation Center, Suite #238
> 2800 Woodlawn Drive
> Honolulu, HI 96822
> (808) 988-6100
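
For readers trying to picture the thread layout described in the quoted
message, here is a minimal Java sketch of one reader -> processor -> handler
chain built on two BlockingQueues. It is an illustration only, not the
poster's code: the class name PipelineSketch, the publishToMiddleware method
(a pure-Java stand-in for the JNI call into the DDS middleware), the POISON
shutdown marker, and the use of an in-memory list in place of a blocking
serial read are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class PipelineSketch {
    // Hypothetical shutdown marker passed down the queues to stop each stage.
    private static final String POISON = "<<stop>>";

    // Hypothetical stand-in for the native (JNI) call into the middleware.
    static String publishToMiddleware(String sample) {
        return "published:" + sample;
    }

    // Runs one reader -> processor -> handler chain and returns what the
    // final handler thread received, in arrival order.
    static List<String> run(final List<String> portData) {
        final BlockingQueue<String> raw = new LinkedBlockingQueue<>();
        final BlockingQueue<String> processed = new LinkedBlockingQueue<>();
        final List<String> results = new ArrayList<>();

        // Stage 1: stands in for the thread doing blocking reads on a port.
        Thread reader = new Thread(() -> {
            for (String s : portData) {
                raw.add(s);
            }
            raw.add(POISON);
        }, "port-reader");

        // Stage 2: takes from the first queue, calls the "middleware",
        // and puts the result on the second queue.
        Thread processor = new Thread(() -> {
            try {
                while (true) {
                    String s = raw.take();
                    if (s.equals(POISON)) {
                        processed.put(POISON);
                        return;
                    }
                    processed.put(publishToMiddleware(s));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "processor");

        // Stage 3: the thread that services the second queue.
        Thread handler = new Thread(() -> {
            try {
                while (true) {
                    String s = processed.take();
                    if (s.equals(POISON)) {
                        return;
                    }
                    results.add(s);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "second-queue-handler");

        reader.start();
        processor.start();
        handler.start();
        try {
            reader.join();
            processor.join();
            handler.join(); // join makes the handler's writes to results visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return results;
    }
}
```

Calling PipelineSketch.run(Arrays.asList("a", "b")) pushes two samples
through all three threads. In the application described above, the second
and third stages are the ones that would cross into native code and back,
which is where the reported failures occurred.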