RFR: 8006423 SA: NullPointerException in sun.jvm.hotspot.debugger.bsd.BsdThread.getContext(BsdThread.java:67)
staffan.larsen at oracle.com
Thu Jan 17 11:48:04 PST 2013
This is a request for review of a fix for SA on OS X.
The bug report contains a detailed description of the problem and the proposed solution. I have copied some of that text below. To verify the fix I have manually run jstack on OS X.
In many cases when running the SA version of JStack (or other SA tools) an NPE is thrown in BsdThread.getContext(). The underlaying cause is that SA fails to read the context of the thread in the native method getThreadIntegerRegisterSet0() (thread_get_state returns an error).
The following is my understanding of what the cause is and a suggestion for a fix - my experience with OS X is a bit limited so I may be off on some details.
thread_get_state() takes a thread_t as a parameter. The value of this parameter comes from SA reading the value of the OSThread._thread_id field in the Hotspot process being debugged. This value is set in HotSpot to ::mach_thread_self() which is documented as "The mach_thread_self system call returns the calling thread's thread port."
My theory is that this "thread port" in not valid when a different process calls thread_get_state(). Instead, the other process (SA in this case) needs it's own "thread port" for the thread it wants to access.
There is a way to list all the thread ports in a different process (or "task" as they are called in Mach) via the task_threads() function.
So now we have the thread ports, we just need to correlate them with the C++ Thread objects in the Hotspot process. One way to do this correlation is via the stack pointer. We can get the current value of the stack pointer (rsp) in SA and look through all the Thread objects to see which one the stack pointer belongs to (by looking at Thread._stack_base and Thread._stack_size).
Another way seems to be to use the thread_info() function with the THREAD_IDENTIFIER_INFO parameter. This gives us a struct which has a field called thread_id. The comment for this field in the thread_info.h file says "system-wide unique 64-bit thread id". The value for this thread_id is the same when called from Hotspot and when called from the debugging process (SA), so this looks like a way to do the correlation.
This requires Hotspot to store this value in OSThread and SA to first list all the "thread ports", then find the thread_id for each one and select the right "thread port" for the thread we are looking for.
Using a thread_id provided by the system seems more reliable than using the stack pointer for correlation.
More information about the hotspot-runtime-dev