Code review request: three native memory tracking related bugs

Zhengyu Gu at
Tue Jul 17 06:40:44 PDT 2012
> src/share/vm/services/memSnapshot.cpp
> I don't understand your memory-management strategy in the MemSnapshot 
> code. In the constructor you don't check if any of the allocations are 
> successful. You check for null and bypass the action in some places ( 
> eg merge() wrt _staging_area), but assert that it is not null in 
> others (eg. promote()), and don't check at all in others (eg 
> print_snapshot_stats()). It is far from clear if those latter 
> functions will never be reached if some of those values are null. Also 
> if _lock is null, you will simply not lock! - which doesn't seem right.
As a requirement, NMT is a secondary functionality, which means that it 
should not cause VM to exit when it encounters error.

In MemTracker::start() method, it uses 'nothrow new' to instantiate 
MemSnapshot and performs out of memory checking, which checks 
MemSnapshot object and all allocations inside constructor (snapshot == 
NULL || snapshot->out_of_memory()), if OOM is encountered, it shuts down 
NMT. In short, when NMT state >= started, snapshot lock != NULL.

merge() merges recorders into _staging_area and _staging_area is 
expandable, so OOM checking is needed.  promote() promotes _staging_area 
into snapshot, can only be reached when staging phase is successfully 
completed, and _staging_area is read-only during promotion, that's why 
OOM checking on _staging_area is not necessary, while any insertion 
failures to snapshot indicates OOM condition, that also results shutting 
down NMT.

Again, if _lock == null, NMT can never transition to started state, so 
locking can never happen.

> It is difficult to gauge the full implications of using 
> ThreadCritical. I would not be surprised is there is a potential 
> problem with assertion failures while holding the ThreadCritical. But 
> this is little different to other debug option combinations that can 
> cause secondary failures/hangs during error reporting.
ThreadCritical is also used to block none JavaThread, if it is the case, 
we have to rethink what lock is proper.



> David
> -----
>> Thanks,
>> -Zhengyu
>> On 7/13/2012 3:43 PM, Zhengyu Gu wrote:
>>> 7181989: NMT ON: Assertion failure when NMT checks thread's native
>>> CR:
>>> Webrev:
>>> We try to assert Thread's stack base to ensure that
>>> Thread::record_stack_base_and_size() to record native stack to NMT,
>>> but there is scenario that the thread fails to start, which no native
>>> stack is created and should not be asserted.
>>> 7181986: NMT ON: Assertion failure when running jdi
>>> ExpiredRequestDeletionTest
>>> CR:
>>> Webrev:
>>> This is a racing condition when C/C++ runtime exit handler is ran
>>> before NMT worker thread exits. The exit handler calls _query_lock's
>>> destructor while NMT worker thread is still holding it. The fix is to
>>> make _query_lock a heap object, instead of static object, but the
>>> drawback is that, it does not seem that _query_lock can be safely
>>> deleted.
>>> Also, I reassigned MemSnaphot lock and query lock, so they participate
>>> in the deadlock detection logic.
>>> 7182543: NMT ON: Aggregate a few NMT related bugs
>>> CR
>>> Webrev:
>>> - Fixed generations_in_used calculation
>>> - Wait MemRecorder instance count to drop to zero before completely
>>> shutdown NMT
>>> - Added assertion for JavaThread in _thread_blocked state.
>>> Thanks,
>>> -Zhengyu

More information about the hotspot-dev mailing list