RFR(M): 8210754: print_location is not reliable enough (printing register info)

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Tue Sep 25 19:08:59 UTC 2018

Hi, I am sorry for the delay getting to this.


Can you assert is_error_reported() and add a comment that this is only 
to be called during error reporting?  I share David's concern about 
walking CLDG graph without a lock but I think the only place where we 
unlink items is limited and the timing window where you'll get into 
trouble is very limited.


I liked that that the guts of print_location() was moved ot 
vmErrorHelper but not sure anymore.

I think these functions belong in their appropriate classes:

  105 bool vmErrorHelper::is_valid_symbol(Symbol* s) {

This is actually wrong. The Symbol is not in metaspace.  I agree with 
Thomas's suggestion of testing validity and moving to symbol.hpp/cpp.

  122   return is_readable_range(bytes, bytes + len);

I kind of wonder if this is going too far.   Why not just ask 

  125 bool vmErrorHelper::is_valid_klass(Klass* k) {

I think this belongs in Klass also.  It's odd that is_valid_method is in 
Method but is_valid_klass is somewhere else.


+ } else if (Method::has_method_vptr((const void*)addr)) {

This should be changed to is_valid_method().  is_valid_klass() should be 
written similarly.

Isn't if (Metaspace::contains_non_shared(m))  a sufficient check for the 
Klass pointer, ie, like the code in Method::is_valid_method()?

  141 bool vmErrorHelper::is_valid_oop(oop obj) {

I expected is_valid_oop to be the top level verification and to load the 
klass pointer and ask is_valid_klass.  verify_oop does that. Maybe the 
CLDG iteration is warranted to see if the klass is still loaded.   I 
think using just using Universe::heap()->is_in_reserved() should be 
sufficient and you should avoid using block_start() and other things 
that might be broken with a crash.

  162   // Try to find addr using block_start.
  163   HeapWord* p = Universe::heap()->block_start(addr);

Maybe have is_valid_oop(oop o) in oop.cpp do some stuff for error 
reporting then call is_oop_or_null().

I've always wanted the code blob printing code to be in codeBlob.cpp.

So I'd rather this code be distributed to the classes to implement the 
validity checks, so that similar code can be accounted for.

I appreciate the hardening of error reporting.  It's long overdue!


On 9/25/18 12:52 PM, Doerr, Martin wrote:
> Hi Thomas,
> thanks for the detailed review and looking again at my change.
> Making the checks DEBUG_ONLY is indeed not a good idea. We want better hs_err files in product.
> I'd like to wait for other opinions before doing large renaming or refactoring changes, ideally from Coleen if she can find some time.
> But thanks for many small improvement proposals. I've changed the following things:
> - Add null check in ClassLoaderDataGraph::is_valid.
> - Removed inclusion of "memory/metaspace/virtualSpaceNode.hpp" from metaspace.hpp.
> - Use a constant minimum_page_size = 4 * K for the is_readable_range check loop and to sort out NULL or small integer values for class or oop checks.
> - Avoid bytewise iteration for range check loops.
> - Removed some unnecessary uintptr casts.
> - Removed is_range_in_committed_c_heap which was only used once (and the name was indeed misleading).
> - Changed is_oop_nor_null to directly check for valid oop first.
> - Overworked alignment checks (also using sizeof(MetaWord)).
> - Added missing include.
> - Added more verbose version for uncompressed klass pointer.
> Please note that dump_code_blob is just factored out and I don't want to change it in the same changeset because that wouldn't be diffable.
> Some of the functions may also be useful for debugging purposes, so they may be valuable as separate function though only used by one call site.
> New webrev:
> http://cr.openjdk.java.net/~mdoerr/8210754_print_location/webrev.01/
> I will need to inject errors to retest this changed version.
> Best regards,
> Martin

More information about the hotspot-runtime-dev mailing list