RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules

Ioi Lam ioi.lam at oracle.com
Fri Jul 6 00:45:29 UTC 2018

Hi Jiangli,

Thank you so much for working on this. I think it's great that we can 
get the
start-up improvement by archiving the ModuleDescriptor.

I just have some coding style comments regarding heapShared.cpp. This file
contains the code for coping objects and relocating pointers. By its nature,
this kind of code is usually complicated, so I think we should try to make
it as easy to understand as possible.

[1] HeapShared::walk_from_field_and_archiving:

     This name is not grammatically correct. How about

[2] How about changing the parameter field_offset -> static_field_offset
     When I first read the code I was confused whether it's talking
     about static or instance fields. Usually, "field"
     implies instance field, so it's better to specifically
     say "static field".

[3] This code would fail if "f" is already archived.

     473   // get the archived copy of the field referenced object
     474   oop af = MetaspaceShared::archive_heap_object(f, THREAD);
     475   WalkOopAndArchiveClosure walker(1, subgraph_info, f, af);
     476   f->oop_iterate(&walker);

[4] There's duplicated code between walk_from_field_and_archiving and

     403   assert(relocated_k == 
     404          "must be the relocated Klass in the shared space");
     405   _subgraph_info->add_subgraph_object_klass(orig_k, relocated_k);

     - vs -

     484   assert(relocated_k == 
     485          "must be the relocated Klass in the shared space");
     486   subgraph_info->add_subgraph_object_klass(orig_k, relocated_k);

[5] This code  is also duplicated:

     375   RawAccess<IS_NOT_NULL>::oop_store(new_p, archived);
     376   log.print("--- archived copy existing, store archived " 
     377             p2i(archived), p2i(new_p));

     - vs -

     395  RawAccess<IS_NOT_NULL>::oop_store(new_p, archived);
     396  log.print("=== store archived " PTR_FORMAT " in " PTR_FORMAT,
     397            p2i(archived), p2i(new_p));

[6] This code, even though it's correct, is hard to understand --
     why are we calculating the distance between the two objects?

     368  size_t delta = pointer_delta((HeapWord*)_archived_referencing_obj,
     369 (HeapWord*)_orig_referencing_obj);
     370  T* new_p = (T*)((HeapWord*)p + delta);

     I thin it would be easier to understand if we change the order of the
     two arithmetic operations:

     // new_p is the address of the same field inside 
     size_t field_offset_in_bytes = pointer_delta(p, 
_orig_referencing_obj, 1);
     T* new_p = (T*)(address(_orig_referencing_obj) + 

[7] I have a hard time understand this log:

     376   log.print("--- archived copy existing, store archived " 
     377             p2i(archived), p2i(new_p));

     How about this?

     log.print("--- updated embedded pointer @[" PTR_FORMAT "] => " 
               p2i(new_p), p2i(archived));

For your consideration, I've incorporated my comments above into 
I've not tested it so it most likely won't build :-(

http://cr.openjdk.java.net/~iklam/misc/heapShared.old.cpp  [your version]
http://cr.openjdk.java.net/~iklam/misc/heapShared.new.cpp  [my version]

Please take a look and see if you like it.

- Ioi

On 6/28/18 4:15 PM, Jiangli Zhou wrote:
> This is a follow-up RFE of JDK-8201650 (Move iteration order randomization of unmodifiable Set and Map to iterators), which was resolved to allow Set/Map objects being archived at CDS dump time (thanks Claes and Stuart Marks). In the current RFE, it archives the set of system ModuleReference and ModuleDescriptor objects (including their referenced objects) in 'open' archive heap region at CDS dump time. It allows reusing of the objects and bypassing the process of creating the system ModuleDescriptors and ModuleReferences at runtime for startup improvement. My preliminary measurements on linux-x64 showed ~5% startup improvement when running HelloWorld from -cp using archived module objects at runtime (without extra tuning).
> The library changes in the following webrev are contributed by Alan Bateman. Thanks Alan and Mandy for discussions and help. Thanks Karen, Lois and Ioi for discussion and suggestions on initialization ordering.
> The majority of the module object archiving code are in heapShared.hpp and heapShared.cpp. Thanks Coleen for pre-review and Eric Caspole for helping performance tests.
> webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev.00/
> RFE: https://bugs.openjdk.java.net/browse/JDK-8202035?filter=14921
> Tested using tier1 - tier6 via mach5 including all new test cases added in the webrev.
> Following are the details of system module archiving, which are duplicated in above bug report.
> ---------------------------------------------------------------------------------------------------------------------------
> Support archiving system module graph when the initial module is unnamed module from -cp currently.
> Support G1 GC, 64-bit (non-Windows). Requires UseCompressedOops and UseCompressedClassPointers.
> Dump time system module object archiving
> =================================
> At dump time, the following fields in ArchivedModuleGraph are set to record the system module information created by ModuleBootstrap for archiving.
>   private static SystemModules archivedSystemModules;
>   private static ModuleFinder archivedSystemModuleFinder;
>   private static String archivedMainModule;
> The archiving process starts from a given static field in ArchivedModuleGraph class instance (java mirror object). The process archives the complete network of java heap objects that are reachable directly or indirectly from the starting object by following references.
> 1. Starts from a given static field within the Class instance (java mirror). If the static field is a refererence field and points to a non-null java object, proceed to the next step. The static field and it's value is recorded and stored outside the archived mirror.
> 2. Archives the referenced java object. If an archived copy of the current object already exists, updates the pointer in the archived copy of the referencing object to point to the current archived object. Otherwise, proceed to the next step.
> 3. Follows all references within the current java object and recursively archive the sub-graph of objects starting from each reference encountered within the object.
> 4. Updates the pointer in the archived copy of referecing object to point to the current archived object.
> 5. The Klass of the current java object is added to a list of Klasses for loading and initializing before any object in the archived graph can be accessed at runtime.
> Runtime initialization from archived system module objects
> ============================================
> VM.initializeFromArchive(<class>) is called from ArchivedModuleGraph's static initializer to initialize from the archived module information. Klasses in the recorded list are loaded, linked and initialized. The static fields in ArchivedModuleGraph class instance are initialized using the archived field values. After initialization, the archived system module objects can be used directly.
> If the archived java heap data is not successfully mapped at runtime, or there is an error during VM.initializeFromArchive(), then all static fields in ArchivedModuleGraph are not initialized. In that case, system ModuleDescriptor and ModuleReference objects are created as normal.
> In non-CDS mode, VM.initializeFromArchive(<class>) returns immediately with minimum added overhead for normal execution.
> Thanks,
> Jiangli

More information about the hotspot-runtime-dev mailing list