RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive

Jiangli Zhou jiangli.zhou at oracle.com
Fri Aug 11 18:29:48 UTC 2017


Thank you, Coleen!

Jiangli

> On Aug 11, 2017, at 11:22 AM, coleen.phillimore at oracle.com wrote:
> 
> 
> These incremental changes look good to me.
> Thanks,
> Coleen
> 
> On 8/10/17 7:51 PM, Jiangli Zhou wrote:
>> Thanks, Ioi!
>> 
>> Jiangli
>> 
>>> On Aug 10, 2017, at 2:15 PM, Ioi Lam <ioi.lam at oracle.com <mailto:ioi.lam at oracle.com>> wrote:
>>> 
>>> Hi Jiangli,
>>> 
>>> 
>>> The changes look good to me. Thanks for considering my suggestions.
>>> 
>>> 
>>> - Ioi
>>> 
>>> On 8/8/17 5:33 PM, Jiangli Zhou wrote:
>>>> Here is the incremental webrev that has all the changes incorporated with suggestions from Coleen, Ioi and Thomas:
>>>> 
>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03.inc/ <http://cr.openjdk.java.net/%7Ejiangli/8179302/webrev.hotspot.03.inc/>
>>>> 
>>>> Updated full webrev: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03/ <http://cr.openjdk.java.net/%7Ejiangli/8179302/webrev.hotspot.03/>
>>>> 
>>>> Thanks again for Coleen's, Ioi's and Thomas’ review!
>>>> Jiangli
>>>> 
>>>>> On Aug 7, 2017, at 7:57 PM, Jiangli Zhou <jiangli.zhou at Oracle.COM <mailto:jiangli.zhou at Oracle.COM>> wrote:
>>>>> 
>>>>> Hi Ioi,
>>>>> 
>>>>> Thanks for getting back to me.
>>>>> 
>>>>>> On Aug 7, 2017, at 5:45 PM, Ioi Lam <ioi.lam at oracle.com <mailto:ioi.lam at oracle.com>> wrote:
>>>>>> 
>>>>>> On 8/4/17 10:19 PM, Jiangli Zhou wrote:
>>>>>>> Hi Ioi,
>>>>>>> 
>>>>>>> Thanks for looking again.
>>>>>>> 
>>>>>>>> On Aug 4, 2017, at 2:22 PM, Ioi Lam <ioi.lam at oracle.com <mailto:ioi.lam at oracle.com>> wrote:
>>>>>>>> 
>>>>>>>> Hi Jiangli,
>>>>>>>> 
>>>>>>>> The code looks good in general. I just have a few pet peeves for readability:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> (1) stringTable.cpp and metaspaceShared.cpp have the same asserts
>>>>>>>> 
>>>>>>>>  704   assert(UseG1GC, "Only support G1 GC");
>>>>>>>>  705   assert(UseCompressedOops && UseCompressedClassPointers,
>>>>>>>>  706          "Only support UseCompressedOops and UseCompressedClassPointers enabled");
>>>>>>>> 
>>>>>>>> 1615   assert(UseG1GC, "Only support G1 GC");
>>>>>>>> 1616   assert(UseCompressedOops && UseCompressedClassPointers,
>>>>>>>> 1617          "Only support UseCompressedOops and UseCompressedClassPointers enabled");
>>>>>>>> 
>>>>>>>> Maybe it's better to combine them into a single function like MetaspaceShared::assert_vm_flags() so they don't get out of sync?
>>>>>>> 
>>>>>>> There is a MetaspaceShared::allow_archive_heap_object(), which checks for UseG1GC, UseCompressedOops and UseCompressedClassPointers combined. It does not seem to worth add another separate API for asserting the required flags. I’ll use that in the assert. 
>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> (2) FileMapInfo::write_archive_heap_regions()
>>>>>>>> 
>>>>>>>> I still find this code very hard to read, especially due to the loop.
>>>>>>>> 
>>>>>>>> First, the comments are not consistent with the code:
>>>>>>>> 
>>>>>>>>     498   assert(arr_len <= max_num_regions, "number of memory regions exceeds maximum");
>>>>>>>> 
>>>>>>>> but the comments says: "The rest are consecutive full GC regions" which means there's a chance for max_num_regions to be more than 2 (which will be the case with Calvin's java-loader dumping changes using very small heap size). So the code is actually wrong.
>>>>>>> 
>>>>>>> The max_num_regions is the maximum number of region for each archived heap space (the string space,                                                 or open archive space). We only run into the case where the MemRegion array size is larger than max_num_regions with Calvin’s pending change. As part of Calvin’s change, he will change the assert into a check and bail out if the number of MemRegions are larger than max_num_regions due to heap fragmentation.
>>>>>>> 
>>>>>>> 
>>>>>> Your latest patch assumes that arr_len <= 2, but the implementation of G1CollectedHeap::heap()->begin_archive_alloc_range() / G1CollectedHeap::heap()->end_archive_alloc_range() actually allows more than 2 regions to returned. So simply putting an assert there seems risky (unless you have analyzed all possible scenarios to prove that's impossible).
>>>>>> 
>>>>>> Instead of trying to come up with a complicated proof, I think it's much safer to disable the archived string region if the arr_len > 2. Also, if the string region is disabled, you should also disable the open_archive_heap_region
>>>>>> 
>>>>>> I think this is a general issue with the mapped heap regions, and it just happens to be revealed by Calvin's patch. So we should fix it now and not wait for Calvin's patch.
>>>>> 
>>>>> Ok. I’ll change the assert to be a check.
>>>>> 
>>>>>> 
>>>>>> 
>>>>>>>> 
>>>>>>>> The word "region" is used in these parameters, but they don't mean the same thing.
>>>>>>>> 
>>>>>>>>               GrowableArray<MemRegion> *regions
>>>>>>>>               int first_region, int max_num_regions,
>>>>>>>> 
>>>>>>>> 
>>>>>>>> How about regions      -> g1_regions_list
>>>>>>>>           first_region -> first_region_in_archive
>>>>>>> 
>>>>>>> The GrowableArray above is the MemRegions that GC code gives back to us. The GC code combines multiple G1 regions. The comments probably are over-explaining the details, which are hidden in the GC code. Probably that’s the confusing source. I’ll make the comment more clear. 
>>>>>>> 
>>>>>>> Using g1_regions_list would also be confusing, since write_archive_heap_regions does not handle G1 regions directly. It processes the MemRegion array that GC code returns. How about changing ‘regions’ to ‘mem_regions’ or ‘archive_regions'?
>>>>>>> 
>>>>>> How about heap_regions? These are regions in the active Java heap, which current has not mapped anything from the CDS archive.
>>>>> 
>>>>> Ok.
>>>>> 
>>>>> I’m updating my changes and will send out a consolidated webrev.
>>>>> 
>>>>> Thanks!
>>>>> Jiangli
>>>>> 
>>>>>> 
>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> In the comments, I find the phrase 'the current archive heap region' ambiguous. It could be (erroneously) interpreted as "a region from the currently mapped archive”
>>>>>>> 
>>>>>>>> 
>>>>>>>> To make it unambiguous, how about changing
>>>>>>>> 
>>>>>>>> 
>>>>>>>>  464 // Write the current archive heap region, which contains one or multiple GC(G1) regions.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> to
>>>>>>>> 
>>>>>>>>     // Write the given list of G1 memory regions into the archive, starting at 
>>>>>>>>     // first_region_in_archive.
>>>>>>> 
>>>>>>> 
>>>>>>> Ok. How about the following:
>>>>>>> 
>>>>>>> // Write the given list of java heap memory regions into the archive, starting at 
>>>>>>> // first_region_in_archive.
>>>>>>> 
>>>>>> Sounds good.
>>>>>> 
>>>>>> Thanks
>>>>>> - Ioi
>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Also, for the explanation of how the G1 regions are written into the archive, how about:
>>>>>>>> 
>>>>>>>>    // The G1 regions in the list are sorted in ascending address order. When there are more objects
>>>>>>>>    // than the capacity of a single G1 region, the bottom-most G1 region may be partially filled, and the
>>>>>>>>    // remaining G1 region(s) are consecutively allocated and fully filled.
>>>>>>>>    //
>>>>>>>>    // Thus, the bottom-most G1 region (if not empty) is written into first_region_in_archive.
>>>>>>>>    // The remaining G1 regions (if exist) are coalesced and written as a single block
>>>>>>>>    // into (first_region_in_archive + 1) 
>>>>>>>> 
>>>>>>>>    // Here's the mapping from (g1 regions) -> (archive regions).
>>>>>>>> 
>>>>>>>> 
>>>>>>>> All this function needs to do is to decide the values for 
>>>>>>>> 
>>>>>>>>      r0_start, r0_top
>>>>>>>>      r1_start, r1_top
>>>>>>>> 
>>>>>>>> I think it would be much better to not use the loop, and not use the max_num_regions parameter (it's always 2 anyway).
>>>>>>>> 
>>>>>>>>      *r0_start = *r0_top = NULL;
>>>>>>>>      *r1_start = *r1_top = NULL;
>>>>>>>> 
>>>>>>>>      if (arr_len >= 1) {
>>>>>>>>          *r0_start = regions->at(0).start();
>>>>>>>>          *r0_end = *r0_start + regions->at(0).byte_size();
>>>>>>>>      }
>>>>>>>>      if (arr_len >= 2) {
>>>>>>>>          int last = arr_len - 1;
>>>>>>>>          *r1_start = regions->at(1).start();
>>>>>>>>          *r1_end = regions->at(last).start() + regions->at(last).byte_size();
>>>>>>>>       }
>>>>>>>> 
>>>>>>>> what do you think?
>>>>>>> 
>>>>>>> We need to write out all archive regions including the empty ones. The loop using max_num_regions is the easiest way. I’d like to remove the code that deals with r0_* and r1_ explicitly. Let me try that.
>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> (3) metaspace.cpp
>>>>>>>> 
>>>>>>>> 3350         // Map the archived heap regions after compressed pointers
>>>>>>>> 3351         // because it relies on compressed class pointers setting to work
>>>>>>>> 
>>>>>>>> do you mean this?
>>>>>>>> 
>>>>>>>>     // Archived heap regions depend on the parameters of compressed class pointers, so
>>>>>>>>     // they must be mapped after such parameters have been decided in the above call.
>>>>>>> 
>>>>>>> Hmmm, maybe use ‘arguments’ instead of ‘parameters’?
>>>>>>>  
>>>>>>>> 
>>>>>>>> 
>>>>>>>> (4) I found this name not strictly grammatical. How about this:
>>>>>>>> 
>>>>>>>>      allow_archive_heap_object -> is_heap_object_archiving_allowed
>>>>>>> 
>>>>>>> Ok.
>>>>>>> 
>>>>>>>> 
>>>>>>>> (5) in most of your code, 'archive' is used as a noun, except in StringTable::archive_string() where it's used as a verb.
>>>>>>>> 
>>>>>>>> archive_string could also be interpreted erroneously as "return a string that's already in the archive". So to be consistent and unambiguous, I think it's better to rename it to StringTable::create_archived_string()
>>>>>>> 
>>>>>>> Ok.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Jiangli
>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks
>>>>>>>> - Ioi
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On 8/3/17 5:15 PM, Jiangli Zhou wrote:
>>>>>>>>> Here are the updated webrevs.
>>>>>>>>> 
>>>>>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.02/ <http://cr.openjdk.java.net/%7Ejiangli/8179302/webrev.hotspot.02/>
>>>>>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.02/ <http://cr.openjdk.java.net/%7Ejiangli/8179302/webrev.whitebox.02/>
>>>>>>>>> 
>>>>>>>>> Changes in the updated webrevs include:
>>>>>>>>> Merge with Ioi’s recent shared space auto-sizing change (8072061)
>>>>>>>>> Addressed all feedbacks from Ioi and Coleen (Thanks for detailed review!)
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Jiangli
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On Aug 1, 2017, at 5:29 PM, Jiangli Zhou <jiangli.zhou at Oracle.COM> <mailto:jiangli.zhou at Oracle.COM> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi Ioi,
>>>>>>>>>> 
>>>>>>>>>> Thank you so much for reviewing this. I’ve addressed all your feedbacks. Please see details below. I’ll updated the webrev after addressing Coleen’s comments.
>>>>>>>>>> 
>>>>>>>>>>> On Jul 30, 2017, at 9:07 PM, Ioi Lam <ioi.lam at oracle.com> <mailto:ioi.lam at oracle.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hi Jiangli,
>>>>>>>>>>> 
>>>>>>>>>>> Here are my comments. I've not reviewed the GC code and I'll leave that to the GC experts :-)
>>>>>>>>>>> 
>>>>>>>>>>> stringTable.cpp: StringTable::archive_string
>>>>>>>>>>> 
>>>>>>>>>>>    add assert for DumpSharedSpaces only
>>>>>>>>>> 
>>>>>>>>>> Ok.
>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> filemap.cpp
>>>>>>>>>>> 
>>>>>>>>>>> 525 void FileMapInfo::write_archive_heap_regions(GrowableArray<MemRegion> *regions,
>>>>>>>>>>> 526                                              int first_region, int num_regions) {
>>>>>>>>>>> 
>>>>>>>>>>> When I first read this function, I found it hard to follow, especially this part that coalesces the trailing regions:
>>>>>>>>>>> 
>>>>>>>>>>> 537         int len = regions->length();
>>>>>>>>>>> 538         if (len > 1) {
>>>>>>>>>>> 539           start = (char*)regions->at(1).start();
>>>>>>>>>>> 540           size = (char*)regions->at(len - 1).end() - start;
>>>>>>>>>>> 541         }
>>>>>>>>>>> 542       }
>>>>>>>>>>> 
>>>>>>>>>>> The rest of filemap.cpp always perform identical operations on MemRegion arrays, which are either 1                                                           or 2 in size. However, this function doesn't follow that pattern; it also has a very different notion of "region", and the confusing part is regions->size() is not the same as num_regions.
>>>>>>>>>>> 
>>>>>>>>>>> How about we change the API to something like the following? Before calling this API, the caller needs to coalesce the trailing G1 regions into a single MemRegion.
>>>>>>>>>>> 
>>>>>>>>>>>     FileMapInfo::write_archive_heap_regions(MemRegion *regions, int first, int num_regions) {
>>>>>>>>>>>        if (first == MetaspaceShared::first_string) {
>>>>>>>>>>>           assert(num_regons <=  MetaspaceShared::max_strings, "...");
>>>>>>>>>>>        } else {
>>>>>>>>>>>           assert(first == MetaspaceShared::first_open_archive_heap_region, "...");
>>>>>>>>>>>           assert(num_regons <= MetaspaceShared::max_open_archive_heap_region, "...");
>>>>>>>>>>> }
>>>>>>>>>>>        ....
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> I’ve reworked the function and simplified the code. 
>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 756   if (!string_data_mapped) {
>>>>>>>>>>> 757     StringTable::ignore_shared_strings(true);
>>>>>>>>>>> 758     assert(string_ranges == NULL && num_string_ranges == 0, "sanity");
>>>>>>>>>>> 759   }
>>>>>>>>>>> 760
>>>>>>>>>>> 761   if (open_archive_heap_data_mapped) {
>>>>>>>>>>> 762     MetaspaceShared::set_open_archive_heap_region_mapped();
>>>>>>>>>>> 763   } else {
>>>>>>>>>>> 764     assert(open_archive_heap_ranges == NULL && num_open_archive_heap_ranges == 0, "sanity");
>>>>>>>>>>> 765   }
>>>>>>>>>>> 
>>>>>>>>>>> Maybe the two "if" statements should be more consistent? Instead of StringTable::ignore_shared_strings, how about StringTable::set_shared_strings_region_mapped()?
>>>>>>>>>> 
>>>>>>>>>> Fixed.
>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> FileMapInfo::map_heap_data() --
>>>>>>>>>>> 
>>>>>>>>>>> 818     char* addr = (char*)regions[i].start();
>>>>>>>>>>> 819     char* base = os::map_memory(_fd, _full_path, si->_file_offset,
>>>>>>>>>>> 820                                 addr, regions[i].byte_size(), si->_read_only,
>>>>>>>>>>> 821                                 si->_allow_exec);
>>>>>>>>>>> 
>>>>>>>>>>> What happens when the first region succeeds to map but the second region fails to map? Will both regions be unmapped? I don't see where you store the return value (base) from os::map_memory(). Does it mean the code assumes that (addr == base). If so, we need an assert here.
>>>>>>>>>> 
>>>>>>>>>> If any of the region fails to map, we bail out and call dealloc_archive_heap_regions(), which handles the deallocation of any regions specified. If second region fails to map, all memory ranges specified by                                                           ‘regions’ array are deallocated. We don’t unmap the memory here since it is part of the java heap. Unmapping of heap memory are handled by GC code. The ‘if’ check below makes sure base == addr.
>>>>>>>>>> 
>>>>>>>>>>     if (base == NULL || base != addr) {
>>>>>>>>>>       // dealloc the regions from java heap
>>>>>>>>>>       dealloc_archive_heap_regions(regions, region_num);
>>>>>>>>>>       if (log_is_enabled(Info, cds)) {
>>>>>>>>>>         log_info(cds)("UseSharedSpaces: Unable to map at required address in java heap.");
>>>>>>>>>>       }
>>>>>>>>>>       return false;
>>>>>>>>>>     }
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> constantPool.cpp
>>>>>>>>>>> 
>>>>>>>>>>>     Handle refs_handle;
>>>>>>>>>>>     ...
>>>>>>>>>>>     refs_handle = Handle(THREAD, (oop)archived);
>>>>>>>>>>> 
>>>>>>>>>>> This will first create a NULL handle, then construct a temporary handle, and then assign the temp handle back to the null handle. This means two handles will be pushed onto THREAD->metadata_handles()
>>>>>>>>>>> 
>>>>>>>>>>> I think it's more efficient if you merge these into a single statement
>>>>>>>>>>> 
>>>>>>>>>>>     Handle refs_handle(THREAD, (oop)archived);
>>>>>>>>>> 
>>>>>>>>>> Fixed.
>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Is this experimental code? Maybe it should be removed?
>>>>>>>>>>> 
>>>>>>>>>>> 664     if (tag_at(index).is_unresolved_klass()) {
>>>>>>>>>>> 665 #if 0
>>>>>>>>>>> 666       CPSlot entry = cp->slot_at(index);
>>>>>>>>>>> 667       Symbol* name = entry.get_symbol();
>>>>>>>>>>> 668       Klass* k = SystemDictionary::find(name, NULL, NULL, THREAD);
>>>>>>>>>>> 669       if (k != NULL) {
>>>>>>>>>>> 670         klass_at_put(index, k);
>>>>>>>>>>> 671       }
>>>>>>>>>>> 672 #endif
>>>>>>>>>>> 673     } else
>>>>>>>>>> 
>>>>>>>>>> Removed.
>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> cpCache.hpp:
>>>>>>>>>>> 
>>>>>>>>>>>     u8   _archived_references
>>>>>>>>>>> 
>>>>>>>>>>> shouldn't this be declared as an narrowOop to avoid the type casts when it's used?
>>>>>>>>>> 
>>>>>>>>>> Ok. 
>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> cpCache.cpp:
>>>>>>>>>>> 
>>>>>>>>>>>    add assert so that one of these is used only at dump time and the other only at run time?
>>>>>>>>>>> 
>>>>>>>>>>> 610 oop ConstantPoolCache::archived_references() {
>>>>>>>>>>> 611   return oopDesc::decode_heap_oop((narrowOop)_archived_references);
>>>>>>>>>>> 612 }
>>>>>>>>>>> 613
>>>>>>>>>>> 614 void ConstantPoolCache::set_archived_references(oop o) {
>>>>>>>>>>> 615   _archived_references = (u8)oopDesc::encode_heap_oop(o);
>>>>>>>>>>> 616 }
>>>>>>>>>> 
>>>>>>>>>> Ok.
>>>>>>>>>> 
>>>>>>>>>> Thanks!
>>>>>>>>>> 
>>>>>>>>>> Jiangli
>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Thanks!
>>>>>>>>>>> - Ioi
>>>>>>>>>>> 
>>>>>>>>>>> On 7/27/17 1:37 PM, Jiangli Zhou wrote:
>>>>>>>>>>>> Sorry, the mail didn’t handle the rich text well. I fixed the format below.
>>>>>>>>>>>> 
>>>>>>>>>>>> Please help review the changes for JDK-8179302 (Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive). Currently, the CDS archive can contain cached class metadata, interned java.lang.String objects. This RFE adds the constant pool ‘resolved_references’ arrays (hotspot specific) to the archive for startup/runtime performance enhancement. The ‘resolved_references' arrays are used to hold references of resolved constant pool entries including Strings, mirrors, etc. With the 'resolved_references’ being cached, string constants in shared classes can now be resolved to existing interned java.lang.Strings at CDS dump time. G1 and 64-bit platforms are required.
>>>>>>>>>>>> 
>>>>>>>>>>>> The GC changes in the RFE were discussed and guided by Thomas Schatzl and GC team. Part of the changes were contributed by Thomas himself.
>>>>>>>>>>>> RFE:        https://bugs.openjdk.java.net/browse/JDK-8179302 <https://bugs.openjdk.java.net/browse/JDK-8179302>
>>>>>>>>>>>> hotspot:   http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/ <http://cr.openjdk.java.net/%7Ejiangli/8179302/webrev.hotspot.01/>
>>>>>>>>>>>> whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/ <http://cr.openjdk.java.net/%7Ejiangli/8179302/webrev.whitebox.01/>
>>>>>>>>>>>> 
>>>>>>>>>>>> Please see below for details of supporting cached ‘resolved_references’ and pre-resolving string constants.
>>>>>>>>>>>> 
>>>>>>>>>>>> Types of Pinned G1 Heap Regions
>>>>>>>>>>>> 
>>>>>>>>>>>> The pinned region type is a super type of all archive region types, which include the open archive type and the closed archive type.
>>>>>>>>>>>> 
>>>>>>>>>>>> 00100 0 [ 8] Pinned Mask
>>>>>>>>>>>> 01000 0 [16] Old Mask
>>>>>>>>>>>> 10000 0 [32] Archive Mask
>>>>>>>>>>>> 11100 0 [56] Open Archive:   ArchiveMask | PinnedMask | OldMask
>>>>>>>>>>>> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Pinned Regions
>>>>>>>>>>>> 
>>>>>>>>>>>> Objects within the region are 'pinned', which means GC does not move any live objects. GC scans and marks objects in the pinned region as normal, but skips forwarding live objects. Pointers in live objects are updated. Dead objects (unreachable) can be collected and freed.
>>>>>>>>>>>> 
>>>>>>>>>>>> Archive Regions
>>>>>>>>>>>> 
>>>>>>>>>>>> The archive types are sub-types of 'pinned'. There are two types of archive region currently, open archive and closed archive. Both can support caching java heap objects via the CDS archive.
>>>>>>>>>>>> 
>>>>>>>>>>>> An archive region is also an old region by design.
>>>>>>>>>>>> 
>>>>>>>>>>>> Open Archive (GC-RW) Regions
>>>>>>>>>>>> 
>>>>>>>>>>>> Open archive region is GC writable. GC scans & marks objects within the region and adjusts (updates) pointers in live objects the same way as a pinned region. Live objects (reachable) are pinned and not forwarded by GC.
>>>>>>>>>>>> Open archive region does not have 'dead' objects. Unreachable objects are 'dormant' objects. Dormant objects are not collected and freed by GC.
>>>>>>>>>>>> 
>>>>>>>>>>>> Adjustable Outgoing Pointers
>>>>>>>>>>>> 
>>>>>>>>>>>> As GC can adjust pointers within the live objects in open archive heap region, objects can have outgoing pointers to another java heap region, including closed archive region, open archive region, pinned (or humongous) region, and normal generational region. When a referenced object is moved by GC, the pointer within the open archive region is updated accordingly.
>>>>>>>>>>>> 
>>>>>>>>>>>> Closed Archive (GC-RO) Regions
>>>>>>>>>>>> 
>>>>>>>>>>>> The closed archive region is GC read-only region. GC cannot write into the region. Objects are not scanned and marked by GC. Objects are pinned and not forwarded. Pointers are not updated by GC either. Hence, objects within the archive region cannot have any outgoing pointers to another java heap region. Objects however can still have pointers to other objects within the closed archive regions (we might allow pointers to open archive regions in the future). That restricts the type of java objects that can be supported by the archive region.
>>>>>>>>>>>> In JDK 9 we support archive Strings with the archive regions.
>>>>>>>>>>>> 
>>>>>>>>>>>> The GC-readonly archive region makes java heap memory sharable among different JVM processes. NOTE: synchronization on the objects within the archive heap region can still cause writes to the memory page.
>>>>>>>>>>>> 
>>>>>>>>>>>> Dormant Objects
>>>>>>>>>>>> 
>>>>>>>>>>>> Dormant objects are unreachable java objects within the open archive heap region.
>>>>>>>>>>>> A java object in the open archive heap region is a live object if it can be reached during scanning. Some of the java objects in the region may not be reachable during scanning. Those objects are considered as dormant, but not dead. For example, a constant pool 'resolved_references' array is reachable via the klass root if its container klass (shared) is already loaded at the time during GC scanning. If a shared klass is not yet loaded, the klass root is not scanned and it's constant pool 'resolved_reference' array (A) in the open archive region is not reachable. Then A is a dormant object.
>>>>>>>>>>>> 
>>>>>>>>>>>> Object State Transition
>>>>>>>>>>>> 
>>>>>>>>>>>> All java objects are initially dormant objects when open archive heap regions are mapped to the runtime java heap. A dormant object becomes live object when the associated shared class is loaded at runtime. Explicit call to G1SATBCardTableModRefBS::enqueue() needs to be made when a dormant object becomes live. That should be the case for cached objects with strong roots as well, since strong roots are only scanned at the start of GC marking (the initial marking) but not during Remarking/Final marking. If a cached object becomes live during concurrent marking phase, G1 may not find it and mark it live unless a call to G1SATBCardTableModRefBS::enqueue() is made for the object.
>>>>>>>>>>>> 
>>>>>>>>>>>> Currently, a live object in the open archive heap region cannot become dormant again. This restriction simplifies GC requirement and guarantees all outgoing pointers are updated by GC correctly. Only objects for shared classes from the builtin class loaders (boot, PlatformClassLoaders, and AppClassLoaders) are supported for caching.
>>>>>>>>>>>> 
>>>>>>>>>>>> Caching Java Objects at Archive Dump Time
>>>>>>>>>>>> 
>>>>>>>>>>>> The closed archive and open archive regions are allocated near the top of the dump time java heap. Archived java objects are copied into the designated archive heap regions. For example, String objects and the underlying 'value' arrays are copied into the closed archive regions. All references to the archived objects (from shared class metadata, string table, etc) are set to the new heap locations. A hash table is used to keep track of all archived java objects during the copying process to make sure java object is not archived more than once if reached from different roots. It also makes sure references to the same archived object are updated using the same new address location.
>>>>>>>>>>>> 
>>>>>>>>>>>> Caching Constant Pool resolved_references Array
>>>>>>>>>>>> 
>>>>>>>>>>>> The 'resolved_references' is an array that holds references of resolved constant pool entries including Strings, mirrors and methodTypes, etc. Each loaded class has one 'resolved_references' array (in ConstantPoolCache). The 'resolved_references' arrays are copied into the open archive regions during dump process. Prior to copying the 'resolved_references' arrays, JVM iterates through constant pool entries and resolves all JVM_CONSTANT_String entries to existing interned Strings for all archived classes. When resolving, JVM only looks up the string table and finds existing interned Strings without inserting new ones. If a string entry cannot be resolved to an existing interned String, the constant pool entry remain as unresolved. That prevents memory waste                                                           if a constant pool string entry is never used at runtime.
>>>>>>>>>>>> 
>>>>>>>>>>>> All String objects referenced by the string table are copied first into the closed archive regions. The string table entry is updated with the new location when each String object is archived. The JVM updates the resolved constant pool string entries with the new object locations when copying the 'resolved_references' arrays to the open archive regions. References to the 'resolved_references' arrays in the ConstantPoolCache are also updated.
>>>>>>>>>>>> At runtime as part of ConstantPool::restore_unshareable_info() work, call G1SATBCardTableModRefBS::enqueue() to let GC know the 'resolved_references' is becoming live. A handle is created for the cached object and added to the loader_data's handles.
>>>>>>>>>>>> 
>>>>>>>>>>>> Runtime Java Heap With Cached Java Objects
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> The closed archive regions (the string regions) and open archive regions are mapped to the runtime java heap at the same offsets as the dump time offsets from the runtime java heap base.
>>>>>>>>>>>> 
>>>>>>>>>>>> Preliminary test execution and status:
>>>>>>>>>>>> 
>>>>>>>>>>>> JPRT: passed
>>>>>>>>>>>> Tier2-rt: passed
>>>>>>>>>>>> Tier2-gc: passed
>>>>>>>>>>>> Tier2-comp: passed
>>>>>>>>>>>> Tier3-rt: passed
>>>>>>>>>>>> Tier3-gc: passed
>>>>>>>>>>>> Tier3-comp: passed
>>>>>>>>>>>> Tier4-rt: passed
>>>>>>>>>>>> Tier4-gc: passed
>>>>>>>>>>>> Tier4-comp:6 jobs timed out, all other tests passed
>>>>>>>>>>>> Tier5-rt: one test failed but passed when running locally, all other tests passed
>>>>>>>>>>>> Tier5-gc: passed
>>>>>>>>>>>> Tier5-comp: running
>>>>>>>>>>>> hotspot_gc: two jobs timed out, all other tests passed
>>>>>>>>>>>> hotspot_gc in CDS mode: two jobs timed out, all other tests passed
>>>>>>>>>>>> vm.gc: passed
>>>>>>>>>>>> vm.gc in CDS mode: passed
>>>>>>>>>>>> Kichensink: passed
>>>>>>>>>>>> Kichensink in CDS mode: passed
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Jiangli
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20170811/c295d2c1/attachment.htm>


More information about the hotspot-gc-dev mailing list