A Bug Of Metaspace After Full GC

JC Beyler jcbeyler at google.com
Tue Sep 25 20:22:04 UTC 2018


I did say I was not a GC expert :-)

So the question would be: is the patch I showed based on Nijiaben's patch
worth fixing and is it right? It seems like the right thing to do for at
least fixing SerialGC printouts, no?
Jc

Ps: here is the patch for simplifying things for everyone then:
diff -r 8f66a57054b7 src/hotspot/share/gc/shared/genCollectedHeap.cpp
--- a/src/hotspot/share/gc/shared/genCollectedHeap.cpp  Tue Sep 25 10:30:32
2018 -0700
+++ b/src/hotspot/share/gc/shared/genCollectedHeap.cpp  Tue Sep 25 12:50:54
2018 -0700
@@ -649,7 +649,6 @@
     complete = complete || collected_old;

     print_heap_change(young_prev_used, old_prev_used);
-    MetaspaceUtils::print_metaspace_change(metadata_prev_used);

     // Adjust generation sizes.
     if (collected_old) {
@@ -665,6 +664,7 @@
       MetaspaceGC::compute_new_size();
       update_full_collections_completed();
     }
+    MetaspaceUtils::print_metaspace_change(metadata_prev_used);

     // Track memory usage and detect low memory after GC finishes
     MemoryService::track_memory_usage();

On Tue, Sep 25, 2018 at 1:08 PM Krystal Mok <rednaxelafx at gmail.com> wrote:

> Hi JC,
>
> Just a quick comment on:
>
> So I looked at the other ones as well and as a summary:
>   - The bug seems to show up for Serial, CMS
>   - The bug does not seem to show up for Parallel, G1
>
> That's because Serial (DefNew) / Serial Old (Tenured / GenMarkSweep) /
> ParNew / CMS use the "generational GC framework" that's been there for
> years. That's the "GenCollectedHeap" touched by Nijiaben's propsed patch.
> It was there with the idea that different GC implementations can plug into
> the same framework and supposedly they could mix-and-match. But later on
> only a few possible configurations were really tested well and supported.
>
> Parallel (ParallelScavenge) / Parallel Old and G1 use their own
> implementation of the CollectedHeap interface and doesn't participate in
> that "generational GC framework". So they don't have the same bug at the
> same place.
>
> - Kris
>
> On Tue, Sep 25, 2018 at 12:57 PM, JC Beyler <jcbeyler at google.com> wrote:
>
>> Hi Nijiaben,
>>
>> I'm not a GC engineer but thought I would look at this by curiosity (for
>> note, I also created https://bugs.openjdk.java.net/browse/JDK-8211123
>> for tracking purposes). It seems you are using an older JDK because the
>> code has changed somewhat there.
>>
>> I looked at this and used your reproducer on JDK12 tip:
>>
>> Using CMS (which is being deprecated from what I know and see in the log
>> message):
>> $
>> ~/mercurial/jdk12-gc-print/build/linux-x86_64-server-release/images/jdk/bin/java
>> -XX:+UseConcMarkSweepGC -Xmx1000M -Xms1000M -Xlog:gc* ClassLoaderTest
>> OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was
>> deprecated in version 9.0 and will likely be removed in a future release.
>> ...
>> [0.559s][info][gc,metaspace   ] GC(0) Metaspace: 15267K->15267K(1067008K)
>>
>> (Note: I am using -Xlog:gc* since -XX:+PrintGCDetails is being deprecated
>> as well from what the log message is saying)
>>
>> So the bug is there for CMS, but for G1, it seems to work:
>>
>> $
>> ~/mercurial/jdk12-gc-print/build/linux-x86_64-server-release/images/jdk/bin/java
>> -Xmx1000M -Xms1000M -Xlog:gc* ClassLoaderTest
>> ...
>> [0.653s][info][gc,metaspace   ] GC(0) Metaspace: 15266K->5266K(1062912K)
>> ...
>>
>> So I looked at the other ones as well and as a summary:
>>   - The bug seems to show up for Serial, CMS
>>   - The bug does not seem to show up for Parallel, G1
>>
>>
>> I then updated the code like what you were showing though things have
>> changed in tip so my change became a simple one-line move:
>>
>> diff -r 8f66a57054b7 src/hotspot/share/gc/shared/genCollectedHeap.cpp
>> --- a/src/hotspot/share/gc/shared/genCollectedHeap.cpp  Tue Sep 25
>> 10:30:32 2018 -0700
>> +++ b/src/hotspot/share/gc/shared/genCollectedHeap.cpp  Tue Sep 25
>> 12:50:54 2018 -0700
>> @@ -649,7 +649,6 @@
>>      complete = complete || collected_old;
>>
>>      print_heap_change(young_prev_used, old_prev_used);
>> -    MetaspaceUtils::print_metaspace_change(metadata_prev_used);
>>
>>      // Adjust generation sizes.
>>      if (collected_old) {
>> @@ -665,6 +664,7 @@
>>        MetaspaceGC::compute_new_size();
>>        update_full_collections_completed();
>>      }
>> +    MetaspaceUtils::print_metaspace_change(metadata_prev_used);
>>
>>      // Track memory usage and detect low memory after GC finishes
>>      MemoryService::track_memory_usage();
>>
>> With this, I looked at the various GC results and this fixes:
>>   - Serial, CMS; while keeping Parallel and G1 working (seemingly)
>>
>> Finally, note that all GC logging now are logging the summary at the exit
>> and this line shows up:
>> [0.560s][info][gc,heap,exit   ]  Metaspace       used 5277K, capacity
>> 5310K, committed 15232K, reserved 1062912K
>>
>> For all GCs, so though the line "Metaspace before -> after" is false, you
>> can get the the after with the final line.
>>
>> If the GC experts on this list want me to make a webrev out of this and
>> get reviews for it, I can do that.
>>
>> Hoping that helps,
>> Jc
>>
>>
>>
>> On Fri, Sep 21, 2018 at 11:56 PM nijiaben <nijiaben at perfma.com> wrote:
>>
>>> Hi, all
>>>
>>> We use metaspace instead of perm since JDK8, I found that after the Full
>>> GC, the size of the metaspace has always been the same from the GC log. I
>>> have looked at the source code of hotspot roughly. It has not bean fixed in
>>> the latest version. The problem which may have some impact on us to
>>> troubleshoot some metaspace problems.
>>>
>>> I wrote a test case, and simply analyzed the hotspot code, and showed
>>> the output before and after the modification. I hope this problem can be
>>> fixed as soon as possible.
>>>
>>> Demo:
>>>
>>> import java.io.File;
>>> import java.net.URL;
>>> import java.net.URLClassLoader;
>>>
>>> /**
>>>  *
>>>  * @author nijiaben
>>>  * @version v1.0
>>>  * @createTime 2018年09月22日 12:32:32 PM
>>>  *
>>>  */
>>> public class ClassLoaderTest {
>>>     private static URL[] urls = new URL[1];
>>>
>>>     static {
>>>         try {
>>>             File jarFile = new File("/home/admin/.m2/repository/org/slf4j/slf4j-api/1.7.21/slf4j-api-1.7.21.jar");
>>>             urls[0] = jarFile.toURI().toURL();
>>>         } catch (Exception e) {
>>>             e.printStackTrace();
>>>         }
>>>     }
>>>
>>>     public static void main(String args[]) {
>>>         for(int i=0;i<1000;i++) {
>>>             loadClass();
>>>         }
>>>         System.gc();
>>>     }
>>>
>>>     public static void loadClass() {
>>>         try {
>>>             URLClassLoader ucl = new URLClassLoader(urls);
>>>             Class.forName("org.slf4j.Logger", false, ucl);
>>>         } catch (Exception e) {
>>>             e.printStackTrace();
>>>         }
>>>     }
>>> }
>>>
>>> My demo is very simple, just constantly create a new class loader to
>>> load a specific class, after loading, execute a System GC, so that we can
>>> see the relevant GC log output, the JVM parameters we run are
>>>
>>> -Xmx1000M -Xms1000M -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails ClassLoaderTest
>>>
>>> The GC log is
>>>
>>> [Full GC (System.gc()) [CMS: 0K->9921K(683264K), 0.1003988 secs] 49083K->9921K(989952K), [Metaspace: 12916K->12916K(1064960K)], 0.1060065 secs] [Times: user=0.06 sys=0.05, real=0.11 secs]
>>>
>>> Let's look at the source code of hotspot,
>>>
>>> GenCollectedHeap::do_collection:
>>>
>>> if (PrintGCDetails) {
>>>       print_heap_change(gch_prev_used);
>>>
>>>       // Print metaspace info for full GC with PrintGCDetails flag.
>>>       if (complete) {
>>>         MetaspaceAux::print_metaspace_change(metadata_prev_used);
>>>       }
>>>     }
>>>
>>>     for (int j = max_level_collected; j >= 0; j -= 1) {
>>>       // Adjust generation sizes.
>>>       _gens[j]->compute_new_size();
>>>     }
>>>
>>>     if (complete) {
>>>       // Delete metaspaces for unloaded class loaders and clean up loader_data graph
>>>       ClassLoaderDataGraph::purge();
>>>       MetaspaceAux::verify_metrics();
>>>       // Resize the metaspace capacity after full collections
>>>       MetaspaceGC::compute_new_size();
>>>       update_full_collections_completed();
>>>     }
>>>
>>> We see that metaspace's gc log output is executed before
>>> ClassLoaderDataGraph::purge(); , that means the size of the metaspace
>>> is printed without reclaiming the memory of the metaspace, so the GC log
>>> seems no change before and after the GC metaspace.
>>>
>>> After knowing the specific problem, we can make an adjustment to the
>>> code to move the logic of the output of metaspace size after to
>>> ClassLoaderDataGraph::purge(),just like this:
>>>
>>>     if (PrintGCDetails) {
>>>       print_heap_change(gch_prev_used);
>>>     }
>>>
>>>     for (int j = max_level_collected; j >= 0; j -= 1) {
>>>       // Adjust generation sizes.
>>>       _gens[j]->compute_new_size();
>>>     }
>>>
>>>     if (complete) {
>>>       // Delete metaspaces for unloaded class loaders and clean up loader_data graph
>>>       ClassLoaderDataGraph::purge();
>>>       MetaspaceAux::verify_metrics();
>>>       // Resize the metaspace capacity after full collections
>>>       MetaspaceGC::compute_new_size();
>>>       update_full_collections_completed();
>>>         if (PrintGCDetails) {
>>>             // Print metaspace info for full GC with PrintGCDetails flag.
>>>             MetaspaceAux::print_metaspace_change(metadata_prev_used);
>>>         }
>>>     }
>>>
>>> At this point we recompile hotspot, execute the above demo again, you
>>> can see the correct output as follows
>>>
>>> [Full GC (System.gc()) [CMS: 0K->9921K(683264K), 0.0852627 secs] 49083K->9921K(989952K), [Metaspace: 12913K->3280K(1058816K)], 0.0891207 secs] [Times: user=0.05 sys=0.04, real=0.09 secs]
>>>
>>> I also probably thought about why this problem is caused. This is the
>>> version before JDK8, such as JDK7, you can see the same code location:
>>>
>>> if (PrintGCDetails) {
>>>       print_heap_change(gch_prev_used);
>>>
>>>       // Print perm gen info for full GC with PrintGCDetails flag.
>>>       if (complete) {
>>>         print_perm_heap_change(perm_prev_used);
>>>       }
>>>     }
>>>
>>>     for (int j = max_level_collected; j >= 0; j -= 1) {
>>>       // Adjust generation sizes.
>>>       _gens[j]->compute_new_size();
>>>     }
>>>
>>>     if (complete) {
>>>       // Ask the permanent generation to adjust size for full collections
>>>       perm()->compute_new_size();
>>>       update_full_collections_completed();
>>>     }
>>>
>>> It can be seen that the GC engineer at that time simply replaced
>>> print_perm_heap_change(perm_prev_used);with
>>> MetaspaceAux::print_metaspace_change(metadata_prev_used);but forgot the
>>> difference between Perm and Metaspace.
>>>
>>> Thanks,
>>>
>>> nijiaben
>>>
>>
>>
>> --
>>
>> Thanks,
>> Jc
>>
>
>

-- 

Thanks,
Jc
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20180925/ef24ecd1/attachment.html>


More information about the hotspot-gc-dev mailing list