Jon Masamitsu jon.masamitsu at oracle.com
Thu Sep 15 19:02:27 UTC 2016

In addition to Zhengyu's advice, turn on -XX:+PrintGCDetails (if not already
turned on) so we can see what GC is doing.
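
If it is easier to poll from inside the application than to collect logs, the standard java.lang.management API exposes cumulative collection counts and times per collector. This is only a rough sketch (the class name GcSnapshot is made up for illustration), and it shows far less than the -XX:+PrintGCDetails log does:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: summarizes the collectors registered in this JVM.
class GcSnapshot {
    // Returns one line per collector: name, cumulative collection count,
    // and cumulative collection time in milliseconds.
    static String describe() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(String.format("%s: count=%d timeMs=%d%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Sampling this periodically alongside the NMT numbers may show whether the "class" growth resets after particular collections.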


On 9/15/2016 11:31 AM, Zhengyu Gu wrote:
> Hi Steven,
> On 09/15/2016 01:17 PM, Steven Schlansker wrote:
>> Hi hotspot-dev,
>> Hopefully I found an appropriate mailing list.  Let me know if I 
>> should be asking elsewhere.
>> We run OpenJDK 8u91 inside of Linux containers.  One of the 
>> challenges we've faced is
>> ensuring that the container memory limits don't kill our Java 
>> processes unexpectedly --
>> heap sizing is relatively easy, but there are a number of other regions 
>> the JVM uses that
>> aren't as easy to account for.
>> We use Native Memory Tracking and export the statistics. Plotting the 
>> "class committed" NMT
>> metric:
>> You'll notice what looks like a very slow memory leak.  We've 
>> confirmed with "-verbose:class" that we are
>> not loading many new classes - this graph starts a day after 
>> launch, so the application should have long
>> reached a relatively steady state.  The jump at 9/15 08:00 was due to 
>> classloading relating to attaching
>> JMX monitoring.  The long slow rise though we can't account for.
>> Eventually the application exceeds its container memory bound and is 
>> SIGKILLed by the kernel.
>> We are in the process of iteratively raising the limit, but it's 
>> unclear how large this class
>> space could grow.  We've observed some evidence that it can be GCed 
>> eventually, but it's not
>> clear what prompts it or how we'd encourage it to happen more often.
>> Here are our relevant JVM options:
>> -XX:+AlwaysPreTouch
>> -XX:MaxMetaspaceSize=64m
>> -XX:CompressedClassSpaceSize=32m
>> -XX:ReservedCodeCacheSize=64m
>> -XX:ParallelGCThreads=4
>> -XX:+UseConcMarkSweepGC
>> -XX:+DisableExplicitGC
>> -XX:NativeMemoryTracking=summary
>> -Xmx2100m
>> -Xms2100m
>> Notice that we set a 32m limit on "compressed class space size" -- 
>> which is apparently not the same as
>> "class" usage in NMT?  Or the limit isn't effective?
> NMT's "class" category counts many runtime data structures associated
> with classes, such as the hashtables used for class lookup, class
> loaders, etc.
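
To compare the NMT "class" figure with what the JVM itself reports for its class-related pools, the non-heap memory pools can be listed through java.lang.management. This is a sketch only: the class name NonHeapPools is made up, and the pool names you will see (for example "Metaspace" and "Compressed Class Space") are HotSpot-specific and may differ between JVMs and versions.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the JVM's non-heap memory pools.
class NonHeapPools {
    // Returns one line per non-heap pool with used/committed/max in KB.
    // A max of -1 means the pool reports no defined limit.
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.NON_HEAP) {
                continue;
            }
            MemoryUsage u = pool.getUsage();
            if (u == null) {
                continue; // pool is no longer valid
            }
            long maxKb = u.getMax() < 0 ? -1 : u.getMax() / 1024;
            lines.add(String.format("%s: used=%dK committed=%dK max=%dK",
                    pool.getName(), u.getUsed() / 1024, u.getCommitted() / 1024, maxKb));
        }
        return lines;
    }

    public static void main(String[] args) {
        describe().forEach(System.out::println);
    }
}
```

If the pools stay flat while NMT "class" keeps rising, the growth is presumably in the associated runtime structures rather than in the class metadata itself.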
>> The questions I am trying to answer:
>> * What causes this long slow rise of "class" usage?  It almost looks 
>> like a leak.
>> * How do we limit this native memory region?  We're trying to set 
>> absolute limits; we'd much prefer a
>>    Java OutOfMemoryError than a visit from the kernel OOM killer
> They are different types of OOM.
> A Java OutOfMemoryError is typically due to running out of Java heap
> space, which is 2100m in your case.
> I think the memory leak that you are talking about here is in native 
> memory, which is outside of the Java heap.
>> * It'd be nice to signal to the JVM "You have this much memory total, 
>> and not a byte more" and have
>>    the other tuneables set sensibly based on that value.  Maybe I'm 
>> dreaming.  This also isn't a question.
>> Any other tips on diagnosing this sort of issue also appreciated.
> To track down exactly where memory is leaking, you can use the detail 
> tracking option (-XX:NativeMemoryTracking=detail).
> After the application starts, use the NMT "baseline" command
> (jcmd <pid> VM.native_memory baseline) to establish an early memory
> baseline; then you can issue the detail.diff command periodically to
> see native memory activity over time.
> If there are memory leaks, you should be able to see that some call 
> sites have increasing memory allocations.
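
Since the statistics are already being exported, the per-category line of the NMT output can be picked out with a small parser. The sketch below assumes the category line looks like "Class (reserved=...KB, committed=...KB)", with an optional signed delta after each value in diff output. The class name NmtClassLine and the numbers in the usage note are made up for illustration, and the exact layout may vary between JDK versions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts committed KB (and the diff delta, if present)
// for one NMT category from the text of an NMT summary or diff report.
class NmtClassLine {
    // Returns {committedKb, deltaKb}, or null if the category line is not found.
    // deltaKb is 0 when the report carries no "+NKB"/"-NKB" delta.
    static long[] parse(String nmtOutput, String category) {
        Pattern p = Pattern.compile(
                "^-?\\s*" + Pattern.quote(category)
                        + " \\(reserved=(\\d+)KB(?: [+-]\\d+KB)?,"
                        + " committed=(\\d+)KB(?: ([+-]\\d+)KB)?\\)",
                Pattern.MULTILINE);
        Matcher m = p.matcher(nmtOutput);
        if (!m.find()) {
            return null;
        }
        long committed = Long.parseLong(m.group(2));
        long delta = m.group(3) == null ? 0L : Long.parseLong(m.group(3));
        return new long[] {committed, delta};
    }

    public static void main(String[] args) {
        // Illustrative input only, not taken from a real run.
        String sample = "-                     Class (reserved=1066093KB +2KB, committed=14189KB +130KB)";
        long[] r = parse(sample, "Class");
        System.out.println("committed=" + r[0] + "KB delta=" + r[1] + "KB");
    }
}
```

Feeding it the text captured from each periodic detail.diff run gives a per-category trend to watch next to the call-site data.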
> Hope this helps.
> Thanks,
> -Zhengyu
>> Thanks in advance,
>> Steven

