PING: RFC for new JEP: Reduce metaspace waste by dynamically merging and splitting metaspace chunks.

Coleen Phillimore coleen.phillimore at oracle.com
Tue Oct 25 12:01:11 UTC 2016



On 10/25/16 3:39 AM, Thomas Stüfe wrote:
> Hi Coleen,
>
> thank you for feedback and encouragement :) See further comments inline.
>
> On Mon, Oct 24, 2016 at 6:32 PM, Coleen Phillimore
> <coleen.phillimore at oracle.com> wrote:
>
>
>     Hi Thomas,
>
>     I agree with Erik.  If this works well for you, then it should
>     just be implemented without an option.   If done early in JDK10,
>     it'll get a lot of good testing.
>
>
> Ok. I was hoping to get this into JDK9, but thinking clearly
> about it, I see that the risk may be too large. So let's implement
> this in 10, and if it is stable and works well, it can be backported to
> jdk9, yes?

That seems a better plan.  Even though JDK9 slipped for jigsaw, we are 
keeping to the original FC date for everything else, so that it'll be 
very stable when it ships.

JDK10 should open soon.
>
>
>     This looks like a very good improvement.   We had discussed
>     coalescing blocks and other improvements like this early on, but
>     wanted to wait to see problems in the field to motivate the
>     changes.   We've seen these sorts of problems now too.
>
>     One of the things we've considered is using the
>     operating system's version of malloc for the chunks, so that it
>     can split and coalesce chunks for us. The Solaris malloc has
>     improved recently.  But I don't think it's time to change to
>     malloc'ed chunks yet, because we have to consider all of the
>     operating systems that we and the OpenJDK community support.
>
>
> I do not see how you could get CompressedClassPointers to work with
> native malloc. You would have to be sure that the pointers returned by
> malloc are within the numerical range for the 32-bit class pointers. I
> thought that was the reason for using a contiguous address range when
> allocating compressed class space.
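
The range constraint raised above can be sketched as follows. This is a minimal model of the address arithmetic, not HotSpot code: the shift value, the base address, and the function names are assumptions made for illustration only.

```python
# Illustrative model of compressed class pointer arithmetic; not HotSpot code.
NARROW_BITS = 32   # width of a compressed class pointer
SHIFT = 3          # assumed encoding shift (hypothetical; 8-byte alignment)


def can_encode(addr: int, base: int, shift: int = SHIFT) -> bool:
    """True if addr can be stored as a 32-bit offset from base."""
    offset = addr - base
    if offset < 0 or offset % (1 << shift) != 0:
        return False
    return (offset >> shift) < (1 << NARROW_BITS)


def encode(addr: int, base: int, shift: int = SHIFT) -> int:
    """Compress addr into a narrow value relative to base."""
    if not can_encode(addr, base, shift):
        raise ValueError("address outside the encodable range")
    return (addr - base) >> shift


def decode(narrow: int, base: int, shift: int = SHIFT) -> int:
    """Recover the full address from a narrow value."""
    return base + (narrow << shift)


BASE = 0x8_0000_0000             # hypothetical start of a reserved range
inside = BASE + 0x1000           # an address inside the reserved range
far_away = BASE + (1 << 40)      # e.g. wherever malloc happened to place a block

print(can_encode(inside, BASE))    # True
print(can_encode(far_away, BASE))  # False
```

An allocator that hands out addresses anywhere in the process address space gives no guarantee that `can_encode` holds, which is the reason for reserving one contiguous range up front.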

Oh, yes, you are right.  I thought we saw the too-many-small-chunks
problem only in the data metaspace, but this would help with class
metaspace as well.

Thanks,
Coleen

>
>     So, yes, I think the JEP looks good and your slides are absolutely
>     beautiful.  Everyone should see these slides.
>
>
> :) Thanks!
>
> So I will provide a prototype, for now based on jdk9, and we will see 
> how we go from there.
>
> Thanks, and Kind Regards,
>
> Thomas
>
>     Thanks,
>     Coleen
>
>
>
>     On 10/24/16 10:01 AM, Erik Helin wrote:
>
>         On 2016-10-13, Thomas Stüfe wrote:
>
>             Hi Erik,
>
>             On Thu, Oct 13, 2016 at 2:15 PM, Erik Helin
>             <erik.helin at oracle.com> wrote:
>
>                 Hi Thomas,
>
>                 thanks for submitting the JEP and proposing this feature!
>
>                 On 2016-10-10, Thomas Stüfe wrote:
>
>                     Hi all,
>
>                     May I please have some feedback for this
>                     enhancement proposal?
>
>                     https://bugs.openjdk.java.net/browse/JDK-8166690
>
>
>                     In one very short sentence it proposes a better
>                     allocation scheme for
>                     Metaspace Chunks in order to reduce fragmentation
>                     and metaspace waste.
>
>                     I also added a short presentation which describes
>                     the problem and how we
>                     solved it in our VM port.
>
>                     https://bugs.openjdk.java.net/secure/attachment/63894/Metaspace%20Coalescation%20in%20the%20SAP%20JVM.pdf
>
>                 Do we really need the flag -XX:+CoalesceMetaspace?
>                 Having two different ways to handle the chunk free
>                 lists in Metaspace is unfortunate: it might introduce
>                 hard-to-detect bugs and will also require much more
>                 testing (running lots of tests with the flag both on
>                 and off).
>
>             You are right. If the new allocator works well, there is
>             no reason to keep the old allocator around.
>
>             We wanted to be able to switch between the old and new
>             allocator for a limited time, just to have a fallback if
>             problems occur. But if it works, it makes sense to have
>             only one allocator; the "CoalesceMetaspace" flag can then
>             be removed, and the code can be made a lot simpler because
>             we do not need both code paths.
>
>         Yeah, I would strongly prefer to not introduce a new flag for
>         this. Have you thought about testing? Do you intend to write
>         new tests to stress the coalescing?
>
>                 Do you think your proposed solution has low enough
>                 overhead (in terms of CPU and memory) to be on
>                 "by default"?
>
>             We decided to switch it on by default in our VM.
>
>             Memory overhead can be almost exactly calculated. Bitmasks
>             take 2 bits per specialized-chunk-sized-area. That means,
>             for specialized-chunk-size = 1k (128 meta words):
>             metaspace size / 8192. So, for 1G of metaspace we pay
>             132KB overhead for the bitmasks, or roughly 0.1%.
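
The overhead estimate above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the function name and parameters are hypothetical, and it assumes bitmask bits are the only bookkeeping cost. Taking 2 bits per 1 KB area literally gives metaspace size / 4096; the "/ 8192" figure corresponds to 1 bit per area. Either way the result stays comfortably below the roughly 0.1% quoted.

```python
def bitmap_overhead_bytes(metaspace_bytes: int,
                          area_bytes: int = 1024,
                          bits_per_area: int = 2) -> int:
    """Bytes needed for a bitmask with bits_per_area bits per area."""
    areas = metaspace_bytes // area_bytes
    return areas * bits_per_area // 8


GIB = 1024 ** 3

two_bits = bitmap_overhead_bytes(GIB)                  # 2 bits per 1 KB area
one_bit = bitmap_overhead_bytes(GIB, bits_per_area=1)  # 1 bit per 1 KB area

print(two_bits // 1024, "KB,", f"{100 * two_bits / GIB:.3f}%")  # 256 KB, 0.024%
print(one_bit // 1024, "KB,", f"{100 * one_bit / GIB:.3f}%")    # 128 KB, 0.012%
```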
>
>             There is some CPU overhead, but in my tests I could not
>             measure anything above noise level.
>
>         Those numbers seem low enough to me to not warrant a new
>         flag.
>
>                 Thanks,
>                 Erik
>
>
>             Btw, I understand that it is difficult to evaluate this
>             proposal without a prototype to play around with. As I
>             already mentioned, the patch right now exists only in our
>             code base and not yet in the OpenJDK. If you guys are
>             seriously interested in this JEP, I will invest the time
>             to port the patch to the OpenJDK, so that you can check it
>             out for yourself.
>
>         Yes, we are seriously interested :) I think the proposal
>         sounds good. I guess the devil will be in the details, so I
>         (we) would really appreciate it if you ported your internal
>         patch to OpenJDK.
>
>         Thanks,
>         Erik
>
>             Kind Regards, Thomas
>
>
>
>
>                     Thank you very much!
>
>                     Kind Regards, Thomas
>
>
>                     On Tue, Sep 27, 2016 at 10:45 AM, Thomas Stüfe
>                     <thomas.stuefe at gmail.com> wrote:
>
>                         Dear all,
>
>                         please take a look at this Enhancement
>                         Proposal for the metaspace
>                         allocator. I hope these are the right groups
>                         for this discussion.
>
>                         https://bugs.openjdk.java.net/browse/JDK-8166690
>
>                         Background:
>
>                         We at SAP at times see OOMs in Metaspace at
>                         customer installations (usually, with
>                         compressed class pointers enabled, in
>                         Compressed Class Space). The VM attempts to
>                         allocate metaspace and fails, hitting the
>                         CompressedClassSpaceSize limit. Note that we
>                         usually set the limit lower than the default,
>                         typically at 256M.
>
>                         When analyzing, we observed that a large part
>                         of the metaspace is indeed free but "locked
>                         in" to metaspace chunks of the wrong size:
>                         often we would find a lot of free small
>                         chunks, but the allocation request was for
>                         medium chunks, and failed.
>
>                         The reason was that at some point in time a
>                         lot of class loaders were alive, each with
>                         only a few small classes loaded. This would
>                         lead to the metaspace being swamped with lots
>                         of small chunks, because each SpaceManager
>                         first allocates small chunks and only
>                         switches to larger chunks after a certain
>                         number of allocation requests.
>
>                         These small chunks are free and wait in the
>                         freelist, but cannot be reused for allocation
>                         requests which require larger chunks, even if
>                         they are physically adjacent in the virtual
>                         space.
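
The failure mode described above can be reproduced with a toy model of free lists keyed by chunk size. The class name, the chunk sizes, and the units are hypothetical and chosen only to show the effect; this is not how the HotSpot free lists are implemented.

```python
from collections import defaultdict

SMALL, MEDIUM = 1, 8   # chunk sizes in arbitrary units (hypothetical)


class SizeClassFreeLists:
    """Free chunks are kept per size and never change size."""

    def __init__(self):
        self.free = defaultdict(list)   # size -> list of start offsets

    def release(self, start: int, size: int) -> None:
        self.free[size].append(start)

    def allocate(self, size: int):
        """Return a start offset, or None if no chunk of exactly this size is free."""
        chunks = self.free[size]
        return chunks.pop() if chunks else None

    def free_units(self) -> int:
        return sum(size * len(starts) for size, starts in self.free.items())


lists = SizeClassFreeLists()
for start in range(16):            # 16 adjacent small chunks are returned
    lists.release(start, SMALL)

print(lists.free_units())          # 16 units are free in total
print(lists.allocate(MEDIUM))      # None: no medium chunk, despite enough free space
```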
>
>                         We (at SAP) added a patch which allows
>                         on-the-fly metaspace chunk merging: merging
>                         multiple adjacent smaller chunks to form a
>                         larger chunk. This, in combination with the
>                         reverse direction (splitting a large chunk to
>                         get smaller chunks), partly negates the
>                         "chunks-are-locked-in-into-their-size"
>                         limitation and provides for better reuse of
>                         metaspace chunks. It also provides better
>                         defragmentation.
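
Merging and splitting as described above might look roughly like the sketch below. It is not the SAP patch: the function names are hypothetical, chunks are modelled as plain offsets, and the requirement that a merged chunk start on a boundary aligned to the larger size is an assumption made to keep the example simple.

```python
def try_merge(free_starts: set, small: int, large: int):
    """Merge free small chunks into one large chunk, if possible.

    free_starts holds the start offsets of free chunks of size `small`.
    Returns the start of the merged chunk (and removes its parts from
    free_starts), or None if no fully free, large-aligned run exists.
    """
    per_large = large // small
    for start in sorted(free_starts):
        if start % large != 0:          # assumed alignment requirement
            continue
        parts = [start + i * small for i in range(per_large)]
        if all(p in free_starts for p in parts):
            free_starts.difference_update(parts)
            return start
    return None


def split(start: int, large: int, small: int) -> list:
    """Reverse direction: cut one large chunk into small chunks."""
    return [start + i * small for i in range(large // small)]


free_small = {0, 1, 2, 3, 4, 5, 6, 7, 9, 10}
print(try_merge(free_small, small=1, large=8))   # 0: chunks 0..7 form one chunk
print(sorted(free_small))                        # [9, 10] remain as small chunks
print(split(0, large=8, small=1))                # [0, 1, 2, 3, 4, 5, 6, 7]
```

A single occupied small chunk inside an aligned run is enough to block the merge, which is why the text says the limitation is only partly negated.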
>
>                         I discussed this fix off-list with Coleen
>                         Phillimore and Jon Masamitsu, and instead of
>                         just offering this as a fix, both recommended
>                         opening a JEP for this, because its scope
>                         would be beyond that of a simple fix.
>
>                         So here is my first JEP :) I hope it follows
>                         the right form. Please, if
>                         you have time, take a look and tell us what
>                         you think.
>
>                         Thank you, and Kind Regards,
>
>                         Thomas Stüfe
>
>
>
>
>
>

More information about the hotspot-gc-dev mailing list