RFR: 8213175 - MaxHeapSize refuses to go over 32178700288 when running in a docker container

Thomas Schatzl thomas.schatzl at oracle.com
Thu Mar 14 11:19:10 UTC 2019


On Wed, 2019-03-13 at 08:35 -0400, Bob Vandette wrote:
> > On Mar 13, 2019, at 12:40 AM, David Holmes <david.holmes at oracle.com> wrote:
> > 
> > Hi Bob,
> > 
> > On 13/03/2019 1:04 am, Bob Vandette wrote:
> > > Please review this fix for correcting the Heap size when
> > > MaxRAMPercentage is used.
> > > BUG:
> > > https://bugs.openjdk.java.net/browse/JDK-8213175
> > > WEBREV:
> > > http://cr.openjdk.java.net/~bobv/8213175/webrev.0/
> > > SUMMARY:
> > > This fix changes the ordering of when we try to determine if we
> > > can enable UseCompressedOops.  It is now done after more of the
> > > Heap size calculations are completed.
> > > The only behavioral change should be that UseCompressedOops will
> > > now be disabled if someone specifies a MaxRAMPercentage that
> > > causes the heap size to exceed the amount of memory that can be
> > > used with compressed oops.
> > > Prior to this change, the amount of heap was reduced to the
> > > compressed oop limit.
> > 
> > It isn't at all clear to me that this is the correct fix here. Why
> > not assess this as "not an issue" and require the user to disable
> > compressed oops?
> The reason that I didn’t take that path is that I believe the new
> behavior was the intention of the original author.

It would be really useful for this discussion if the old and/or the
"intended" behavior were briefly explained, the differences outlined
(with pros/cons), and what happens when which flags are set.

At least to determine whether the new behavior is better and correctly
implemented, the new behavior should be explained somewhere outside of
the actual code change.
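To make the two behaviors concrete, here is a minimal sketch of how I understand them (my own simplified model, not the actual HotSpot ergonomics code; the compressed-oop limit constant is taken from the bug title and in reality depends on heap base address and object alignment): the old code clamps the ergonomically computed heap to the compressed-oop limit and keeps coops on, while the proposed change keeps the computed heap size and disables coops instead.

```java
public class CoopsErgonomicsSketch {
    // Approximate compressed-oop heap limit from the bug report:
    // 32178700288 bytes (30688 MiB). The real limit varies with the
    // chosen heap base address and object alignment.
    static final long COOP_LIMIT = 32178700288L;

    // Old behavior: the heap is clamped to the coop limit, coops stay on.
    static long oldMaxHeap(long ergoHeap) {
        return Math.min(ergoHeap, COOP_LIMIT);
    }

    static boolean oldUseCoops(long ergoHeap) {
        return true; // heap was shrunk to fit, so coops remain enabled
    }

    // New behavior: the ergonomically computed heap is kept as-is,
    // and compressed oops are disabled when it exceeds the limit.
    static long newMaxHeap(long ergoHeap) {
        return ergoHeap;
    }

    static boolean newUseCoops(long ergoHeap) {
        return ergoHeap <= COOP_LIMIT;
    }

    public static void main(String[] args) {
        // e.g. MaxRAM=128g with -XX:MaxRAMPercentage=50 -> 64 GiB requested
        long ergoHeap = 64L * 1024 * 1024 * 1024;
        System.out.println("old: heap=" + oldMaxHeap(ergoHeap)
                + " coops=" + oldUseCoops(ergoHeap));
        System.out.println("new: heap=" + newMaxHeap(ergoHeap)
                + " coops=" + newUseCoops(ergoHeap));
    }
}
```

If that model is right, the user-visible trade-off is: old behavior silently gives less heap than requested; new behavior gives the requested heap at the cost of larger oops.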

The (new) behavior must be verified using appropriate tests. A gigantic
heap to trigger the code could be faked by using MaxRAM, and the current
CompressedOops mode can be determined e.g. from -Xlog:gc+heap+coops
output, so creating new tests or extending existing ones (there are
already some compressed oops jtreg tests afair) should be fairly easy.
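As a sketch of how such a test could detect the selected mode from the log output (a hypothetical helper of my own, not an existing jtreg utility), the coops mode can be pulled out of the -Xlog:gc+heap+coops line with a simple regex:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CoopsModeParser {
    // Matches log lines like:
    // [0.018s][info ][gc,heap,coops] Heap address: 0x0000000082000000,
    //   size: 30688 MB, Compressed Oops mode: Zero based, Oop shift amount: 3
    private static final Pattern MODE =
        Pattern.compile("Compressed Oops mode: ([^,]+)");

    // Returns the coops mode string, or null when no coops line is
    // present (i.e. compressed oops were disabled).
    static String coopsMode(String log) {
        Matcher m = MODE.matcher(log);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String sample = "[0.018s][info ][gc,heap,coops] Heap address: "
            + "0x0000000082000000, size: 30688 MB, "
            + "Compressed Oops mode: Zero based, Oop shift amount: 3";
        System.out.println(coopsMode(sample)); // prints "Zero based"
    }
}
```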

The reproducer-hard label in the CR should probably be removed.

> The comment at line 1614 states that InitialHeapSize was the only
> option that could impact MaxHeapSize
> but the author was wrong.  The HeapRatio and now the HeapPercentage
> options impact MaxHeapSize as well and weren’t taken into
> consideration.
> 1611 void Arguments::set_use_compressed_oops() {
> 1612 #ifndef ZERO
> 1613 #ifdef _LP64
> 1614   // MaxHeapSize is not set up properly at this point, but
> 1615   // the only value that can override MaxHeapSize if we are
> 1616   // to use UseCompressedOops is InitialHeapSize.
> 1617   size_t max_heap_size = MAX2(MaxHeapSize, InitialHeapSize);
> 1618 
> > 
> > Such a behaviour change needs to be discussed and needs a CSR
> > request.
> Let’s see what the GC team wants me to do here.  

This is a significant externally visible change to behavior, so this
needs a CSR regardless of pre-existing intended behavior or not.

The HeapPercentage CSR (JDK-8186315) does not indicate changes to the
heap sizing ergonomics or compressed oop mode selection at all, so it
does not cover this change.

With this change, on sufficiently large machines we now seem to select
non-coop mode as default:

E.g. old version:

$ java -Xlog:gc+heap+coops=trace -XX:MaxRAM=128g -version
[0.007s][trace][gc,heap,coops] Trying to allocate at address
0x0000000082000000 heap of size 0x77e000000
[0.018s][info ][gc,heap,coops] Heap address: 0x0000000082000000, size:
30688 MB, Compressed Oops mode: Zero based, Oop shift amount: 3
openjdk version "10.0.2" 2018-07-17

(I do not think coops selection changed significantly since that
version, but that's what I had at hand otherwise)


$ java -XX:MaxRAM=128g -Xlog:gc+heap+coops=trace -version
openjdk version "13-internal" 2019-09-17

I.e. no coops mode selected.
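One way to see why MaxRAM=128g now crosses the threshold (assuming the default MaxRAMPercentage of 25): the ergonomic maximum heap comes out to 32 GiB, which is just above the 32178700288-byte limit from the bug title, i.e. the 30688 MB the old build clamped down to.

```java
public class DefaultHeapArithmetic {
    public static void main(String[] args) {
        long maxRam = 128L * 1024 * 1024 * 1024;  // -XX:MaxRAM=128g
        double maxRamPercentage = 25.0;           // assumed default value
        long ergoHeap = (long) (maxRam * maxRamPercentage / 100.0);
        long coopLimit = 32178700288L;            // limit from the bug title

        System.out.println("ergonomic heap: " + ergoHeap);  // 34359738368
        System.out.println("exceeds coop limit: " + (ergoHeap > coopLimit));
    }
}
```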

This RFR does not give performance numbers. Did you check for
performance regressions in our benchmark suites?

If not, please do so: we run many benchmarks internally without a max
heap size specification on machines with lots of memory (testing
out-of-the-box behavior), so this change will likely cause differences
everywhere (not to mention regressions) that need to be communicated
and explained to the responsible persons, preferably in advance.

(Please keep me in the loop, as I am also somewhat interested in this
as I am the person that typically gets asked about this if the numbers
went south....)


More information about the hotspot-runtime-dev mailing list