[PING] RFR: 8231111: Cgroups v2: Rework Metrics in java.base so as to recognize unified hierarchy

Bob Vandette bob.vandette at oracle.com
Mon Dec 2 19:26:14 UTC 2019

> On Dec 2, 2019, at 2:03 PM, Severin Gehwolf <sgehwolf at redhat.com> wrote:
> Hi Bob,
> On Mon, 2019-12-02 at 12:13 -0500, Bob Vandette wrote:
>> Sorry for the delay in responding.  See comments below:
> Thanks very much for taking a look at this!
>>> On Nov 29, 2019, at 4:05 AM, Severin Gehwolf <sgehwolf at redhat.com> wrote:
>>> On Fri, 2019-11-15 at 17:51 +0100, Severin Gehwolf wrote:
>>>> Hi,
>>>> Could I please get reviews of these core-libs, Linux-only, changes to
>>>> the Metrics subsystem? This patch implements cgroupv2 support for
>>>> Metrics which are currently cgroupv1-only. Fedora 31 switched to
>>>> cgroupv2 by default so it's time to get OpenJDK recognize it.
>>>> Note that a couple of metrics are no longer supported with cgroupv2.
>>>> Most notably (not an exhaustive list, though):
>>>> Metrics.getKernel*() family of methods.
>>>> Metrics.getTcp*() family of methods.
>>>> Metrics.getBlkIO*() family of methods.
>>>> Metrics.isMemoryOOMKillEnabled()
>>>> A couple of open questions with regards to that:
>>>> 1)
>>>> Most API docs of Metrics make no distiction between "unlimited" and
>>>> "not supported", both returning -1 for longs, for example. This is a
>>>> problem, because output of "java -XshowSettings:system -version" will
>>>> not distinguish between unlimited and not supported metrics. Would it
>>>> be acceptable to change the API to distinguish those cases so that
>>>> LauncherHelper could display them appropriately?
>>>> 2)
>>>> How should we deal with "not supported" for booleans/arrays, etc.?
>>>> Would it make sense to return record objects from the Metrics API so
>>>> that this could be dealt with? E.g.
>>>> Metrics m = ...
>>>> MetricResult<int[]> result = m.getCpuSetCpus();
>>>> switch(result.getType()) {
>>>>  case NOT_SUPPORTED: /* do something */; break;
>>>>  case SUPPORTED: int[] val = result.get(); break;
>>>>  ...         
>>>> }
>>>> I'm bringing this up, because the proposed patch doesn't deal with the
>>>> above open questions as of yet. With that being said, it's mostly
>>>> identical to the proposed hotspot changes in [1].
>> For issue 1 and 2 …
>> I wonder if we should change the API to throw UnsupportedOperationException for those methods 
>> that are not available.  This exception is used quite a lot in the java/nio and java/net packages
>> for dealing with functionality not available on a host platform.
>> As an alternate suggestion, we already return -2 for "not supported" for most APIs in the hotspot
>> implementation.  We could use this for non boolean values in the Metrics API.  For boolean
>> values, we could change the API to return “Boolean”.  A null return would signify not
>> implemented.
> I'd be fine with either way. We just need to pick one. Should I go
> ahead with UnsupportedOperationException?

The change to use -2 and Boolean is less intrusive and more consistent with the hotspot
implementation, so I’d pick that one.

>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231111
>>>> webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/
>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemFactory.java.html
>> Should we check to make sure that there are no mixed cgroupv1 and cgroupv2 mounted subsystems since
>> you are not supporting mixing.
> I'm not sure what you mean. The kernel won't allow for controllers to
> be used by cgroupv1 *and* cgroupv2 at the same time. Each controller
> would have to be assigned to a control hierarchy (v1 or v2)[1]. If you
> mean hybrid, that's basically cgroups v1. The factory will look at the
> system and detect the version in use: cgroups v1 xor cgroups v2.

The memory subsystem for example can’t be assigned to both v1 and v2 controllers but the
memory subsystem can be assigned to v1 and the cpu subsystem can be assigned to
v2, right?

That’s the case I’m concerned about. 

>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java.html
>> It looks looks like there may be alternate ways of reporting some of the metrics that you have classified as RETVAL_NOT_SUPPORTED.
>> BlkIOServicedCount (io.stat)
>> KernelMemory (memory.stat)
I was thinking of memory.stat (file+kernel_stack+slab) assuming these
aren’t account for in user memory totals.

>> TcpMemory (memory.stat)
I was thinking of memory.stat (sock)
> Could you be more specific, please? What exaclty do you have in mind
> for KernelMemory from memory.stat? Same for TcpMemory.
> For BlkIOServicedCount do you mean calculating the sum of all rios+wios
> for any device? Something else?
Yes, that’s what I had in mind but since it’s a single value for all I/O, you’d have to
sum up all devices.

>> You could try the same trick for getMemoryAndSwapMaxUsage which keeps track of the highest returned value.
> Sure, will do if you insist. I'll note that this is a hack and if
> getMemoryUsage() is never called, it will be the same as
> getMemoryUsage() - provided I'll add a call to getMemoryUsage() from
> getMemoryMaxUsage(). Either way, the value returned is most likely
> wrong and I'm not a huge fan of it, TBH.

I agree it’s a hack so let’s just make both unsupported.

>> for the benefit of other reviewers, you should provide links to the cgroupv1 and v2 docs.
>> https://www.kernel.org/doc/Documentation/cgroup-v2.txt
>>>> Testing: jdk/submit and platform docker tests on Linux x86_64 (with
>>>> hybrid hierarchy, docker/podman) and on Linux x86_64 with unified
>>>> hierarchy (podman only).
>>>> Thoughts? Suggestions?
>> Do you think we should check the docker version being used for the tests to make sure it
>> supports cgroupv2?  I assume a fairly recent version is required to work with cgroupv2.
> I don't think that will be needed. The container engine won't start if
> cgroups v2 isn't supported. So we can be sure that if docker is
> running, it's either going to be on cgroups v2 or cgroups v1. In both
> cases the right set of tests will run. Does that make sense?

I’m just thinking of a situation where a kernel is configured to use cgroupv2
and a developer wants to run the jtreg tests.  It would be good to skip the
container tests if docker can’t run on it.  We have a check to see if docker can
run before running docker tests “using docker ps" but AFAIK, it doesn’t test starting
a container or verify the version.  


> Thanks,
> Severin
>> Bob.
> [1] https://lists.freedesktop.org/archives/systemd-devel/2017-November/039754.html
>>> Ping?
>>>> Thanks,
>>>> Severin
>>>> [1] http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-November/039909.html

More information about the core-libs-dev mailing list