RFR: 8203359: Container level resources events [v8]

Erik Gahlin egahlin at openjdk.java.net
Wed Apr 14 11:24:12 UTC 2021

On Wed, 14 Apr 2021 08:31:55 GMT, Jaroslav Bachorik <jbachorik at openjdk.org> wrote:

>> With this change it becomes possible to surface various cgroup level metrics (available via `jdk.internal.platform.Metrics`) as JFR events.
>> Only a subset of the metrics exposed by `jdk.internal.platform.Metrics` is turned into JFR events to start with.
>> * CPU related metrics
>> * Memory related metrics
>> * I/O related metrics
>> For each of those subsystems a configuration data will be emitted as well. The initial proposal is to emit the configuration data events at least once per chunk and the metrics values at 30 seconds interval. 
>> By using these values the emitted events seem to contain useful information without increasing overhead (the metrics values are read from `/proc` filesystem so that should not be done too frequently).
> Jaroslav Bachorik has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision:
>  - Roll back conditional registration of container events
>  - Remove container events flag
>  - Remove trailing spaces
>  - Doh
>  - Report container type and register events conditionally
>  - Remove unused test files
>  - Initial test support for JFR container events
>  - Update the JFR control files
>  - Split off the CPU throttling metrics
>  - Formatting spaces
>  - ... and 1 more: https://git.openjdk.java.net/jdk/compare/1f93be70...67a61bd7

src/jdk.jfr/share/classes/jdk/jfr/events/ContainerCPUThrottlingEvent.java line 44:

> 42: @Category({"Operating System", "Container", "Processor"})
> 43: @Description("Container CPU throttling related information.")
> 44: public class ContainerCPUThrottlingEvent extends AbstractJDKEvent {

I wonder if we should put all the container events in the same category {"Operating System", "Container"}, Or possibly add them under the already existing categories under "Operating System"?

src/jdk.jfr/share/classes/jdk/jfr/events/ContainerCPUThrottlingEvent.java line 46:

> 44: public class ContainerCPUThrottlingEvent extends AbstractJDKEvent {
> 45:   @Label("CPU Elapsed Slices")
> 46:   @Description("Number of time-slice periods that have elapsed if a CPU quota has been setup for the container.")

If the description is one sentence, period should not be included.

src/jdk.jfr/share/classes/jdk/jfr/events/ContainerCPUUsageEvent.java line 46:

> 44: public class ContainerCPUUsageEvent extends AbstractJDKEvent {
> 45:   @Label("CPU Time")
> 46:   @Description("Aggregate time, in nanoseconds, consumed by all tasks in the container.")

We usually skip the unit "nanoseconds" in descriptions when the field has a content type that describes the unit.

src/jdk.jfr/share/classes/jdk/jfr/events/ContainerConfigurationEvent.java line 45:

> 43: @Description("A set of container specific attributes.")
> 44: public final class ContainerConfigurationEvent extends AbstractJDKEvent {
> 45:     @Label("Container type")

Capitalize "type" in the label

src/jdk.jfr/share/classes/jdk/jfr/events/ContainerConfigurationEvent.java line 78:

> 76: 
> 77:     @Label("Memory and Swap Limit")
> 78:     @Description("Maximum amount of physical memory and swap space, in bytes, that can be allocated in the container.")

No need to mention bytes in the description when the field has DataAmount annotation.

src/jdk.jfr/share/classes/jdk/jfr/events/ContainerIOUsageEvent.java line 47:

> 45: public class ContainerIOUsageEvent extends AbstractJDKEvent {
> 46: 
> 47:   @Label("BlkIO Request Count")

Change to "Block IO"

src/jdk.jfr/share/classes/jdk/jfr/events/ContainerMemoryUsageEvent.java line 46:

> 44: public final class ContainerMemoryUsageEvent extends AbstractJDKEvent {
> 45:     @Label("Memory Pressure")
> 46:     @Description("(attempts per second * 1000), if enabled, that the operating system tries to satisfy a memory request for any " +

This unit seems a bit strange. Do we really need to multiply by 1000?


PR: https://git.openjdk.java.net/jdk/pull/3126

More information about the core-libs-dev mailing list