RFR: 8185525: [Event Request] Add Tracing event for DictionarySizes
gerard.ziemski at oracle.com
Fri Apr 12 20:13:22 UTC 2019
On 4/12/19 12:45 PM, Erik Gahlin wrote:
> On 2019-04-10 22:03, gerard ziemski wrote:
>> On 4/10/19 1:12 PM, coleen.phillimore at oracle.com wrote:
>>>>> I noticed that events are only emitted if we are able to take the
>>>>> resize lock. Can this be fixed? What prevents us from always
>>>>> getting the data? That's how other periodic events work and losing
>>>>> data sometimes may lead to subtle bugs that are hard to understand and
>>>>> replicate in systems that rely on the information. Could we retry
>>>>> on a failure?
>>>> Good observation. If the resize lock is taken, then it's not likely
>>>> that whoever owns it will be done soon, so retrying is most likely
>>>> not going to succeed right away. Is it OK to tie up JFR periodic
>>>> thread for some time? If so, how long?
> There is no general upper limit for periodic events.
> If we need to wait for a safepoint, we need to do it. That said,
> events that can induce significant latencies or CPU overhead (even in
> pathological cases) are off in default.jfc and only enabled in
> profile.jfc, or not at all.
> As I understand it, the events themselves don't cause latencies and
> the tables are not expanded that often, so I think it would be okay to
> emit them. If you think otherwise, I would try to scan concurrently,
> even if it means we are slightly off.
>>>> If the lock is taken, then it means that someone is scanning
>>>> through the entire table, or the table is being resized. Either
>>>> way, we're not losing data, but are just temporarily blind - I
>>>> don't see a problem here for long-running apps; they will start
>>>> receiving events eventually (which happen every 10 sec by default).
> A user can set period "everyChunk" which means events are guaranteed
> to be in the recording.
> I think we should try to avoid breaking that contract. When event
> streaming is in place, we can implement requestable events where a
> user can demand an event programmatically from Java. If they sometimes
> don't get an event, it will break their code in a subtle way.
No problem, I removed the resize_lock around the JFR table statistics,
so we might get slightly incorrect stats every now and then, but we
will be emitting the events on schedule:
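
To illustrate the trade-off discussed above, here is a minimal, self-contained sketch (not the webrev code). Everything in it is hypothetical: SketchTable, TableStats, and an atomic flag standing in for the resize lock. It contrasts the earlier behaviour, where no statistics are produced while the lock is held, with an unlocked read that always answers but may be slightly off while a resize is in progress.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical snapshot type; not HotSpot's actual statistics class.
struct TableStats {
    std::size_t entries;
    std::size_t buckets;
};

// Hypothetical table that only tracks counters, enough to show the idea.
class SketchTable {
public:
    explicit SketchTable(std::size_t buckets) : _buckets(buckets) {
        assert(buckets > 0);
    }

    void add()    { _entries.fetch_add(1, std::memory_order_relaxed); }
    void remove() { _entries.fetch_sub(1, std::memory_order_relaxed); }

    // Stand-in for the resize lock: succeeds only if nobody else holds it.
    bool try_lock_resize() {
        bool expected = false;
        return _resize_locked.compare_exchange_strong(
            expected, true, std::memory_order_acquire);
    }
    void unlock_resize() {
        _resize_locked.store(false, std::memory_order_release);
    }

    // Earlier behaviour: report nothing if the lock is held, so the
    // periodic event would be skipped for this period.
    bool stats_if_unlocked(TableStats& out) {
        if (!try_lock_resize()) {
            return false;
        }
        out = read_counters();
        unlock_resize();
        return true;
    }

    // Behaviour after dropping the lock: always report. The two counters
    // are read independently, so they may not be mutually consistent.
    TableStats stats_unlocked() const { return read_counters(); }

private:
    TableStats read_counters() const {
        return TableStats{_entries.load(std::memory_order_relaxed),
                          _buckets.load(std::memory_order_relaxed)};
    }

    std::atomic<std::size_t> _entries{0};
    std::atomic<std::size_t> _buckets;
    std::atomic<bool> _resize_locked{false};
};
```

With the second variant, a period of "everyChunk" can always be honoured, at the cost of an occasional snapshot that mixes values from before and after a resize.
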
Last question: what is the recommended way to programmatically tell if
JFR is ON? I'm wondering whether I should collect the add/remove rates
for the tables only if JFR is ON. As it is right now, we collect them
always. It's just an atomic increment, but still, it's work only JFR
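
For concreteness, this is roughly what the conditional variant could look like. It is only a sketch under assumptions: recording_enabled is a hypothetical flag standing in for whatever the real "is JFR on" check turns out to be, and RateCounter is an illustrative name, not an existing class.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical flag; a placeholder for the actual "JFR is on" check.
static std::atomic<bool> recording_enabled{false};

// Hypothetical counter for add/remove rates between two periodic events.
class RateCounter {
public:
    // Called on every table add (or remove). When recording is off this
    // costs one relaxed load and a branch instead of an atomic increment.
    void note_event() {
        if (recording_enabled.load(std::memory_order_relaxed)) {
            _count.fetch_add(1, std::memory_order_relaxed);
        }
    }

    // Called by the periodic event: returns the count accumulated since
    // the previous call and resets it to zero.
    std::uint64_t take() {
        return _count.exchange(0, std::memory_order_relaxed);
    }

private:
    std::atomic<std::uint64_t> _count{0};
};
```

Whether the extra branch is actually cheaper than the unconditional increment would need measuring; the sketch only shows the shape of the change.
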