Call for Discussion: New Project: CRaC
akozlov at azul.com
Fri Jul 23 07:56:24 UTC 2021
On 7/22/21 11:51 PM, Michael Bien wrote:
> On 22.07.21 21:17, Anton Kozlov wrote:
>>> - How to make the JVM/JDK behave gracefully after "time-jumps".
>> I assume there should be no correctness problems, as a time-jump does not
>> substantially differ from time spent off-CPU due to OS scheduling. Some
>> internal counters could overflow, but that looks like no more than a bug
>> that needs fixing.
> this might certainly cause some interesting issues, e.g. GC ergonomics getting confused after thinking the last pause lasted 5 days :)
Heuristics may suffer, I agree. I'm not sure this is the case now (in CRaC),
since the time spent in a checkpoint should not be attributed to the GC pause.
But this or similar issues are possible. Testing will be required, possibly
followed by tuning.
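
To illustrate what "not attributing checkpoint time" to a measured interval
could look like, here is a minimal sketch in plain Java. It is not how CRaC or
the GC does it: the class name and the beforeCheckpoint()/afterRestore() hooks
are hypothetical, and the clock is injected so the example does not assume
anything about how System.nanoTime() behaves across a restore.

```java
import java.util.function.LongSupplier;

/**
 * Sketch only: an elapsed-time tracker that discounts time spent checkpointed.
 * The hook names below are hypothetical and are not part of any CRaC API.
 */
final class PauseAwareTimer {
    private final LongSupplier nanoClock; // assumed monotonic between the two hooks
    private final long startNanos;        // when measurement began
    private boolean inCheckpoint;         // true between the two hooks
    private long checkpointStartNanos;    // clock value when the checkpoint began
    private long suspendedNanos;          // total time spent checkpointed so far

    PauseAwareTimer(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        this.startNanos = nanoClock.getAsLong();
    }

    /** Hypothetical hook: call right before the process is checkpointed. */
    void beforeCheckpoint() {
        inCheckpoint = true;
        checkpointStartNanos = nanoClock.getAsLong();
    }

    /** Hypothetical hook: call right after the process is restored. */
    void afterRestore() {
        if (inCheckpoint) {
            suspendedNanos += nanoClock.getAsLong() - checkpointStartNanos;
            inCheckpoint = false;
        }
    }

    /** Elapsed time since construction, excluding checkpointed time. */
    long elapsedNanos() {
        long now = inCheckpoint ? checkpointStartNanos : nanoClock.getAsLong();
        return now - startNanos - suspendedNanos;
    }
}
```

With a real clock this would be constructed as
new PauseAwareTimer(System::nanoTime). A heuristic fed by elapsedNanos() would
then see only the time the process was actually running, not the "5 days"
spent in the checkpoint image.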
> if we are thinking of the same bug, this was fixed in Linux 5.10 (https://lkml.org/lkml/2020/10/15/582) - possibly also backported. After 5.10 I never encountered 100% load after restoring JVMs again.
This looks very relevant. I didn't dig down to the root of the problem, but
the description is very close. Thanks, good to know.