RFR: JDK-8078904 : CMS: Assert failed: Ctl pt invariant
eric.caspole at oracle.com
Wed Jul 29 22:12:54 UTC 2015
After a long time I finally got back to this problem, JDK-8078904.
It is an assert in debug builds of CMS that fires when the setup of the
survivor chunk array, which is used to partition the CMS rescan, does
not completely scan all the per-thread PLAB arrays.
In product builds this would only result in uneven distribution of
parallel work where the last task might get 100x as much region to scan
as the others.
After the fix for 8079555 the problem would happen with different cmd
line options but it was still there.
I think the fix for 8130459 somewhat narrowed the conditions where this
problem would happen by checking the MinTLABSize against the
YoungPLABSize, but it still happens.
My idea here is to stride over the survivor PLAB arrays, where the stride
length is based on the gc thread count, so the whole survivor PLAB
array structure will be scanned no matter what MinTLABSize etc. is set
on the cmd line, and the parallel tasks are allocated more evenly sized
chunks of work, without increasing the size of the _survivor_chunk_array.
Previously, very small MinTLABSize would cause so many recorded PLABs
that there was not enough room in the _survivor_chunk_array and so there
would be uneven work in the rescan tasks.
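
A rough sketch of the striding idea, for illustration only. This is not
the HotSpot patch: merge_with_stride, its parameters and the use of
std::vector are all hypothetical, and the stride is passed in by the
caller because the exact formula derived from the gc thread count is
not given in this message. The sketch merges sorted per-thread sample
arrays and keeps every stride-th sample, so the recorded boundaries
span the whole input even when the output array is small.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not HotSpot code): k-way merge of sorted
// per-thread sample arrays, recording every `stride`-th sample of the
// merged sequence, up to `capacity` recorded entries.
std::vector<std::size_t> merge_with_stride(
    const std::vector<std::vector<std::size_t>>& per_thread,
    std::size_t stride,
    std::size_t capacity) {
  assert(stride > 0 && "stride must be at least 1");
  std::vector<std::size_t> cursor(per_thread.size(), 0);
  std::vector<std::size_t> merged;
  std::size_t seen = 0;  // samples consumed so far, across all threads
  const std::size_t none = per_thread.size();

  while (true) {
    // Pick the thread whose next unconsumed sample is the smallest.
    std::size_t best = none;
    for (std::size_t t = 0; t < per_thread.size(); ++t) {
      if (cursor[t] >= per_thread[t].size()) continue;
      if (best == none ||
          per_thread[t][cursor[t]] < per_thread[best][cursor[best]]) {
        best = t;
      }
    }
    if (best == none) break;  // every per-thread array is fully scanned

    std::size_t value = per_thread[best][cursor[best]++];
    ++seen;
    // Keep only every stride-th sample; always keep walking the input
    // so that no per-thread array is left partly unscanned.
    if (seen % stride == 0 && merged.size() < capacity) {
      merged.push_back(value);
    }
  }
  return merged;
}
```

With the samples {10,30,50,70} and {20,40,60,80} and a capacity of 3, a
stride of 1 records only 10, 20, 30, so everything above 30 falls into a
single oversized last chunk. A stride of 4 records 40 and 80, which
splits the same range into evenly sized pieces.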
Thanks to Sangheon for a lot of side discussions about PLABs and option
I tested this with the failing example using many combinations of gc
thread counts and many MinTLABSize values from 64 up to 4M, and with JPRT.