“abort preclean due to time” in Concurrent Mark & Sweep
fancyerii at gmail.com
Tue May 3 10:31:07 UTC 2011
I ran into a strange case. The HotSpot JVM was doing GC constantly
and consuming a lot of CPU (from 50% to 300% CPU usage). When
I turned on GC logging, I
found "abort preclean due to time" in the GC logs.
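To get a feeling for how often this happens, the message can be counted in a saved GC log. Below is a minimal sketch; the class name `PrecleanAbortCounter` and the idea of passing the log path as the first argument are my own assumptions, and only the message text comes from the log.

```java
final class PrecleanAbortCounter {
    // The message exactly as it appears in the GC log.
    static final String MARKER = "abort preclean due to time";

    /** Counts the log lines that contain the abort message. */
    static int countAborts(String[] logLines) {
        int count = 0;
        for (String line : logLines) {
            if (line.contains(MARKER)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws java.io.IOException {
        // args[0]: path to a GC log file (hypothetical usage, not a JVM tool).
        java.util.List<String> lines =
                java.nio.file.Files.readAllLines(java.nio.file.Paths.get(args[0]));
        System.out.println(countAborts(lines.toArray(new String[0])));
    }
}
```

Comparing this count with the number of CMS cycles in the same log shows whether every cycle is hitting the timeout or only some of them.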
So I googled and found some similar questions in
And http://blogs.sun.com/jonthecollector/entry/did_you_know was
suggested as reading.
I read the blog post but could not understand it well.
As it says, a CMS full GC has the following phases:
STW initial mark
"Ok, so here's the punch line for all this. When we're doing the
precleaning we do the sampling of the young generation top for a fixed
amount of time before starting the remark. That fixed amount of time
is CMSMaxAbortablePrecleanTime and its default value is 5 seconds. The
best situation is to have a minor collection happen during the
sampling. When that happens the sampling is done over the entire
region in the young generation from its start to its final top. If a
minor collection is not done during that 5 seconds then the region
below the first sample is 1 chunk and it might be the majority of the
young generation. Such a chunking doesn't spread the work out evenly
to the GC threads so reduces the effective parallelism. " --quoted
from this post.
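To make the chunking point concrete for myself, here is a small arithmetic sketch. It is a simplified model and not HotSpot code: I assume one thread handles one chunk at a time and that the work is proportional to chunk size, so the largest chunk bounds the speedup. The names `ChunkBalance`, `chunkSizes` and `bestSpeedup` are made up for illustration.

```java
final class ChunkBalance {
    /**
     * Splits the region [0, top) at the given sample offsets
     * (ascending, each strictly between 0 and top) and returns the chunk sizes.
     */
    static long[] chunkSizes(long top, long[] samples) {
        long[] sizes = new long[samples.length + 1];
        long prev = 0;
        for (int i = 0; i < samples.length; i++) {
            sizes[i] = samples[i] - prev;
            prev = samples[i];
        }
        sizes[samples.length] = top - prev;
        return sizes;
    }

    /** Upper bound on speedup: total work / largest chunk, capped by the thread count. */
    static double bestSpeedup(long[] sizes, int threads) {
        long total = 0;
        long max = 0;
        for (long s : sizes) {
            total += s;
            max = Math.max(max, s);
        }
        return Math.min(threads, (double) total / max);
    }

    public static void main(String[] args) {
        // No minor GC: sampling starts when the young generation is already 80% used.
        System.out.println(bestSpeedup(chunkSizes(100, new long[] {80, 90}), 4));     // 1.25
        // Minor GC during sampling: samples cover the region from its start.
        System.out.println(bestSpeedup(chunkSizes(100, new long[] {25, 50, 75}), 4)); // 4.0
    }
}
```

With 4 threads, the uneven split gives at most a 1.25x speedup under this model, while the even split allows the full 4x.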
In my opinion, concurrent precleaning is the preparation stage for
remark. It splits the young generation into chunks so that remark can
run in parallel. It expects a young GC to happen so that it can
split the chunks evenly. If there is no young GC before the
timeout (CMSMaxAbortablePrecleanTime), it seems that this GC will fail and
all following phases will be skipped.
So when the system load is light (which means there will be no
minor GC), precleaning will always time out and the full GC will always
fail. CPU is wasted.
Some suggested increasing CMSMaxAbortablePrecleanTime. Maybe that can
solve the problem. But the CMS collector, unlike other collectors that
perform GC when the heap is full, will
perform GC when space usage is larger than 92% (68% for older versions
of HotSpot) or when the JVM feels it should. If this value is too large,
it will stop the world for longer.
"Based on recent history, the concurrent collector maintains
estimates of the time remaining before the tenured generation will be
exhausted and of the time needed for a concurrent collection cycle.
Based on these dynamic estimates, a concurrent collection cycle will
be started with the aim of completing the collection cycle before the
tenured generation is exhausted. These estimates are padded for
safety, since the concurrent mode failure can be very costly.
A concurrent collection will also start if the occupancy of the
tenured generation exceeds an initiating occupancy, a percentage of
the tenured generation. The default value of this initiating occupancy
threshold is approximately 92%, but the value is subject to change
from release to release. "
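The second trigger in that quote is a plain threshold test. A minimal sketch follows, assuming the occupancy is measured as used bytes over capacity of the tenured generation; the class and method names are hypothetical, and the first trigger (the dynamic estimates) is not modeled here.

```java
final class InitiatingOccupancy {
    /**
     * Returns true when used/capacity is strictly greater than thresholdPercent/100.
     * Integer arithmetic avoids floating-point rounding at the boundary.
     */
    static boolean exceeds(long usedBytes, long capacityBytes, int thresholdPercent) {
        return usedBytes * 100 > capacityBytes * (long) thresholdPercent;
    }

    public static void main(String[] args) {
        long capacity = 1000L * 1024 * 1024;                                      // 1000 MB tenured generation
        System.out.println(exceeds(930L * 1024 * 1024, capacity, 92));            // true
        System.out.println(exceeds(920L * 1024 * 1024, capacity, 92));            // false
    }
}
```

As far as I know, the threshold itself can be set with -XX:CMSInitiatingOccupancyFraction=<percent>, so the 92% here is only the default mentioned in the quote.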
Another solution: "There is an option CMSScavengeBeforeRemark
which is off by default. If turned on, it will cause a minor
collection to occur just before the remark. That's good because it
will reduce the remark pause. That's bad because there is a minor
collection pause followed immediately by the remark pause which looks
like 1 big fat pause."
My question is: why is the collector so stupid that it doesn't do
it like this? If the system is busy, it works as before. Because
it's busy, a minor GC will occur and precleaning will succeed in the
future. If the system is idle, it can adjust
CMSMaxAbortablePrecleanTime or turn CMSScavengeBeforeRemark on.
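To spell out what I am proposing, here is a rough sketch of the decision. This is purely my hypothetical policy and not how HotSpot behaves; `RemarkPolicySketch`, `Action` and `decide` are invented names, and the inputs are assumed to be available to the collector.

```java
final class RemarkPolicySketch {
    enum Action { REMARK_NOW, KEEP_WAITING, SCAVENGE_THEN_REMARK }

    /**
     * Hypothetical policy:
     *  - a minor GC happened during precleaning: the chunks are even, go to remark;
     *  - no minor GC yet and still within the time limit: keep sampling;
     *  - time limit reached without a minor GC (idle system): force a
     *    scavenge first, as CMSScavengeBeforeRemark would, then remark.
     */
    static Action decide(boolean minorGcSeen, long elapsedMillis, long maxPrecleanMillis) {
        if (minorGcSeen) {
            return Action.REMARK_NOW;
        }
        if (elapsedMillis < maxPrecleanMillis) {
            return Action.KEEP_WAITING;
        }
        return Action.SCAVENGE_THEN_REMARK;
    }

    public static void main(String[] args) {
        // Idle system: 5 seconds passed without a minor GC.
        System.out.println(decide(false, 5000, 5000)); // SCAVENGE_THEN_REMARK
    }
}
```

Today the closest thing seems to be setting the two options by hand on the command line, i.e. -XX:CMSMaxAbortablePrecleanTime=<milliseconds> and -XX:+CMSScavengeBeforeRemark, rather than having the collector choose.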