RFR (M): 8077144: Concurrent mark initialization takes too long
thomas.schatzl at oracle.com
Mon Mar 14 14:58:34 UTC 2016
Just some additional comments on performance:
On Mon, 2016-03-14 at 14:15 +0100, Thomas Schatzl wrote:
> Hi all,
> This proposed solution removes the per-thread additional mark
> and recreates this information from the (complete) prev bitmap in an
> extra concurrent phase after the Remark pause.
> This can be done since the Prev bitmap does not change after Remark
> any more.
> In total, this separation of the tasks is faster (lowers concurrent
> cycle time) than doing this work at once for the following reasons:
> - I did not observe any throughput regressions with this change:
> actually, throughput of some large applications even increases with
> that change (not taking into account that you could increase heap
> now since not so much is taken up by these additional bitmaps).
Concurrent marking plus the new concurrent liveness data counting phase
(which creates the liveness data) together already take less time than
the previous combined concurrent marking phase.
Also, part of the second concurrent phase is already more or less
amortized by the waiting we would otherwise do between the pauses to keep
MMU (minimum mutator utilization): instead of waiting, we do something
useful.
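To picture what recreating the liveness data from the completed prev bitmap means, here is a minimal sketch of a toy model. It is not the HotSpot code from the change: ToyHeap, object_size_words and count_live_words are hypothetical names, and the model assumes one mark bit per heap word, with a set bit marking the first word of a live object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only, not HotSpot code. All names here are hypothetical.
struct ToyHeap {
    std::size_t words_per_region;                  // region size in heap words
    std::vector<bool> prev_bitmap;                 // one bit per heap word, frozen after Remark
    std::vector<std::uint32_t> object_size_words;  // stand-in for reading the size from the object header
};

// Walks the frozen bitmap once and sums the sizes of marked objects per region.
// An object is attributed entirely to the region that contains its first word.
std::vector<std::size_t> count_live_words(const ToyHeap& heap) {
    assert(heap.words_per_region > 0);
    assert(heap.object_size_words.size() == heap.prev_bitmap.size());

    const std::size_t num_regions =
        (heap.prev_bitmap.size() + heap.words_per_region - 1) / heap.words_per_region;
    std::vector<std::size_t> live_words(num_regions, 0);

    for (std::size_t i = 0; i < heap.prev_bitmap.size(); ++i) {
        if (heap.prev_bitmap[i]) {
            live_words[i / heap.words_per_region] += heap.object_size_words[i];
        }
    }
    return live_words;
}
```

Because the bitmap no longer changes after Remark, a walk like this can run concurrently with the application without any per-thread side structures being filled during marking.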
Apart from that, I think the concurrent liveness data creation could
still be sped up significantly. First, the existing code I re-used is
not particularly well optimized (there seem to be lots of unnecessary
conversions between bitmap indices and HeapWord*). Second, finding set
bits in the bitmap (BitMap::get_next_one_bit()) is not particularly
fast, particularly on denser bitmaps.
I left that as cleanup for later to not complicate the change any
further (and there is already JDK-6735527 for the bitmap scan issue).
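For illustration, one common way to make finding set bits cheaper is to scan a machine word at a time and use a count-trailing-zeros operation. The following is only a sketch under my own assumptions, not the BitMap implementation and not what JDK-6735527 proposes: next_set_bit, for_each_set_bit and count_trailing_zeros are hypothetical helpers, and the bitmap is assumed to be stored as 64-bit words with bit 0 in the least significant position.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Index of the lowest set bit of x. Precondition: x != 0.
inline unsigned count_trailing_zeros(std::uint64_t x) {
#if defined(__GNUC__) || defined(__clang__)
    return static_cast<unsigned>(__builtin_ctzll(x));
#else
    unsigned n = 0;
    while ((x & 1u) == 0) { x >>= 1; ++n; }
    return n;
#endif
}

// Returns the index of the first set bit in [from, size_in_bits),
// or size_in_bits if there is none. Hypothetical helper, not the BitMap API.
std::size_t next_set_bit(const std::vector<std::uint64_t>& words,
                         std::size_t from, std::size_t size_in_bits) {
    assert(size_in_bits <= words.size() * 64);
    if (from >= size_in_bits) return size_in_bits;

    std::size_t w = from / 64;
    // Ignore bits below 'from' in the first word.
    std::uint64_t cur = words[w] & (~std::uint64_t{0} << (from % 64));
    while (true) {
        if (cur != 0) {
            const std::size_t idx = w * 64 + count_trailing_zeros(cur);
            return idx < size_in_bits ? idx : size_in_bits;
        }
        ++w;
        if (w * 64 >= size_in_bits) return size_in_bits;
        cur = words[w];  // whole words of zeros are skipped with one compare
    }
}

// Visits every set bit. On dense words it stays inside the loaded word
// and clears the lowest set bit instead of restarting the search per bit.
template <typename Visitor>
void for_each_set_bit(const std::vector<std::uint64_t>& words, Visitor visit) {
    for (std::size_t w = 0; w < words.size(); ++w) {
        std::uint64_t cur = words[w];
        while (cur != 0) {
            visit(w * 64 + count_trailing_zeros(cur));
            cur &= cur - 1;  // clear the lowest set bit
        }
    }
}
```

Whether this helps in practice would have to be measured. The sketch also works purely on bit indices, so any conversion to an address would happen once per visited bit.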
More information about the hotspot-gc-dev