request for review (M): 6883834: ParNew: assert(!_g->to()->is_in_reserved(obj), "Scanning field twice?") with LargeObjects tests
Y. Srinivas Ramakrishna
y.s.ramakrishna at oracle.com
Fri May 6 14:59:04 PDT 2011
In case you haven't started looking yet, an updated version is in:-
with changes based on the review(s) so far.
If you've already started reviewing, it's fine to
stick to the old one if you want to.
On 5/5/2011 1:23 AM, Y. Srinivas Ramakrishna wrote:
> 6883834: ParNew: assert(!_g->to()->is_in_reserved(obj),"Scanning field twice?") with LargeObjects tests
> Investigation of this crash led to several avenues
> towards fixing the problem, including a
> more scalable and performant rewrite of the parallelization
> of card scanning which would have fixed the bug as well,
> and which was indeed the direction in which I had been
> initially headed. However, I realized last week that,
> given the upcoming hs21 code freeze, and the subtleties
> in the current code, that might not be the most sensible
> thing to do at this time. At that point, I decided to fix
> only the correctness problem(s) indicated by the assertion
> that we hit and not address in any manner the relative
> lack of scalability in the current implementation in
> the presence of large non-array objects. Somewhat
> offsetting the lack of that scalability is, however,
> the fact that the situation in which scalability suffers,
> namely very large sparsely dirty non-array objects are the exception
> in most programs (not so, very large sparsely dirty
> array objects which are by far the more common).
> The problem in the current code turned out to be
> (1) it may scan too little and miss some pointers:
> when a large non-array object straddles entirely over a
> complete "chunk" (the parallel unit of work) of the card table,
> its head is dirty (so we are forced to assume an imprecise
> card mark and scan it in its entirety), then
> we may miss any part of it that extends past the end
> of its second chunk. Currently chunks are fixed at
> 256 cards == 128K, so any object larger than that
> size is potentially vulnerable.
> (2) it may scan too much and thus scan some pointers twice:
> when a large object straddles onto a subsequent chunk
> its head is dirty, and the next card is not dirty,
> the intention of the code was that
> the thread scanning the chunk in which the head of the
> object lay would scan the object in its entirety
> and the thread scanning the next chunk would not.
> However, there was a flaw in the way card-dirtiness
> was being tested for an object straddling into a
> chunk from the left and how it was being tested for
> an object straddling past the end (or right).
> This could, under the right conditions, result
> in a suffix of the object in the
> second of the two chunks above being scanned twice.
> While it was the second scenario that caused the
> assertion violation, the first could lead to more
> insidious problems with stale pointers left in old
> generation or permanent generation objects.
> The fix for (1) was to not terminate the search for the
> last card to scan to the right not at the end of the
> next chunk, but go right to the end of the object in
> The fix for (2) was to make sure the correct predicate
> was being tested for the card-values in order to ensure
> that the thread for the left chunk did not scan beyond
> the end of a specific point, as exemplified in the
> "LNC" value.
> Along the way, I added a few assertion checks, some verbose logging
> (disabled in all builds but which can be enabled via a source
> level change: I'd like to leave this in for future debuggability
> and code-comprehensibility), elaborated and corrected the documentation
> comments and changed the order of filling the "LNC array"
> (which is useful to and consulted by leftward neighbouring workers)
> and determining the rightward extent of the chunk scan
> past the end of the chunk. I also tightened the set of
> conditions under which we extend our scan to the right,
> which can save some unnecessary card value examination
> in some cases.
> Note that this is a nearly minimal fix to the correctness
> problem in parallel card-scanning with ParNew. A more
> extensive rewrite of the code with an eye to simplifying
> some of the convolutedness of its current form and of
> improving its scalability when there are very large
> objects in the heap (or, as in this case, large constant
> pools in the perm gen), will be the subject of an RFE
> PS: I also pushed some of the closure construction
> closer to its use. This will also make it easier to
> do implement the performance RFE to follow in hs22.
> In the same vein, I changed some defined constants
> related to chunk size and parallel chunk strides
> into global diagnostic flags. I did leave the current
> values unchanged, although it is quite likely that these
> values could be tuned some, an activity that will be
> part of the aforementioned performance RFE.
> A quick performance run indicated no change in performance,
> as one might expect given the rarity of very large
> non-array objects in most applications. More thorough performance
> testing (with reference server refworkload) is in progress.
> Correctness Testing:
> The jck60001 test that tripped the assert; refworkload; JPRT;
> all three with and without heap verification on a variety
> of different platforms and with the chunk size varied for
> good measure.
> thanks for your reviews!
> -- ramki
More information about the hotspot-gc-dev