Why Nothing Matters: The Impact of Zeroing

Jesper Wilhelmsson jesper.wilhelmsson at oracle.com
Fri Sep 23 01:48:09 PDT 2011


This sounds interesting. Thank you for posting the paper here. I will have a 
look at it and get back to you once we have had time to discuss it.

On 09/23/2011 04:54 AM, Xi Yang wrote:
> Hi all,
> We published a paper (
> http://users.cecs.anu.edu.au/~steveb/downloads/pdf/zero-oopsla-2011.pdf
> ) on zero initialization in the JVM at OOPSLA 2011. We found that the
> cost of zero initialization is very high on modern x86 CMPs. By
> concurrently zeroing the nursery space with non-temporal instructions,
> we improve performance by 3.2% on average and up to 9.3% on the
> newest Sandy Bridge (i7-2600) machine across 19 benchmarks from DaCapo,
> SPECjvm98, and pjbb2005.
> The speedup is not that large; however, compared with the current
> zeroing approach in HotSpot, the design we propose is simpler. If
> HotSpot developers are interested in the idea, you can implement it
> within an hour. One hour of work for a 3.2% speedup is not a bad deal,
> right?
> Here is the paper link and abstract:
> http://users.cecs.anu.edu.au/~steveb/downloads/pdf/zero-oopsla-2011.pdf
> Managed languages use memory safety to defend against inadvertent and
> malicious misuse of memory. Unmanaged native languages are
> increasingly integrating memory safety for the same reasons. A
> critical element of memory safety is initializing new memory before
> the program obtains it. Our experiments show that zero initialization
> is surprisingly expensive in a highly optimized managed runtime — on
> average the direct cost of zeroing is 4% to 6% and up to 50% of total
> application time on a variety of modern processors. Zeroing incurs
> indirect costs as well, which include memory bandwidth consumption and
> cache displacement. Existing virtual machines (VMs) either: a)
> minimize direct costs by zeroing in large blocks, or b) minimize
> indirect costs by integrating zeroing into the allocation sequence to
> reduce cache displacement.
> This paper first describes and evaluates zero initialization costs and
> the two existing design points. Our microarchitectural analysis of
> prior designs inspires two better designs that exploit concurrency and
> non-temporal cache-bypassing instructions to reduce the direct and
> indirect costs simultaneously. We show that the best strategy is to
> adaptively choose between the two new designs based on CPU
> utilization. This approach improves over widely used hot-path zeroing
> by 3% on average and up to 15% on the newest Intel i7-2600 processor,
> without slowing down any of the benchmarks. These results indicate
> that zero initialization is a surprisingly important source of
> overhead in existing VMs and that our new software strategies are
> effective at reducing this overhead. These findings also invite other
> optimizations, including software elision of zeroing and
> microarchitectural support.
> Regards.
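
For readers who have not used non-temporal stores, here is a rough sketch of
what "zeroing with non-temporal instructions" could look like in C. It is not
taken from the paper or from HotSpot. The function name
zero_block_nontemporal and the 16-byte alignment requirement are assumptions
made for this illustration, and the concurrent part (a separate thread zeroing
the nursery ahead of the allocator) is not shown. On non-SSE2 targets the
sketch simply falls back to memset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* _mm_stream_si128, _mm_setzero_si128, _mm_sfence */
#endif

/* Hypothetical helper: zero `bytes` bytes starting at `start`.
 * Assumes `start` is 16-byte aligned and `bytes` is a multiple of 16. */
static void zero_block_nontemporal(void *start, size_t bytes)
{
    assert(((uintptr_t)start % 16) == 0);
    assert(bytes % 16 == 0);
#if defined(__SSE2__)
    __m128i zero = _mm_setzero_si128();
    __m128i *p = (__m128i *)start;
    for (size_t i = 0; i < bytes / 16; i++) {
        /* Streaming store: writes 16 bytes with a non-temporal hint,
         * so the zeroed lines need not displace useful cache contents. */
        _mm_stream_si128(p + i, zero);
    }
    /* Order the streaming stores before any later stores. */
    _mm_sfence();
#else
    /* Portable fallback without the cache-bypassing behaviour. */
    memset(start, 0, bytes);
#endif
}
```

In the design described above, a routine like this would presumably be run by
a dedicated thread over the nursery, rather than on the allocation hot path.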
