Why Nothing Matters: The Impact of Zeroing

Xi Yang hiyangxi at gmail.com
Thu Sep 22 20:05:23 PDT 2011

On 23 September 2011 12:54, Xi Yang <hiyangxi at gmail.com> wrote:
> Hi all,
> We published a paper (
> http://users.cecs.anu.edu.au/~steveb/downloads/pdf/zero-oopsla-2011.pdf
> ) about zero initialization in the JVM at OOPSLA 2011. We found that
> the cost of zero initialization is very high on modern x86 CMPs. By
> concurrently zeroing the nursery space with non-temporal instructions,
> we improve performance by 3.2% on average and up to 9.3% on the newest
> Sandy Bridge (i7-2600) machine across 19 benchmarks from DaCapo,
> SPECjvm98, and pjbb2005.
> The speedup is not that significant; however, compared with the
> current zeroing approach in HotSpot, the design we propose is simpler.
> If HotSpot developers are interested in the idea, you could implement
> it within an hour. One hour's work for a 3.2% speedup, not a bad deal,
> right?

Sorry, I posted an old abstract. Here is the new one.

Memory safety defends against inadvertent and malicious misuse of
memory that may compromise program correctness and security. A
critical element of memory safety is zero initialization. The direct
cost of zero initialization is surprisingly high: up to 12.7%, with
average costs ranging from 2.7% to 4.5% on a high-performance virtual
machine on IA32 architectures. Zero initialization also incurs
indirect costs due to its memory bandwidth demands and cache
displacement effects. Existing virtual machines either: a) minimize
direct costs by zeroing in large blocks, or b) minimize indirect costs
by zeroing in the allocation sequence, which reduces cache
displacement and bandwidth. This paper evaluates the two widely used
zero initialization designs, showing that they make different
tradeoffs to achieve very similar performance.

Our analysis inspires three better designs: (1) bulk zeroing with
cache-bypassing (non-temporal) instructions to reduce the direct and
indirect zeroing costs simultaneously, (2) concurrent non-temporal
bulk zeroing that exploits parallel hardware to move work off the
application’s critical path, and (3) adaptive zeroing, which
dynamically chooses between (1) and (2) based on available hardware
parallelism. The new software strategies offer speedups sometimes
greater than the direct overhead, improving total performance by 3% on
average. Our findings invite additional optimizations and
microarchitectural support.


> Here is the paper link and abstract:
> http://users.cecs.anu.edu.au/~steveb/downloads/pdf/zero-oopsla-2011.pdf
> Managed languages use memory safety to defend against inadvertent and
> malicious misuse of memory. Unmanaged native languages are
> increasingly integrating memory safety for the same reasons. A
> critical element of memory safety is initializing new memory before
> the program obtains it. Our experiments show that zero initialization
> is surprisingly expensive in a highly optimized managed runtime — on
> average the direct cost of zeroing is 4% to 6% and up to 50% of total
> application time on a variety of modern processors. Zeroing incurs
> indirect costs as well, which include memory bandwidth consumption and
> cache displacement. Existing virtual machines (VMs) either: a)
> minimize direct costs by zeroing in large blocks, or b) minimize
> indirect costs by integrating zeroing into the allocation sequence to
> reduce cache displacement.
> This paper first describes and evaluates zero initialization costs and
> the two existing design points. Our microarchitectural analysis of
> prior designs inspires two better designs that exploit concurrency and
> non-temporal cache-bypassing instructions to reduce the direct and
> indirect costs simultaneously. We show that the best strategy is to
> adaptively choose between the two new designs based on CPU
> utilization. This approach improves over widely used hot-path zeroing
> by 3% on average and up to 15% on the newest Intel i7-2600 processor,
> without slowing down any of the benchmarks. These results indicate
> that zero initialization is a surprisingly important source of
> overhead in existing VMs and that our new software strategies are
> effective at reducing this overhead. These findings also invite other
> optimizations, including software elision of zeroing and
> microarchitectural support.
> Regards.

More information about the hotspot-dev mailing list