enhancement of cmpxchg and copy_to_survivor for ppc64

Hiroshi H Horii HORII at jp.ibm.com
Fri Apr 8 10:53:48 UTC 2016

Dear all:

Can I please request reviews for the following change?
This change was created for JDK 9 and ppc64.

This change adds compare-and-exchange options for the POWER architecture.
As described in atomic_linux_ppc.inline.hpp, the current implementation of
cmpxchg is fence_cmpxchg_acquire. This implementation is suitable for
general use because the two sync calls, one before and one after cmpxchg,
preserve consistency. However, they can add significant overhead because
sync instructions are very expensive in the current POWER chip design.
With this change, callers can explicitly specify whether to run fence and
acquire by using two additional bool parameters. Because their default values
are "true", it is not necessary to modify existing cmpxchg calls.
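
To illustrate the shape of the proposed interface, here is a minimal sketch
in portable C++ rather than the PPC64 inline assembly of the actual patch.
The name cmpxchg_sketch and the use of std::atomic are assumptions made for
illustration only; the leading barrier is modelled with a seq_cst fence and
the trailing one with an acquire fence, which only approximates what the
real code emits.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the proposed cmpxchg; not HotSpot's actual code.
// Returns the value observed at *dest (equal to compare_value on success).
inline intptr_t cmpxchg_sketch(intptr_t exchange_value,
                               std::atomic<intptr_t>* dest,
                               intptr_t compare_value,
                               bool fence = true,
                               bool acquire = true) {
  if (fence) {
    // Full barrier before the CAS (roughly the leading sync on POWER).
    std::atomic_thread_fence(std::memory_order_seq_cst);
  }
  intptr_t observed = compare_value;
  // The CAS itself carries no ordering; the optional fences supply it.
  dest->compare_exchange_strong(observed, exchange_value,
                                std::memory_order_relaxed,
                                std::memory_order_relaxed);
  if (acquire) {
    // Trailing barrier, modelled here as an acquire fence.
    std::atomic_thread_fence(std::memory_order_acquire);
  }
  return observed;
}
```

Existing callers would keep writing cmpxchg_sketch(new_value, &word, old_value)
and get both barriers; a caller that can prove it needs neither would pass
false, false. When the flags are compile-time constants and the function is
inlined, a compiler can drop the untaken branches.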

In addition, using the new cmpxchg parameters, this change improves the
performance of copy_to_survivor in the parallel GC.
copy_to_survivor updates forwarding pointers by using cmpxchg. In my
understanding, this operation doesn't require any sync instructions:
a pointer is changed at most once in a GC and, when cmpxchg fails,
the latest pointer is available to the caller.
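
The install-once pattern described above can be sketched as follows. This is
a simplified model, not the HotSpot code: ObjSketch, its forwardee field and
forward_to_sketch are hypothetical names, and the real implementation stores
the forwarding pointer in the object's mark word via cas_set_mark. Whether
relaxed ordering is sufficient is the assumption stated above, not something
the sketch demonstrates.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical object with a separate forwarding slot (illustration only).
struct ObjSketch {
  std::atomic<ObjSketch*> forwardee{nullptr};
  int payload = 0;
};

// Tries to install my_copy as the forwardee of from without any barrier.
// Returns the winning forwardee: my_copy if this thread won the race,
// otherwise the pointer another thread installed first.
inline ObjSketch* forward_to_sketch(ObjSketch* from, ObjSketch* my_copy) {
  ObjSketch* expected = nullptr;
  if (from->forwardee.compare_exchange_strong(expected, my_copy,
                                              std::memory_order_relaxed,
                                              std::memory_order_relaxed)) {
    return my_copy;
  }
  // On failure, expected now holds the pointer that is already installed.
  return expected;
}
```

Because the slot changes from null to a copy at most once, a losing thread
gets the winner's pointer directly from the failed compare-and-exchange and
does not need to retry.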

When I evaluated SPECjbb2013 (slightly customized because the obsolete grizzly
doesn't support the new version format of Java 9), the pause time of young GC
was reduced by 10% to 20%.

Summary of source code changes:

* src/share/vm/runtime/atomic.hpp
* src/share/vm/runtime/atomic.cpp
* src/os_cpu/linux_ppc/vm/atomic_linux_ppc.inline.hpp
       - Add two arguments, fence and acquire, to cmpxchg only for PPC64.
         Though cmpxchg in atomic_linux_ppc.inline.hpp has some branches,
         they are optimized away when cmpxchg is inlined into callers.

* src/share/vm/oops/oop.inline.hpp
       - Changed cas_set_mark to call cmpxchg without fence and acquire.
         cas_set_mark is called only by cas_forward_to, which is called only by
         copy_to_survivor_space and oop_promotion_failed in 

Code change:

   Please see the attached diff file, which was generated with "hg diff -g"
   in the latest hotspot directory.

Passed test:
    SPECjbb2013 (customized)

* I believe some other cmpxchg calls can be optimized by dropping fence
  or acquire, because two sync calls are more conservative than the
  Java memory model requires.

Hiroshi Horii, Ph.D.
IBM Research - Tokyo

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ppc64_cmpxchg_opt.diff
Type: application/octet-stream
Size: 8837 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20160408/6be02528/ppc64_cmpxchg_opt-0001.diff>