RFR(s): 8013171: G1: C1 x86_64 barriers use 32-bit accesses to 64-bit PtrQueue::_index

Per Liden per.liden at oracle.com
Thu Apr 23 16:40:04 UTC 2015

Hi Thomas,

> On 23 Apr 2015, at 13:16, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> Hi,
> On Thu, 2015-04-23 at 10:52 +0200, Per Liden wrote:
>> Hi,
>> (This change affects G1, but it's touching code in C1 so I'd like to ask 
>> someone from the compiler team to also review this)
>> Summary: The G1 barriers load and update the PtrQueue::_index field. 
>> This field is a size_t but the C1 versions of these barriers aren't 
>> 64-bit clean. The bug has more details.
>> In addition I've massaged the code a little bit, so that the 32-bit and 
>> 64-bit sections look more similar (and as a bonus I think we avoid an 
>> extra memory load on 32-bit).
>> Webrev: http://cr.openjdk.java.net/~pliden/8013171/webrev.0/
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8013171
>> Testing:
>> * gc-test-suite on both 32 and 64-bit builds (with -XX:+UseG1GC 
>> -XX:+TieredCompilation -XX:TieredStopAtLevel=3 -XX:+VerifyAfterGC)
>> * Passes jprt
> Looks good, with the following caveats which should be decided by
> somebody else if they are important as they are micro-opts:
>  - instead of using cmp to compare against zero in a register, it would
> be better to use the test instruction (e.g. __ testX(tmp, tmp)) as it saves
> a byte of encoding per instruction with the same effect.
>  - post barrier stub: I would prefer if the 64 bit code did not
> push/pop the rdx register to free tmp. There are explicit rscratch1/2
> registers for temporaries available on that platform. At least rscratch1
> (=r8) seems to be used without save/restore in the original code already.
> This would also remove the need for 64 bit code to push/pop any register it
> seems to me.
>  - the original code only pushed/popped rbx when there was need to. Now
> the generated code pushes/pops rdx always.
> In general, the new code is easier to follow (and unifies 32/64 bit code
> paths), but seems slightly worse in execution time to me (without testing,
> just gut feeling). It probably won't matter at the end of the day.

Thanks for looking at the patch!

I don’t think these optimizations will make a difference given the nature of C1, but let’s see if someone has a different opinion.
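
For readers not familiar with the "64-bit clean" issue from the summary, here is a minimal C++ sketch of how a 32-bit access to a 64-bit field can misbehave. It is only an illustration under stated assumptions: FakeQueue and the load/store helpers are hypothetical names, not the HotSpot PtrQueue or the C1 barrier code, and the behaviour shown assumes a little-endian target such as x86_64.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a queue whose index is 64 bits wide,
// like a size_t on x86_64. This is not the real PtrQueue.
struct FakeQueue {
    std::uint64_t index;
};

// Reads only the first 4 bytes of the field, roughly what a 32-bit
// load would do. On little-endian these are the low 32 bits.
std::uint32_t load_index_32(const FakeQueue& q) {
    std::uint32_t v;
    std::memcpy(&v, &q.index, sizeof v);
    return v;
}

// Writes only the first 4 bytes of the field. The other 4 bytes keep
// whatever value they had before the store.
void store_index_32(FakeQueue& q, std::uint32_t v) {
    std::memcpy(&q.index, &v, sizeof v);
}

// Full-width accesses, which is what a 64-bit clean barrier would use.
std::uint64_t load_index_64(const FakeQueue& q) {
    return q.index;
}

void store_index_64(FakeQueue& q, std::uint64_t v) {
    q.index = v;
}
```

With index set to 0x100000000, load_index_32 would see 0 while load_index_64 sees the real value. Likewise, a store_index_32 of 8 over 0x100000010 would leave 0x100000008 in the field, because the upper half is never written. As long as the index fits in 32 bits both widths agree, which may be why this kind of mismatch can go unnoticed.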

