Request for reviews (M): 7059037: Use BIS for zeroing on T4
vladimir.kozlov at oracle.com
Fri Aug 26 07:51:28 PDT 2011
Thank you, Tom and Christian, for the reviews.
On 8/26/11 12:02 AM, Christian Thalinger wrote:
> Looks good. -- Christian
> On Aug 26, 2011, at 3:47 AM, Vladimir Kozlov wrote:
>> Thank you, Tom
>> I updated the webrev with your and Christian's suggestions:
>> Tom Rodriguez wrote:
>>> src/share/vm/gc_interface/collectedHeap.inline.hpp, src/share/vm/oops/cpCacheKlass.cpp:
>>> Please use an ifdef block instead of the expression form.
>>> You might consider using more sophisticated predicates to statically rule out ClearArrays with constant arguments. Something like:
>>> predicate(!n->in(1)->is_Con() || n->in(1)->find_intptr_t_con() > BlkZeroingLowLimit)
>>> That would reduce any overhead for large instances that will never benefit from BIS.
>> Done. I thought about that but found that such cases are rare: either the expression that calculates the count is complex (because we mostly do partial zeroing), or, when the object is small with a constant count, ClearArray is replaced with stores in the ideal transformation. But I agree it still may help.
>>> Could we use block instead of blk? Otherwise this looks good.
>>> On Aug 24, 2011, at 5:52 PM, Vladimir Kozlov wrote:
>>>> 7059037: Use BIS for zeroing on T4
>>>> On T4, a BIS to the beginning of a cache line always zeros it. Use it for zeroing newly
>>>> allocated Java objects. The main code is in MacroAssembler::bis_zeroing() and is
>>>> used by C2 generated code (ClearArray), runtime (Copy::fill_to_aligned_words())
>>>> and template interpreter (TemplateTable::_new()). New stub zero_aligned_words
>>>> was added to use in runtime.
>>>> BIS is used only for objects bigger than BlkZeroingLowLimit (2Kbyte) since it
>>>> requires membar. 2Kbyte was selected based on microbenchmark results.
>>>> I also added wrasi(Reg, immI) instruction which I used during development.
>>>> VM_Version::has_mru_blk_init() is replaced with has_blk_zeroing() since original
>>>> was not used.
>>>> Zap new object in CollectedHeap::allocate_from_tlab_slow() instead of zeroing it
>>>> since it will be cleaned later in init_obj().
>>>> Fixed call sites of check_for_bad_heap_word_value() where klass is not
>>>> initialized to avoid the verification failure.
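
The size cutoff described above, together with the predicate suggested in the review, can be sketched in plain C++ roughly as follows. This is only an illustration, not the HotSpot code: kBlockZeroingLowLimit, ZeroPath, choose_zero_path() and block_variant_possible() are hypothetical names standing in for the BlkZeroingLowLimit flag and the ClearArray matching logic, and the limit is assumed to be counted in bytes.

```cpp
#include <cstddef>

// Hypothetical stand-in for the BlkZeroingLowLimit flag (2 Kbyte per the
// change description). Assumed here to be a byte count.
constexpr std::size_t kBlockZeroingLowLimit = 2 * 1024;

// Which zeroing strategy would be used for a region.
enum class ZeroPath { PlainStores, BlockInit };

// Runtime decision: block-initializing stores need a membar afterwards,
// so they are only used for regions strictly bigger than the limit.
inline ZeroPath choose_zero_path(std::size_t bytes) {
  return bytes > kBlockZeroingLowLimit ? ZeroPath::BlockInit
                                       : ZeroPath::PlainStores;
}

// Static decision, mirroring the shape of the reviewed predicate
//   !is_Con() || con > BlkZeroingLowLimit
// The block variant is ruled out only when the count is a known constant
// that does not exceed the limit; an unknown count must keep it available.
inline bool block_variant_possible(bool count_is_constant,
                                   std::size_t constant_bytes) {
  return !count_is_constant || constant_bytes > kBlockZeroingLowLimit;
}
```

With these assumptions, a 2048-byte region stays on plain stores and a 2049-byte region takes the block path. A constant count of 1024 rules out the block variant at match time, while a non-constant count leaves the choice to the runtime check.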