RFR (M) 8059461: Refactor IndexSet for better performance (preliminary)
albert.noll at oracle.com
Thu Nov 6 09:36:52 UTC 2014
It would be good to have data from different applications / architectures.
Does anyone know more about the motivation for IndexSet, i.e., why it
was introduced in the first place?
On 11/05/2014 11:58 PM, Vladimir Kozlov wrote:
> I think it is nice cleanup with performance benefits :)
> Aleksey, can you compare average memory consumed by IndexSet before
> and after?
> Why you need initialize_in_resource_arena()? by default BitMap() uses
> resource area:
> BitMap(idx_t size_in_bits, bool in_resource_area = true);
> Make lrg_union() PhaseConservativeCoalesce class's method.
> On 11/5/14 1:26 PM, Aleksey Shipilev wrote:
>> Long story short: current implementation of IndexSet, while smart, is
>> too smart for its own good. Trying to be sparse, it loses locality.
>> Trying to be smart about bit tricks, it loses the native word length.
>> Because of that, sophisticated IndexSet does not yield a desired
>> performance benefit on compilation-heavy workloads like Nashorn.
>> Delegating the work to already existing BitMap both conserves the source
>> code, and brings more performance. (C1 also uses the BitMap adapter like
>> that for ValueMap-s).
>> IndexSet is a major data structure for representing IFG in C2 Regalloc,
>> and that is why improvements in IndexSet translate to faster register
>> allocation, and faster C2 compiles. If you gut the IndexSet internals,
>> and replace it with BitMap, the sample performance runs on Nashorn
>> running Octane suite yield reliable improvements in compilation speed
>> (average: 22.7 Kb/s -> 24.4 Kb/s), explained by the decrease in C2
>> regalloc time (average: 155s -> 132s).
>> These improvements are in line with predicted improvements from a trial
>> experiment of "coarsening" the IndexSet, see the relevant RFE:
>> In other words, we can have a performance-improving change which also
>> removes lots of code. Or, a cleanup change, which also improves
>> performance. Here it is:
>> The patch is mostly proof-of-concept, and not ready for commit. Please
>> let me know what you think about the approach. Code/style/idea
>> suggestions are welcome.
>> Brief summary of changes:
>> - Almost all contents of IndexSet are purged, and delegated to BitMap
>> - IndexSetIterator performance is important, and therefore the lookup
>> table approach from IndexSetIterator was transplanted to new
>> BitMapIterator. We might want to improve BitMap::get_next_one_offset
>> with lookup tables as well, but that will contaminate the current
>> - lrg_union was moved to appropriate place (why was it in IndexSet to
>> begin with?)
>> - some of IndexSet memory management / tracing functions were purged
>> The testing so far was very light:
>> - smoke tests with JPRT
>> - Nashorn/Octane benchmarks
More information about the hotspot-compiler-dev