Fri Jun 8 18:45:47 PDT 2012

I found that I can't use memory_type() because some ideal transformations (in 
mulnode.cpp) may replace a load from char[] with a short load and vice versa. 
So instead of comparing vector types I added a new method, 
same_velt_type(n1, n2), which compares element sizes for integer types. Also, 
there is no difference in vector instructions for the Char and Short types, so 
I removed the Char-related vector nodes and used Short vectors for the Char type.
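
To illustrate the comparison rule described above, here is a minimal 
standalone sketch. It is not the code from the webrev: ElemType, elem_size(), 
is_integer_type() and same_velt_type_sketch() are hypothetical names made up 
for this example, and treating boolean, byte, char, short and int as the 
"integer types" is my assumption here. It needs C++14 or later.

```cpp
#include <cstddef>

// Hypothetical element-type tag for illustration only; this is not
// HotSpot's BasicType enum.
enum class ElemType { Boolean, Byte, Char, Short, Int, Long, Float, Double };

// Element size in bytes, using the Java primitive sizes.
constexpr std::size_t elem_size(ElemType t) {
  switch (t) {
    case ElemType::Boolean:
    case ElemType::Byte:   return 1;
    case ElemType::Char:
    case ElemType::Short:  return 2;
    case ElemType::Int:
    case ElemType::Float:  return 4;
    case ElemType::Long:
    case ElemType::Double: return 8;
  }
  return 0;  // unreachable for valid enum values
}

// Assumption: the sub-int types and int are the "integer types" whose
// vectors can be matched by element size alone.
constexpr bool is_integer_type(ElemType t) {
  return t == ElemType::Boolean || t == ElemType::Byte ||
         t == ElemType::Char    || t == ElemType::Short ||
         t == ElemType::Int;
}

// Two element types are "the same" for vectorization if both are integer
// types of equal size; any other pair must match exactly.
constexpr bool same_velt_type_sketch(ElemType a, ElemType b) {
  return (is_integer_type(a) && is_integer_type(b))
             ? elem_size(a) == elem_size(b)
             : a == b;
}

// Char and Short are both 2-byte integer types, so they compare as equal.
static_assert(same_velt_type_sketch(ElemType::Char, ElemType::Short), "");
// Different element sizes are never the same vector element type.
static_assert(!same_velt_type_sketch(ElemType::Short, ElemType::Int), "");
// Same size but not both integer types: falls back to exact comparison.
static_assert(!same_velt_type_sketch(ElemType::Int, ElemType::Float), "");
```

With a rule of this shape it no longer matters whether an ideal 
transformation turned a char load into a short load, since both have 2-byte 
elements.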

I also added regression tests for boolean and char types.


Vladimir Kozlov wrote:
> I found another problem. My new regression test TestShortVect failed in 
> the overlapping case (load and store from the same array with offset). 
> Superword code thinks a store into a short[] array is a char store and 
> different from a short load from short[], so it allows overlapping. It 
> happened because C2 has only one StoreC node for 2-byte stores but 
> different LoadS and LoadUS nodes for loads. And these load nodes have 
> different memory types: T_SHORT and T_CHAR.
> The simplest change is to add a new StoreS node with memory type T_SHORT. 
> I like this approach and will go with it if nobody objects.
> Another solution is to extract the element type from the address type 
> instead of memory_type() for memory nodes. But it is duplication of parser 
> code.
> Note, memory_type() is called in other C2 places only to determine 
> element size. But in Superword it is used to construct the vector type 
> table.
> Thanks,
> Vladimir
> Vladimir Kozlov wrote:
>> Most changes after Tom's latest review were done in superword.cpp and 
>> added regression tests. I think I nailed down the latest issues I had with 
>> superword code. The changes pass all testing I did. Please review it 
>> (same web link).
>> Thanks,
>> Vladimir
>> On 4/3/12 10:03 AM, Vladimir Kozlov wrote:
>>> 7119644: Increase superword's vector size up to 256 bits
>>> Increase superword's vector size up to 256 bits for YMM AVX registers 
>>> on x86. Added generation of different vector sizes
>>> for different types of arrays in the same loop. Allow generation of 
>>> small (4-byte) vectors for loops which were unrolled
>>> a small number of iterations.
>>> Add new C2 types for vectors and rework VectorNode implementation. 
>>> Used MachTypeNode as base node for vector mach nodes
>>> to keep vector type.
>>> Moved XMM registers definition and vector instructions into one file 
>>> (have to rename eRegI to rRegI in
>>> Tested with full CTW, NSK, C2 regression tests, JPRT and added new test.
>>> Thanks,
>>> Vladimir

