RFR: JDK-8211727: Adjust default concurrency settings for running tests on Sparc

David Holmes david.holmes at oracle.com
Wed Nov 14 22:40:19 UTC 2018


Thanks for the further info, Erik. I guess we will see how this plays out 
over time.

Thanks,
David

On 15/11/2018 3:05 am, Erik Joelsson wrote:
> On 2018-11-13 20:03, David Holmes wrote:
>> Hi Erik,
>>
>> Thanks for all the work you did in trying to stabilize this.
>>
>> One comment ...
>>
>> On 14/11/2018 7:34 am, Erik Joelsson wrote:
>>> This patch changes the formula for default test concurrency in 
>>> RunTest.gmk. The current formula is:
>>>
>>> min(cpus/2, 12)
>>>
>>> This seems to work well enough on the x64 machines we currently run 
>>> our tests on, but less so for Sparc. I have now run rather extensive 
>>> testing in our lab and have come up with a new formula that provides 
>>> much better test reliability while preserving as much test throughput 
>>> as possible. The new formula is cpus/4 for Sparc machines with up to 
>>> 16 cpus and cpus/5 for larger machines. For non-Sparc it's still 
>>> cpus/2, and I've removed the cap of 12 for all platforms.
>>
>> I'm surprised that you removed the cap and that it is okay. IIRC we 
>> had problems with large #CPU machines but only medium amounts of RAM. 
>> Too high a concurrency level would result in memory exhaustion.
>>
> Dan brought this up too in chat when I first suggested it. I looked 
> through all available machines in Mach5. There are 2 non-SPARC machines 
> that have enough CPUs to be affected by the cap and they have 256GB of 
> RAM, which is plenty. The rest are SPARC and they have at least 1GB of 
> RAM per CPU, usually more. So with the proposed scheme, I can't see 
> anything really changing with regard to JOBS vs RAM. My testing did not 
> reveal any such problems when I scaled down concurrency enough. Also 
> note that the biggest SPARC we have has 64 CPUs, which will now 
> translate into 13 jobs, which is just 1 more than the previous cap.
> 
> We do have a separate issue with a few Macs with low RAM compared to 
> CPUs (4GB and 8 CPUs) and I intend to attack that next. My plan is 
> basically to do something similar to what configure is doing for build 
> jobs (which is JOBS=min(cpus, RAM in GB)). The exact formula is to be 
> determined. I suspect it's going to involve a constant for the RAM part 
> to make room for the test harness, so something like (RAM - k)/x.
> 
> /Erik
> 
>> Thanks,
>> David
>>
>>> In addition to this, since Sparc machines generally have lower per 
>>> thread performance, at least when running JDK tests, I have bumped 
>>> the default timeout factor from 4 to 8 for Sparc.
>>>
>>> With these defaults, we were able to remove a lot of special cases 
>>> for Sparc in other parts of our configurations and I was able to get 
>>> clean runs of all the lower tiers of testing, on each of our machine 
>>> classes in the lab.
>>>
>>> In addition to this, the test 
>>> compiler/jsr292/ContinuousCallSiteTargetChange.java, which had its 
>>> timeout increased in JDK-8212028, no longer needs an increased 
>>> timeout with the new defaults.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8211727
>>>
>>> Webrev: http://cr.openjdk.java.net/~erikj/8211727/webrev.01/
>>>
>>> /Erik
>>>
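
The defaults discussed in this thread can be sketched as below. This is an
illustrative Python sketch, not the RunTest.gmk code. The function and
parameter names are hypothetical, and round-to-nearest is an assumption
inferred from the statement that 64 CPUs translate into 13 jobs (64/5 = 12.8).
The RAM-based limit is only the shape Erik proposes, (RAM - k)/x; the thread
gives no values for k and x, so the caller must supply them.

```python
def default_test_jobs(cpus: int, arch: str) -> int:
    """Default test concurrency per the thread: cpus/4 for Sparc with up to
    16 cpus, cpus/5 for larger Sparc machines, cpus/2 otherwise, no cap."""
    if arch == "sparc":
        divisor = 4 if cpus <= 16 else 5
    else:
        divisor = 2
    # Assumption: round to nearest (half up), and never go below 1 job.
    return max(1, (2 * cpus + divisor) // (2 * divisor))


def default_timeout_factor(arch: str) -> int:
    """Default timeout factor per the thread: 8 for Sparc, 4 otherwise."""
    return 8 if arch == "sparc" else 4


def ram_capped_jobs(cpus: int, arch: str, ram_gb: int,
                    reserve_gb: int, gb_per_job: int) -> int:
    """Hypothetical RAM-aware variant: min(cpu-based jobs, (RAM - k)/x).
    reserve_gb (k) and gb_per_job (x) are placeholders; the thread leaves
    the exact formula to be determined."""
    by_ram = (ram_gb - reserve_gb) // gb_per_job
    return max(1, min(default_test_jobs(cpus, arch), by_ram))


if __name__ == "__main__":
    print(default_test_jobs(64, "sparc"))
    print(default_test_jobs(16, "sparc"))
    print(default_timeout_factor("sparc"))
```

With these assumptions, the 64-CPU SPARC case from the thread yields 13 jobs,
one more than the old cap of 12.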


More information about the build-dev mailing list