removing donor threads for hsail allocation stability
tom.deneau at amd.com
Wed Aug 6 22:22:34 UTC 2014
Following up on the webrev below, I decided to try removing the whole idea of donor threads. Instead, a few fields were added to JavaThread to maintain an array of "special" tlabs, which are used only by the hsail gpu and initialized only on threads that invoke gpu kernels.
Can you please review?
The java side changes should be pretty straightforward, mostly replacing donorThread with a reference to the tlab itself.
On the C++ side:
* most of the hsail-specific changes were in gpu_hsail_Tlab.hpp but should be pretty straightforward
* since the new hsail_gpu_tlabs are not associated in the "usual" way with a thread, there are a few places, such as collectedHeap.cpp and threadLocalAllocBuffer.cpp, where we have to process them in addition to the normal thread tlab.
* Again, since the new hsail_gpu_tlabs are not associated in the normal way with a thread, the tlab now contains a pointer to the "owning thread". This is passed in through the initialize() call. The internal call to ThreadLocalAllocBuffer::mythread() uses this owning_thread field rather than the arithmetic offset calculations it had before.
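To make the last two points concrete, here is a minimal C++ sketch of the idea, not the actual HotSpot code. SketchThread, SketchTlab, gpu_tlabs, init_gpu_tlabs and for_each_tlab are hypothetical names made up for illustration; only the notion of an owning-thread pointer set in initialize() comes from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct SketchThread;  // hypothetical stand-in for JavaThread

// Hypothetical stand-in for ThreadLocalAllocBuffer.
struct SketchTlab {
    SketchThread* owning_thread = nullptr;  // assumption: set once by initialize()

    void initialize(SketchThread* owner) { owning_thread = owner; }

    // Returns the owner from the stored pointer, so it works even when the
    // tlab is not embedded at a fixed offset inside the thread object.
    SketchThread* my_thread() const { return owning_thread; }
};

struct SketchThread {
    SketchTlab tlab;                    // the normal, embedded tlab
    std::vector<SketchTlab> gpu_tlabs;  // extra tlabs, empty unless kernels are invoked

    SketchThread() { tlab.initialize(this); }
    SketchThread(const SketchThread&) = delete;             // copying would leave stale owners
    SketchThread& operator=(const SketchThread&) = delete;

    // Called only on threads that invoke gpu kernels.
    void init_gpu_tlabs(std::size_t count) {
        gpu_tlabs.assign(count, SketchTlab{});
        for (SketchTlab& t : gpu_tlabs) t.initialize(this);
    }

    // Code that walks "the tlab of each thread" must also visit the gpu tlabs.
    template <typename F>
    void for_each_tlab(F f) {
        f(tlab);
        for (SketchTlab& t : gpu_tlabs) f(t);
    }
};
```

In this sketch, a heap walk that previously touched only the embedded tlab would call for_each_tlab instead, and my_thread() gives the same answer for both kinds of tlab.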
From: Deneau, Tom
Sent: Wednesday, July 16, 2014 5:55 PM
To: graal-dev at openjdk.java.net
Subject: small webrev for hsail allocation stability
I have submitted a small webrev to solve the following problem.
The hsail allocation routines borrow TLABs from donor threads. The design depended on the donor threads not doing any allocation from those TLABs. The donor threads should be blocked on a CyclicBarrier. But in fairly rare cases we have seen what looked like the donor threads doing some allocations and interfering with the HSAIL kernel allocations. Maybe this is related to the "spurious wakeup" described in the Object.wait javadocs?
Anyway, this webrev zeroes out the donor thread tlabs as it copies the fields out into the TlabInfo structure used by the GPU. (This copy to TlabInfo has always been there; we just didn't zero out the donor thread tlab fields.) Now if the donor thread does spuriously wake up and needs to allocate anything, it will get a new tlab or wait. If it gets a new tlab, then in the post-kernel cleanup code, we retire that tlab as we copy fields back into the donor thread.
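A rough C++ sketch of that copy-out and copy-back sequence follows. This is an illustration under assumptions, not the webrev itself: DonorTlabFields, TlabInfoSketch, take_for_gpu and give_back are hypothetical names, the three fields are a simplification, and retirement is modeled as a simple counter.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified view of the tlab fields held by a donor thread.
struct DonorTlabFields { std::uintptr_t start, top, end; };

// Hypothetical stand-in for the TlabInfo structure handed to the GPU.
struct TlabInfoSketch { std::uintptr_t start, top, end; };

// Copy the donor's fields out for the kernel and zero the donor's copy, so a
// donor that wakes up unexpectedly cannot allocate from the same memory.
TlabInfoSketch take_for_gpu(DonorTlabFields& donor) {
    TlabInfoSketch info{donor.start, donor.top, donor.end};
    donor = DonorTlabFields{0, 0, 0};
    return info;
}

// Post-kernel cleanup: if the donor acquired a fresh tlab in the meantime,
// retire it (modeled here by bumping a counter), then copy the fields back.
// Returns true when a tlab had to be retired.
bool give_back(DonorTlabFields& donor, const TlabInfoSketch& info, int& retired_count) {
    const bool donor_got_new_tlab = donor.start != 0;
    if (donor_got_new_tlab) {
        ++retired_count;  // assumption: stands in for the real retire step
    }
    donor = DonorTlabFields{info.start, info.top, info.end};
    return donor_got_new_tlab;
}
```

In the normal case the donor stays blocked, give_back finds the zeroed fields untouched, and nothing is retired.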
Things were definitely more stable with this change.