8020860: cluster Hashtable/Vector field updates for better transactional memory behaviour
mike.duigou at oracle.com
Thu Apr 3 23:42:23 UTC 2014
On Mar 25 2014, at 21:21 , David Holmes <David.Holmes at oracle.com> wrote:
> On 26/03/2014 6:37 AM, Mike Duigou wrote:
>> Hello all;
>> Recently HotSpot gained additional support for transactional memory, <https://bugs.openjdk.java.net/browse/JDK-8031320>. This patch is a libraries follow-on to that change. RTM and other transactional memory implementations benefit from clustering writes towards the end of the transaction whenever possible. This change optimizes the behaviour of two collection classes, Hashtable and Vector, by moving several field updates so that they are clustered together near the end of the transaction. Yes, we know, these are obsolete collections, but these two classes were used as the basis for the original benchmarking and evaluation during the development of the transactional memory JVM features. Future changes providing similar optimizations to other collections will be pursued when it can be shown they offer value and don't add a cost to non-TM performance (TM is not yet a mainstream feature).
>> It is not expected that this change will have any meaningful impact upon performance (either positive or negative) outside of TM-enabled configurations. The main change is to move existing field updates towards the end of the transaction and avoid conditionals between field updates.
>> There is a slight behaviour change introduced in this changeset. Previously some methods updated the modCount unconditionally, even when an ArrayIndexOutOfBoundsException was subsequently thrown for an invalid index and the Vector was not modified. With this change the modCount will only be updated if the Vector is actually changed. It is not expected that applications will have relied, or should have relied, on this behaviour.
> I could live with that change in behaviour, but this change completely breaks the fail-fast semantics of the iterators in some cases! If you don't update modCount until after the change is complete, the iterator may access the updated state and not throw CME!
For Vector I don't see this. Iterator access to the data structures is always done with the Vector.this lock held. The re-ordering would only be observable to another thread if it is reading the Vector fields without holding the lock. I am not sure we should worry about that case.
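To make the Vector argument concrete, here is a rough sketch of the pattern. It is not the actual patch or the JDK source: ClusteredVector and nextChecked are hypothetical names made up for illustration. The index is validated before any field is written, the field writes are grouped at the end of the synchronized method, and the iterator-style read takes the same lock before checking modCount, so it cannot observe a half-finished update.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;

// Hypothetical, simplified stand-in for java.util.Vector (illustration only).
class ClusteredVector {
    private Object[] elementData = new Object[10];
    private int elementCount;
    private int modCount;

    synchronized void insertElementAt(Object obj, int index) {
        // Validate first: a bad index throws before any field is written,
        // so modCount is left untouched when nothing changes.
        if (index < 0 || index > elementCount) {
            throw new ArrayIndexOutOfBoundsException(index + " > " + elementCount);
        }
        // Reads and array work happen on a local reference.
        Object[] data = elementData;
        if (elementCount == data.length) {
            data = Arrays.copyOf(data, data.length * 2);
        }
        System.arraycopy(data, index, data, index + 1, elementCount - index);
        data[index] = obj;
        // Field writes clustered at the end, with no conditionals between them.
        elementData = data;
        elementCount++;
        modCount++;
    }

    // Iterator-style read: takes the same lock as the writer before checking
    // modCount, so the order of the writer's field updates is not observable here.
    synchronized Object nextChecked(int cursor, int expectedModCount) {
        if (modCount != expectedModCount) {
            throw new ConcurrentModificationException();
        }
        return elementData[cursor];
    }

    synchronized int size() { return elementCount; }

    synchronized int modCount() { return modCount; }
}
```

A reader that bypassed the lock could still see the fields in an intermediate state, which is the case I am not sure we should worry about.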
For the Hashtable Iterator there is no synchronization on the owning Hashtable except during the remove() method. It is unclear why the Hashtable iterators were not written in the same way as Vector's. It seems like adding synchronization to Hashtable's iterators would cause massive disruption. Are the Hashtable iterators actually fail-fast? Without synchronization this is not guaranteed, since the writes may not be visible, and Hashtable iterator failure behaviour is already likely to vary between platforms/architectures. With RTM it's presumed that the writes will NOT be visible until the transaction completes. This implies that the failure mode from Hashtable iterators is likely to change just by turning RTM locking on, whether we make this code change or not. :-(
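For reference, a small sketch of the one case where the Hashtable iterator check is deterministic: a structural modification made from the same thread that is iterating. HashtableFailFastDemo is a made-up name for illustration. It says nothing about the cross-thread case discussed above, where detection is best-effort only.

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Iterator;

// Illustration only: same-thread fail-fast behaviour of a Hashtable iterator.
class HashtableFailFastDemo {
    // Returns true if the iterator reported the modification.
    static boolean sameThreadModificationDetected() {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        table.put("b", 2);

        Iterator<String> it = table.keySet().iterator();
        it.next();

        // Adding a new key is a structural change and bumps the table's modCount.
        table.put("c", 3);

        try {
            // next() compares modCount with the value captured at iterator creation.
            it.next();
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }
}
```

Calling HashtableFailFastDemo.sameThreadModificationDetected() exercises the check within a single thread. When the put() comes from another thread and the iterator does not hold the table's lock, nothing forces the iterator to see the new modCount at any particular point.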
> I think this change is misguided.
I think we are fine for Vector, but Hashtable gives me concerns even in its current state.