RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

Martin Buchholz martinrb at google.com
Tue Apr 2 05:24:28 UTC 2013

Overall, this kind of change seems barely worth doing.


This change:

         // Let gc do its work
-        for (int i = 0; i < size; i++)
-            elementData[i] = null;
+        Arrays.fill(elementData, null);

changes the performance of clear from O(size) to O(capacity), which is bad,
and unrelated to the "empty collection optimization".


+        if(elementData == EMPTY_ELEMENTDATA) {

Please use standard jdk style - space after keywords such as "if".


 183     public void ensureCapacity(int minCapacity) {
 184         if (minCapacity > 0)
 185             ensureExplicitCapacity(minCapacity);
 186     }
 188     private void ensureCapacityInternal(int minCapacity) {
 189         if(elementData == EMPTY_ELEMENTDATA) {
 190             minCapacity = Math.max(10, minCapacity);
 191         }
 193         ensureExplicitCapacity(minCapacity);
 194     }

It looks to me like, if the user does

x = new ArrayList();

then the capacity will be 2, not 10, an observable change from the previous
implementation, and sort-of contradicting the spec of ArrayList()


 604         Arrays.fill(elementData, newSize, size, null);

In performance-critical code I would avoid Arrays.fill because it adds a
bit of overhead (unless it's intrinsified, which I don't think it is).


If we are willing to make small incompatible changes, how about when
serializing, storing capacity as the size, so that reconstituted ArrayLists
are pre-trimmed to size?


I still believe that with the help of the gc team, one can cook up a
trim-to-size when gc-ing fix, but that's going to be very hard to implement.


On Mon, Apr 1, 2013 at 7:00 PM, Mike Duigou <mike.duigou at oracle.com> wrote:

> Hello all;
> I have posted an updated version of the empty ArrayList and HashMap patch.
> http://cr.openjdk.java.net/~mduigou/JDK-7143928/1/webrev/
> This revised implementation introduces *no new fields* to either class.
> For ArrayList the lazy allocation of the backing array occurs only if the
> list is created at default size. According to our performance analysis
> team, approximately 85% of ArrayList instances are created at default size
> so this optimization will be valid for an overwhelming majority of cases.
> For HashMap, creative use is made of the threshold field to track the
> requested initial size until the bucket array is needed. On the read side
> the empty map case is tested with isEmpty(). On the write size a comparison
> of (table == EMPTY_TABLE) is used to detect the need to inflate the bucket
> array. In readObject there's a little more work to try to choose an
> efficient initial capacity.
> Mike
> On Mar 26 2013, at 17:25 , Mike Duigou wrote:
> > Hello all;
> >
> > This is a review for optimization work that came out of internal
> analysis of Oracle's Java applications. It's based upon analysis that shows
> that in large applications as much as 10% of maps and lists are initialized
> but never receive any entries. A smaller number spend a large proportion of
> their lifetime empty. We've found similar results across other workloads as
> well. This patch is not a substitute for pre-sizing your collections and
> maps--doing so will *always* have better results.
> >
> > This patch extends HashMap and ArrayList to provide special handling for
> newly created instances that avoids creating the backing array until
> needed. There is a very small additional cost for detecting when to inflate
> the map or list that is measurable in interpreted tests but disappears in
> JITed code.
> >
> > http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/
> >
> > We expect that should this code prove successful in Java 8 it will be
> backported to Java 7 updates.
> >
> > The unit test may appear to be somewhat unrelated. It was created after
> resolving a bug in an early version of this patch to detect the issue
> encountered (LinkedHashMap.init() was not being called in readObject() when
> the map was empty).
> >
> > Mike

More information about the core-libs-dev mailing list