# New portion of improvements for Dual-Pivot Quicksort

Wed May 5 04:57:42 UTC 2010

```Vladimir,

Hi. I'm thrilled that you were able to eke so much more perforance out of
Here's the code review:

I think the comment on lines 102-105 was too useful to delete:

102        ...  This
103      * method differs from the public {@code sort} method in that the
104      * {@code right} index is inclusive, and it does no range checking
105      * on {@code left} or {@code right}.

The sentinel technique that you use in lines 118 - 136 is questionable: you
are modifying a portion of the array outside the specified range of the
call, which arguably violates the contract of the call, and could be
observed in a multithreaded program. It's not beyond the realm of reason
that it could break existing clients. I will discuss it with Doug Lea and
let you know what he says.

If we leave this optimization in, we should change the comments slightly.
Appearing to comment-out code (other than entire assertions in inner loops)
is never a good thing, hence the change to line 130 and the addition of the
line that follows. Also I reworded the commennt in lines 122-125 for
clarity. Changed lines are starred, added line has a "+":

118         // Use insertion sort on tiny arrays
119         if (length < INSERTION_SORT_THRESHOLD) {
120             if (left > 0) {
121                 /*
*122                  * Temporarily set the value immediately preceding the
range
*123                  * to the minimum int value to serve as a sentinel.
This
*124                  * allows us to avoid the j >= left check on each
iteration.
125                  */
126                 int before = a[left - 1]; a[left - 1] =
Integer.MIN_VALUE;
127
128                 for (int j, i = left + 1; i <= right; i++) {
129                     int ai = a[i];
*130                     for (j = i - 1; ai < a[j]; j--) {
+NEW                         // assert j >= left;
131                         a[j + 1] = a[j];
132                     }
133                     a[j + 1] = ai;
134                 }
135                 a[left - 1] = before;
136             } else {

The comment in line 155 is misleading, and should be replace by this:

I'd reword the comment in lines 137-140 for clarity:

137                 /*
138                  * For case, when left == 0, traditional (without a
sentinel)
139                  * insertion sort, optimized for server VM, is used.
140                  */

155         // Inexpensive approximation of length / 7

The comment strategy for choosing sample points is subtle enough that it
deserves bit more commentary. I propose replacing line 158 with this:

/*
* Sort five evenly spaced elements around (and including) the center
* element in the range. These elements will be used for pivot selection
* as described below. The choice for spacing these elements was
empirically
* determined to work well on a wide variety of inputs.
*/

Lines 183 and 184 are unnecessary array accesses. Probably better is:

183         int pivot1 = ae2;
184         int pivot2 = ae4;

This is essentially just a renaming of these two local variables, and
presumably the compiler will act accordingly.

I prefer the original wording:

197              * Pointer k is the first index of ?-part

to the revised wording:

217              * Pointer k is the first index of not inspected ?-part.

I'd revert this change.

I'd change 190 from:

190         if (pivot1 < pivot2) {

to

190         if (pivot1 != pivot2) {

It's clearer (this is the "pivots differ" case), and at least as fast.

The spacing lines 249 and 250 is a bit quirky:

249             dualPivotQuicksort(a, left,   less - 2);
250             dualPivotQuicksort(a, great + 2, right);

I'd replace it with the more standard:

249             dualPivotQuicksort(a, left, less - 2);
250             dualPivotQuicksort(a, great + 2, right);

Alternatively, you could make corresponding arguments line up:

249             dualPivotQuicksort(a, left,      less - 2);
250             dualPivotQuicksort(a, great + 2, right   );

The policy change in line 256 is non-trivial:

Old:

298         if (less < e1 && great > e5) {

New:

256             if (great - less > 5 * seventh) {

I previously experimented with the "new-style" policy (length-based, rather
than endpoint-based), but didn't get good results. Perhaps the symmetric
style combines well with the change in interval size (5/7 vs. 2/3). At any
rate, it's simpler than the old-style and conforms to the comment, so if
you're satisfied with the performance, I'm happy with it.

In lines 362 and 363, you have the same "quirky spacing" as in linkes 249
and 250:

361             // Sort left and right parts recursively
362             dualPivotQuicksort(a, left,   less - 1);
363             dualPivotQuicksort(a, great + 1, right);

Regards,

Josh

On Thu, Apr 29, 2010 at 4:05 AM, Vladimir Iaroslavski <

> Hello,
>
> Thank you for pointing to this check, I have run test again and it
> works little it faster
> (this check was necessary for old version of insertion sort):
>
> 60.90% / 48.35% instead of 61.00% /  51.00% (client / server).
>
> I attached new version of the class (+ minor javdoc changes).
>
> Best regards,
>
> On Thu, Apr 29, 2010 at 12:18 PM, Goktug Gokdogan <gokdogan at gmail.com>
> wrote:
> > One quick comment: you can consider moving trivial array check so it will
> be
> > executed only if insertion sort threshold check passes:
> >
> > if (length < INSERTION_SORT_THRESHOLD) {
> >   if (length < 2) {
> >     return;
> >   }
> >   ....
> >
> > On Mon, Apr 26, 2010 at 2:50 PM, Vladimir Iaroslavski
> >>
> >> Hello, everyone!
> >>
> >> I've investigated the implementation of the Dual-Pivot Quicksort
> >> which is used for sorting primitives and here is my result:
> >>
> >> http://cr.openjdk.java.net/~alanb/6947216/webrev.00
> >>
> >> New implementation of Dual-Pivot Quicksort is faster
> >> than previous one of 12% for client VM and few percents for
> >> server VM. Tests on Bentley's test suite (Win XP, JDK 7,
> >> build 1.7.0-ea-b84, n = 1000000) show geometry mean 0.88
> >> for client VM and 0.98-1.00 for server VM.
> >>
> >> In compare with sorting from JDK 6 by Jon L. Bentley and
> >> M. Douglas McIlroy's (with one pivot) the ratio is 0.61 / 0.50
> >> (client / server).
> >>
> >> See the execution time for sorting array of 2`000`000 int elements
> >> 50 times, client / server VM, in milliseconds:
> >>
> >> random
> >>  new: 16723 18776
> >> jdk7: 17571 18975
> >> jdk6: 22241 26524
> >>
> >> ascendant
> >>  new:  3541  4702
> >> jdk7:  4486  4669
> >> jdk6:  8667  7978
> >>
> >> descendant
> >>  new:  3854  4907
> >> jdk7:  4942  5034
> >> jdk6:  8787  8243
> >>
> >> equal
> >>  new:   234   281
> >> jdk7:   291   230
> >> jdk6:   602  1018
> >>
> >> organ pipes
> >>  new:  7673  8613
> >> jdk7:  8167  8993
> >> jdk6: 11902 14044
> >>
> >> stagger 1
> >>  new:  7648  8591
> >> jdk7:  8161  8979
> >> jdk6: 11908 13810
> >>
> >> stagger 2
> >>  new:  8349  9299
> >> jdk7: 10968 11916
> >> jdk6: 12194 14734
> >>
> >> stagger 4
> >>  new:  8475  9622
> >> jdk7:  9221  9682
> >> jdk6: 10523 12006
> >>
> >> stagger 8
> >>  new:  9321 10689
> >> jdk7: 11125 12387
> >> jdk6: 13829 16214
> >>
> >> period 1..2
> >>  new:   758   751
> >> jdk7:   870   754
> >> jdk6:  1038  1227
> >>
> >> period 1..4
> >>  new:  1004   963
> >> jdk7:  1365  1209
> >> jdk6:  1511  1847
> >>
> >> period 1..8
> >>  new:  1588  1573
> >> jdk7:  1599  1790
> >> jdk6:  2602  3045
> >>
> >> random 1..2
> >>  new:  1327  1125
> >> jdk7:  1362  1496
> >> jdk6:  1531  2182
> >>
> >> random 1..4
> >>  new:  1830  2118
> >> jdk7:  1851  2236
> >> jdk6:  2292  3025
> >>
> >> where stagger(m) is array like a[i] = i * (m + 1) % length.
> >>
> >> The summary of changes is:
> >>
> >> 1. For sorting small arrays is used insertion sort with sentinel
> >>
> >> for (int i = left + 1; i <= right; i++) {
> >>    for (j = i; j > left && a[j-1] > a[j]; j--) {
> >>        swap(a[i], a[j-1]);
> >>    }
> >> }
> >>
> >> Note that range check j > left is performed on each iteration,
> >> but really helps very rare. To avoid this expensive range check,
> >> it was suggested to set minimum value (negative infinity) on the
> >> first position. This type of suggestion is used in new version:
> >>
> >> if left bound > 0, we can put sentinel on a[left - 1], do insertion
> >> sort without expensive check range, and then restore a[left - 1]
> >> value. If left == 0, traditional insertion sort is used. Please,
> >> look at the webrev for details.
> >>
> >> 2. In previous implementation 5 evenly spaced elements
> >>
> >> sixth = length / 6;
> >> a[sixth], a[2 * sixth], a[3 * sixth], a[4 * sixth], a[5 * sixth]
> >>
> >> were used as candidates of pivots elements. This case is very
> >> sensitive for period inputs, especially by 6. The new suggestion
> >> is to take 5 center evenly spaced elements like this:
> >>
> >> int seventh = length / 7;
> >> int midpoint = (left + right) >>> 1;
> >>
> >> a[midpoint - 2 * seventh], a[midpoint -     seventh], a[midpoint],
> >> a[midpoint +     seventh], a[midpoint + 2 * seventh]
> >>
> >> and moreover, the seventh is calculated inexpensively:
> >> seventh = (length >>> 3) + (length >>> 6) + 1;
> >>
> >> This schema works the same on random, ascendant, descendant, equal
> >> inputs, but much better for period / stagger.
> >>
> >> 3. The whole structure
> >>
> >> <choose pivots>
> >>
> >> if (pivotsDiffer) {
> >>    <do partitioning for two pivots>
> >> }
> >> else {
> >>    <do partitioning for one pivot>
> >> }
> >>
> >> <sort left and right parts>
> >>
> >> if (!pivotsDiffer) {
> >>    return;
> >> }
> >> <swap internal pivot values to ends>
> >>
> >> <sort center part>
> >>
> >> was modified to:
> >> ----------------
> >>
> >> <choose pivots>
> >>
> >> if (pivot1 < pivot2) {
> >>    <do partitioning for two pivots>
> >>    <swap pivots into their final positions>
> >>    <sort left and right parts>
> >>    <swap internal pivot values to ends>
> >>    <sort center part>
> >> }
> >> else {
> >>    <do partitioning for one pivot>
> >>    <sort left and right parts>
> >> }
> >>
> >> 4. Partitioning for both cases have not been changed at all.
> >>
> >> 5. Minor javadoc and format changes.
> >>
> >> Please, review new implementation,
> >> any comments / suggestions are welcome!
> >>
> >> Thank you,