Replacement of Quicksort in java.util.Arrays with new Dual-Pivot Quicksort
Vladimir Yaroslavskiy
Vladimir.Yaroslavskiy at Sun.COM
Fri Sep 11 10:35:22 UTC 2009
Hello All,
I'd like to share with you new Dual-Pivot Quicksort which is
faster than the known implementations (theoretically and
experimental). I'd like to propose to replace the JDK's
Quicksort implementation by new one.
Description
-----------
The classical Quicksort algorithm uses the following scheme:
1. Pick an element P, called a pivot, from the array.
2. Reorder the array so that all elements, which are less than
the pivot, come before the pivot and all elements greater than
the pivot come after it (equal values can go either way). After
this partitioning, the pivot element is in its final position.
3. Recursively sort the sub-array of lesser elements and the
sub-array of greater elements.
The invariant of classical Quicksort is:
[ <= p | >= p ]
There are several modifications of the schema:
[ < p | = p | > p ] or [ = p | < p | > p | = p ]
But all of them use *one* pivot.
The new Dual-Pivot Quicksort uses *two* pivots elements in this manner:
1. Pick an elements P1, P2, called pivots from the array.
2. Assume that P1 <= P2, otherwise swap it.
3. Reorder the array into three parts: those less than the smaller
pivot, those larger than the larger pivot, and in between are
those elements between (or equal to) the two pivots.
4. Recursively sort the sub-arrays.
The invariant of the Dual-Pivot Quicksort is:
[ < P1 | P1 <= & <= P2 } > P2 ]
The new Quicksort is faster than current implementation of Quicksort
in JDK (L. Bentley and M. Douglas McIlroy) and classical Quicksort.
The full description of the Dual-Pivot Quicksort you can find
on my page: http://iaroslavski.narod.ru/quicksort
Performance tests
-----------------
Here is result of running on different types of input data:
Client VM all 85% organ 0..1 0..4
random ascend descend equal equal pipes random 010101 random
Dual-Pivot: 16.83 5.31 5.47 0.35 0.68 10.59 1.06 1.02 2.18
Bentley's: 19.77 9.08 10.13 0.63 1.12 13.22 1.63 1.08 2.49
Server VM all 85% organ 0..1 0..4
random ascend descend equal equal pipes random 010101 random
Dual-Pivot: 23.94 6.68 6.63 0.43 0.62 17.14 1.42 1.96 3.41
Bentley's: 25.20 10.18 10.32 2.07 1.33 16.72 2.95 1.82 3.39
The a lot of other tests have been run under client and server mode.
The most interesting is BentleyBasher test framework. It runs battery
of tests for all cases:
{ 100, 1000, 10000, 1000000 } x
{ sawtooth, rand, stagger, plateau, shuffle } x
{ ident, reverse, reverse_front, reverse_back, sort, dither}
where
100, ... , 1000000 - array length
sawtooth: x[i] =i%m
rand: x[i] = rand() % m
stagger: x[i] = (i*m + i) % n
plateau: x[i] = min(i, m)
shuffle: x[i] = rand()%m? (j+=2): (k+=2)
ident(x) - a copy of x
reverse(x, 0, n) - reversed copy
reverse_front(x, 0, n/2) - front half reversed
reverse_back(x, n/2, n) - back half reversed
sort(x) - an ordered copy
dither(x) - add i%5 to x[i]
Here is the result of execution:
Server VM: http://spreadsheets.google.com/pub?key=t_EAWUkQ4mD3BIbOv8Fa-AQ&output=html
Client VM: http://spreadsheets.google.com/pub?key=tdiMo8xleTxd23nKUObcz0Q&single=true&gid=0&output=html
Mathematical investigations
---------------------------
It is proved that for the Dual-Pivot Quicksort the average number of
comparisons is 2*n*ln(n), the average number of swaps is 0.8*n*ln(n),
whereas classical Quicksort algorithm has 2*n*ln(n) and 1*n*ln(n)
respectively. Full mathematical proof see in attached proof.txt
and proof_add.txt files. Theoretical results are also confirmed
by experimental counting of the operations.
Diff between current and new implementation of Quicksort
--------------------------------------------------------
Here is the link to the diff for java.util.Arrays class:
http://cr.openjdk.java.net/~alanb/6880672/webrev.00
If you like to look and play with new algorithm,
please, take attached class DualPivotQuicksort.java
Feedback
--------
Also I'd like to share a feedback from Joshua Bloch and
Jon Bentley who spent a lot of time investigating this
algorithm, who gave me many advices and tips how to
make new Quicksort better.
-------- Original Message --------
Subject: Re: Integration of new Dual-Pivot Quicksort into JDK 7
Date: Thu, 10 Sep 2009 07:20:11 -0700
From: Joshua Bloch <jjb at google.com>
Jon also says that Vladimir should make every reasonable improvement to
the basic method before checking in the code. In his words, "It would be
horrible to put the new code into the library, and then have someone
else come along and speed it up by another 20% by using standard
techniques." I believe it's not unlikely that this code may end up
getting ported to many languages and widely deployed in much the manner
of Bentley and McIlroy's fine sort (which is nearing 20 successful years
in the field). Jon will help Vladimir do this.
-------- Original Message --------
Subject: Dual-Pivot Quicksort: Next Steps
Date: Wed, 09 Sep 2009 15:02:25 -0400
From: Jon Bentley <jbentley at avaya.com>
Vladimir, Josh,
I *finally* feel like I understand what is going on. Now that I
(think that) I see it, it seems straightforward and obvious.
Tony Hoare developed Quicksort in the early 1960s. I was very
proud to make minor contributions to a particularly clean (binary)
quicksort in the mid 1980s, to a relatively straightforward, industrial
strength Quicksort with McIlroy in the early 1990s, and then to
algorithms and data structures for strings with Sedgewick in the mid 1990s.
I think that Vladimir's contributions to Quicksort go way beyond
anything that I've ever done, and rank up there with Hoare's original
design and Sedgewick's analysis. I feel so privileged to play a very,
very minor role in helping Vladimir with the most excellent work!
-----------------------------------------------
Let me know, if you have any questions/comments.
Thank you,
Vladimir
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: proof_add.txt
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20090911/6c3ba31b/proof_add.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: DualPivotQuicksort.java
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20090911/6c3ba31b/DualPivotQuicksort.java>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: proof.txt
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20090911/6c3ba31b/proof.txt>
More information about the core-libs-dev
mailing list