# Re[2]: New portion of improvements for Dual-Pivot Quicksort

Thu Jun 3 14:17:57 PDT 2010

```Hello,

I tried your case (which is selection sort) and it works as expected: not worse
than "network" or "bubble" sorting. But nevertheless, the best choice is to use
insertion sort, I wrote more elegant implementation, see:

///int ae1 = a[e1], ae3 = a[e3], ae5 = a[e5], ae2 = a[e2], ae4 = a[e4];

// Sort these elements using insertion sort
if (a[e2] < a[e1]) { int t = a[e2]; a[e2] = a[e1]; a[e1] = t; }

if (a[e3] < a[e2]) { int t = a[e3]; a[e3] = a[e2]; a[e2] = t;
if (t < a[e1]) { a[e2] = a[e1]; a[e1] = t; }
}
if (a[e4] < a[e3]) { int t = a[e4]; a[e4] = a[e3]; a[e3] = t;
if (t < a[e2]) { a[e3] = a[e2]; a[e2] = t;
if (t < a[e1]) { a[e2] = a[e1]; a[e1] = t; }
}
}
if (a[e5] < a[e4]) { int t = a[e5]; a[e5] = a[e4]; a[e4] = t;
if (t < a[e3]) { a[e4] = a[e3]; a[e3] = t;
if (t < a[e2]) { a[e3] = a[e2]; a[e2] = t;
if (t < a[e1]) { a[e2] = a[e1]; a[e1] = t; }
}
}
}

///a[e1] = ae1; a[e3] = ae3; a[e5] = ae5; a[e2] = ae2; a[e4] = ae4;

Note that this implementation doesn't use local variables ae1, .. , ae5
at all, and without variables it works faster. This code is not too long,
extra 4 lines only. And if on client VM it works as other "network"
implementations, but on server VM it wins 1.2%.

In compare with first implementation of Dual-Pivot Quicksort, which
is used now in JDK 7, suggested version wins ~15% and 6% for client
and server modes.

Updated version of the class I will send tomorrow.

Dmytro,
could you please look at suggested insertion sort for 5 elements?

Do you have any comments/improvements? One place to be improved
is last two ifs "if (a[e4] < ..." and "if (a[e5] < ..." where
element is compared with all sorted elements, whereas we can save
comparisons by binary fork. But implementation becomes too complex
and long.

As it can be expected, the best sorting for small arrays is insertion,
then selection and then only bubble sort, even for 5 elements.

Best regards,

Wed, 2 Jun 2010 22:21:41 +0700 письмо от Dmytro Sheyko <dmytro_sheyko at hotmail.com>:

> Sure,
>
> The average number of swaps is 5 = 600/120.
> // bubble
> // [4 5][3 4][2 3][1 2][2 3][3 4][4 5][3 4][2 3][3 4]
> // size   10
> // swaps  600
> // delay  10
> // n!  = (60/120=50%)(80/120=66%)(90/120=75%)(96/120=80%)(54/120=45%)(72/120=60%)(72/120=60%)(28/120=23%)(36/120=30%)(12/120=10%)[600/1200=50%]
> // 2^n = (8/32=25%)(12/32=37%)(14/32=43%)(15/32=46%)(7/32=21%)(9/32=28%)(7/32=21%)(3/32=9%)(4/32=12%)(1/32=3%)[80/320=25%]
>
> It seems that such property as number of swaps does not really matter too much. I can conjecture that what really matters here is the number of loads.
> I guess that CPU cannot execute two compare-and-swap steps concurrently, so minimizing delay is senseless and even harmful.
> On the contrary, we can minimize number of loads (assuming that all 5 elements cannot be housed in CPU registers) if every compare-and-swap step uses one of variables that its predecessor does.
>
> Could you also try schema below? I expect that it should not be worse than "bubble network". It has the same delay, but the average number of swaps (3) is low.
> // narrowing
> // [1 5][1 4][1 3][1 2][2 5][2 4][2 3][3 5][3 4][4 5]
> // size   10
> // swaps  360
> // delay  10
> // n!  = (60/120=50%)(40/120=33%)(30/120=25%)(24/120=20%)(40/120=33%)(40/120=33%)(36/120=30%)(30/120=25%)(36/120=30%)(24/120=20%)[360/1200=30%]
> // 2^n = (8/32=25%)(4/32=12%)(2/32=6%)(1/32=3%)(4/32=12%)(4/32=12%)(3/32=9%)(2/32=6%)(3/32=9%)(1/32=3%)[32/320=10%]
> if (ae1 > ae5) { t = ae1; ae1 = ae5; ae5 = t; }
> if (ae1 > ae4) { t = ae1; ae1 = ae4; ae4 = t; }
> if (ae1 > ae3) { t = ae1; ae1 = ae3; ae3 = t; }
> if (ae1 > ae2) { t = ae1; ae1 = ae2; ae2 = t; }
> if (ae2 > ae5) { t = ae2; ae2 = ae5; ae5 = t; }
> if (ae2 > ae4) { t = ae2; ae2 = ae4; ae4 = t; }
> if (ae2 > ae3) { t = ae2; ae2 = ae3; ae3 = t; }
> if (ae3 > ae5) { t = ae3; ae3 = ae5; ae5 = t; }
> if (ae3 > ae4) { t = ae3; ae3 = ae4; ae4 = t; }
> if (ae4 > ae5) { t = ae4; ae4 = ae5; ae5 = t; }
>
> I also don't mind to change current sorting network to something more efficient (i.e. bubble network or so). However its impact on the whole sorting algorithm seems neglectable especially for large arrays.
>
> Thank you,
> Dmytro Sheyko
>
> > Date: Wed, 2 Jun 2010 17:54:57 +0400
> > From: iaroslavski at mail.ru
> > Subject: Re: New portion of improvements for Dual-Pivot Quicksort
> > To: dmytro_sheyko at hotmail.com
> > CC: joshua.bloch at google.com; core-libs-dev at openjdk.java.net
> >
> > I checked sorting for long type and see the same behaviour as for int,
> > new suggested sorting network doesn't win.
> >
> > 10-steps bubble sort shows the almost the same results as on int:
> > wins ~0.3% on client, and 0.8% on server.
> >
> > Should we use 10-steps bubble sort instead of sorting network
> > (which has bubble sorting structure as well)? For code it will be
> > extra one line.
> >
> > Dmytro,
> > Could you please, run your counting test on bubble sort?
> > What is the average count of swaps?
> >
> > The schema is [4 5] [3 4] [2 3] [1 2] [2 3] [3 4] [4 5] [3 4] [2 3] [3 4]
> >
> > Dmytro Sheyko wrote:
> > >
> > > > Which sorting algorithm we should use?
> > > >
> > > > 1. Network - compact, 9 lines of code
> > > > 2. Bubble - also compact, 10 lines of code
> > > > 3. Insertion - faster, but 39 lines of code
> > >
> > > I think that the gain is not worth the complexity. So, maybe, just leave
> > > it as it is.
> > >
> > > > Date: Tue, 1 Jun 2010 15:40:19 +0400
> > > > From: iaroslavski at mail.ru
> > > > Subject: Re: New portion of improvements for Dual-Pivot Quicksort
> > > > To: dmytro_sheyko at hotmail.com
> > > > CC: joshua.bloch at google.com; core-libs-dev at openjdk.java.net
> > > >
> > > > Hi Dmytro,
> > > >
> > > > Very interesting investigation, thank you very much!
> > > >
> > > > I tried you suggested sorting network, but all versions work the same.
> > > > Then I used bubble sort (10 comparisons) and it shows better
> > > > results: on client it is faster on 0.2%, server - 0.8%, not too
> > > > much, but faster.
> > > >
> > > > The schema is (bubble sort with changes of direction):
> > > > [4 5] [3 4] [2 3] [1 2] [2 3] [3 4] [4 5] [3 4] [2 3] [3 4]
> > > >
> > > > And moreover: if we use insertion sort for sorting of 5 candidates:
> > > >
> > > > int ae1 = a[e1], ae3 = a[e3], ae5 = a[e5], ae2 = a[e2], ae4 = a[e4];
> > > >
> > > > if (ae2 < a[e1]) {
> > > > a[e2] = a[e1]; a[e1] = ae2;
> > > > }
> > > > if (ae3 < a[e2]) {
> > > > if (ae3 < a[e1]) {
> > > > a[e3] = a[e2]; a[e2] = a[e1]; a[e1] = ae3;
> > > > } else {
> > > > a[e3] = a[e2]; a[e2] = ae3;
> > > > }
> > > > }
> > > > if (ae4 < a[e2]) {
> > > > if (ae4 < a[e1]) {
> > > > a[e4] = a[e3]; a[e3] = a[e2]; a[e2] = a[e1]; a[e1] = ae4;
> > > > } else {
> > > > a[e4] = a[e3]; a[e3] = a[e2]; a[e2] = ae4;
> > > > }
> > > > } else {
> > > > if (ae4 < a[e3]) {
> > > > a[e4] = a[e3]; a[e3] = ae4;
> > > > }
> > > > }
> > > > if (ae5 < a[e2]) {
> > > > if (ae5 < a[e1]) {
> > > > a[e5]=a[e4];a[e4]=a[e3];a[e3]=a[e2];a[e2]=a[e1];a[e1]=ae5;
> > > > } else {
> > > > a[e5] = a[e4]; a[e4] = a[e3]; a[e3] = a[e2]; a[e2] = ae5;
> > > > }
> > > > } else {
> > > > if (ae5 < a[e3]) {
> > > > a[e5] = a[e4]; a[e4] = a[e3]; a[e3] = ae5;
> > > > }
> > > > else {
> > > > if (ae5 < a[e4]) {
> > > > a[e5] = a[e4]; a[e4] = ae5;
> > > > }
> > > > }
> > > > }
> > > >
> > > > it shows better results than bubble sort:
> > > > client - 0.37%, server - 1.03%
> > > >
> > > > Which sorting algorithm we should use?
> > > >
> > > > 1. Network - compact, 9 lines of code
> > > > 2. Bubble - also compact, 10 lines of code
> > > > 3. Insertion - faster, but 39 lines of code
> > > >
> > > > Regards,
> > > >
> > > > Dmytro Sheyko wrote:
> > > > > Corrections.
> > > > > 1. The sixth comparator in current network (that swaps with 90%
> > > > > probability) is [2 3] not [0 3].
> > > > > 2. The average number of swaps are
> > > > > 576/120 = 4.8 for current network and
> > > > > 376/120 = 3.1(3...) for proposed network.
> > > > >
> > > > >
> > > ------------------------------------------------------------------------
> > > > > From: dmytro_sheyko at hotmail.com
> > > > > To: iaroslavski at mail.ru; joshua.bloch at google.com
> > > > > Subject: RE: New portion of improvements for Dual-Pivot Quicksort
> > > > > Date: Tue, 1 Jun 2010 14:59:07 +0700
> > > > > CC: core-libs-dev at openjdk.java.net
> > > > >
> > > > > Hi Vladimir,
> > > > >
> > > > > I cannot write 7-comparison sort briefly as well. However I think
> > > we can
> > > > > optimize sorting network a bit.
> > > > > We can consider such properties as
> > > > > 1. size, i.e. number of comparators
> > > > > 2. delay, i.e. number of parallel steps (some comparisons and swaps
> > > can
> > > > > be done concurrently)
> > > > > 3. average number of swaps.
> > > > >
> > > > > The current sorting network
> > > > > [0 1][3 4] | [0 2] | [1 2][0 3] | [2 3][1 4] | [1 2][3 4]
> > > > > has size 9 and delay 5.
> > > > > These values are good enough. Generally we cannot write it shorter (in
> > > > > order to improve size) and combine more than 2 comparison in a
> > > parallel
> > > > > step (in order to improve delay).
> > > > >
> > > > > However the average number of swaps seems too high. It performs 576
> > > > > swaps per 1080 comparisons (1080 = 9 {size} * 120 {5!})
> > > > > I tried every permutation of 5 distinct values.
> > > > > There is a profile per comparator:
> > > > >
> > > (60/120=50%)(60/120=50%)(40/120=33%)(80/120=66%)(48/120=40%)(108/120=90%)(36/120=30%)(72/120=60%)(72/120=60%)[576/1080=53%]
> > > > > The first two comparison offer probability of swap as 50%, which is
> > > > > natural. However the sixth comparator (i.e. [0 3]) swaps with
> > > > > probability 90%, i.e. almost always!
> > > > >
> > > > > The one of the best sorting network (I have found) with the same size
> > > > > and delay is following
> > > > > [0 4] | [0 2][1 4] | [1 3][2 4] | [0 1][2 3] | [1 2][3 4]
> > > > > Its average number of swaps is 376.
> > > > > And here is a profile:
> > > > >
> > > (60/120=50%)(40/120=33%)(40/120=33%)(50/120=41%)(30/120=25%)(48/120=40%)(48/120=40%)(36/120=30%)(24/120=20%)[376/1080=34%]
> > > > >
> > > > > Regards,
> > > > > Dmytro Sheyko
> > > > >
> > > > > > Date: Wed, 26 May 2010 16:12:17 +0400
> > > > > > From: iaroslavski at mail.ru
> > > > > > Subject: Re: New portion of improvements for Dual-Pivot Quicksort
> > > > > > To: dmytro_sheyko at hotmail.com; Joshua.Bloch at google.com
> > > > > > CC: core-libs-dev at openjdk.java.net
> > > > > >
> > > > > > Hello Dmytro,
> > > > > >
> > > > > > Theoretical investigations are based on simple model,
> > > > > > which doesn't take into account many optimizations
> > > > > > like "equal pivots", "scan equal to pivots", etc.
> > > > > > Model based on the implementation will be too complex
> > > > > > to be analyzed.
> > > > > >
> > > > > > But it is easy to count comparisons and assignments,
> > > > > > if I change them by functions with counter.
> > > > > >
> > > > > > Also I tried to write 7-comparison sort, but the code
> > > > > > was too long (a lot of inner if-then-else) instead of
> > > > > > 9 compact lines. Do you have suggestion how to implement
> > > > > > a nice 7-comparison sort? I tried also selection and bubble
> > > > > > sorts, but insertion sort shows better time.
> > > > > >
> > > > > > Josh,
> > > > > > Could you please review the last changes (especially javadoc
> > > > > > and comments)?
> > > > > >
> > > > > > Thank you,
> > > > > > Vladimir
> > > > > >
> > > > > > Dmytro Sheyko wrote:
> > > > > > > Hi Vladimir,
> > > > > > >
> > > > > > > As for me, everything seems good.
> > > > > > >
> > > > > > > Returning to the theoretical background, could you estimate
> > > number of
> > > > > > > comparison and assignments? These should be less than in your
> > > initial
> > > > > > > version.
> > > > > > >
> > > > > > > Also have you considered 7-comparison sort for sorting 5 pivot
> > > > > > > candidates instead of 9-comparison sorting network?
> > > > > > >
> > > > > > > Thank you,
> > > > > > > Dmytro Sheyko
> > > > > > >
> > > > > > > > Date: Tue, 25 May 2010 10:42:51 +0400
> > > > > > > > From: iaroslavski at mail.ru
> > > > > > > > Subject: Re: New portion of improvements for Dual-Pivot Quicksort
> > > > > > > > To: dmytro_sheyko at hotmail.com
> > > > > > > > CC: core-libs-dev at openjdk.java.net
> > > > > > > >
> > > > > > > >
> > > > > > > > >> So, we win 2-3% !
> > > > > > > >
> > > > > > > > On [partial] sorted inputs new version runs more faster than few
> > > > > > > percents:
> > > > > > > >
> > > > > > > > organ pipes
> > > > > > > > this: 6896
> > > > > > > > prev: 7424
> > > > > > > > jdk7: 8018
> > > > > > > > jdk6: 12502
> > > > > > > >
> > > > > > > > ascendant
> > > > > > > > this: 2877
> > > > > > > > prev: 3845
> > > > > > > > jdk7: 4583
> > > > > > > > jdk6: 9019
> > > > > > > >
> > > > > > > > descendant
> > > > > > > > this: 3287
> > > > > > > > prev: 4110
> > > > > > > > jdk7: 4897
> > > > > > > > jdk6: 9132
> > > > > > > >
> > > > > > > > Dmytro Sheyko wrote:
> > > > > > > > > That's great! Thank you.
> > > > > > > > >
> > > > > > > > > > Date: Fri, 21 May 2010 18:38:51 +0400
> > > > > > > > > > From: iaroslavski at mail.ru
> > > > > > > > > > Subject: Re: New portion of improvements for Dual-Pivot
> > > Quicksort
> > > > > > > > > > To: dmytro_sheyko at hotmail.com
> > > > > > > > > > CC: core-libs-dev at openjdk.java.net
> > > > > > > > > >
> > > > > > > > > > Hello,
> > > > > > > > > >
> > > > > > > > > > I prepared version with your changes "Skip the last negative
> > > > > > > > > > value (if any) or all leading negative" and with my
> > > optimization
> > > > > > > > > > for all types. I added two while loops before partitioning to
> > > > > > > > > > skip elements, less than pivot1 and greater than pivot2:
> > > > > > > > > >
> > > > > > > > > > if (pivot1 != pivot2) {
> > > > > > > > > > /* ... */
> > > > > > > > > > a[e2] = a[less];
> > > > > > > > > > a[e4] = a[great];
> > > > > > > > > >
> > > > > > > > > > ++ while (a[++less] < pivot1);
> > > > > > > > > > ++ while (a[--great] > pivot2);
> > > > > > > > > >
> > > > > > > > > > /* ... */
> > > > > > > > > > outer:
> > > > > > > > > > for (int k = less; k <= great; k++) {
> > > > > > > > > > ...
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > Here is benchmark result (in compare with quicksort from
> > > JDK 6):
> > > > > > > > > >
> > > > > > > > > > client server
> > > > > > > > > > ------ ------
> > > > > > > > > > previous version: 60.70% 48.20%
> > > > > > > > > > current version: 57.22% 46.18%
> > > > > > > > > >
> > > > > > > > > > So, we win 2-3% !
> > > > > > > > > >
> > > > > > > > > > Thank you,
> > > > > > > > > > Vladimir
> > > > > > > > > >
> > > > > > > > > > Dmytro Sheyko wrote:
> > > > > > > > > > > Hi Vladimir,
> > > > > > > > > > >
> > > > > > > > > > > I tried to figure out why the testcase failed on my
> > > > > > > modification. It
> > > > > > > > > > > appeared that number of negative zeros were changed during
> > > > > general
> > > > > > > > > sort.
> > > > > > > > > > > As I can see you already fixed this issue. Well, my
> > > > > > > modification was
> > > > > > > > > > > based on assumption that we can speed up eliminating
> > > > > explicit array
> > > > > > > > > > > range checks.
> > > > > > > > > > > However, such assumption is wrong because Hotspot
> > > anyway emits
> > > > > > > range
> > > > > > > > > > > checks at its discretion and therefore processZeros
> > > generally
> > > > > > > does not
> > > > > > > > > > > work as fast as I expected.
> > > > > > > > > > > So complications I made are not worth doing.
> > > > > > > > > > >
> > > > > > > > > > > As for the latest code you posted. Doesn't it make
> > > sense to
> > > > > skip
> > > > > > > > > leading
> > > > > > > > > > > negative zeros before farther processing? In this case
> > > we avoid
> > > > > > > > > > > unnecessary assigning +0.0 and then -0.0 to the same
> > > > > location a[k]
> > > > > > > > > (i.e.
> > > > > > > > > > > where k == p).
> > > > > > > > > > >
> > > > > > > > > > > /*
> > > > > > > > > > > * Skip the last negative value (if any) or all leading
> > > negative
> > > > > > > > > > > zeros
> > > > > > > > > > > */
> > > > > > > > > > > while (left <= right &&
> > > Double.doubleToRawLongBits(a[left])
> > > > > < 0) {
> > > > > > > > > > > left++;
> > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > > for (int k = left + 1, p = left; k <= right; k++) {
> > > > > > > > > > > double ak = a[k];
> > > > > > > > > > > if (ak != 0.0d) {
> > > > > > > > > > > return;
> > > > > > > > > > > }
> > > > > > > > > > > if (Double.doubleToRawLongBits(ak) < 0) { // ak is -0.0d
> > > > > > > > > > > a[k] = 0.0d;
> > > > > > > > > > > a[p++] = -0.0d;
> > > > > > > > > > > }
> > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > > Thank you,
> > > > > > > > > > > Dmytro Sheyko
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > Date: Wed, 19 May 2010 14:41:32 +0400
> > > > > > > > > > > > From: iaroslavski at mail.ru
> > > > > > > > > > > > Subject: Re: New portion of improvements for Dual-Pivot
> > > > > Quicksort
> > > > > > > > > > > > To: dmytro_sheyko at hotmail.com
> > > > > > > > > > > > CC: core-libs-dev at openjdk.java.net
> > > > > > > > > > > >
> > > > > > > > > > > > resend the class with correct constructor
> > > > > > > > > > > >
> > > > > > > > > > > > Vladimir Iaroslavski wrote:
> > > > > > > > > > > > > Dmytro,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thank you for comments, I updated double method, did
> > > > > little bit
> > > > > > > > > > > > > javadoc changes and replaced in char/short/byte methods
> > > > > > > > > > > > > "fromIndex -> left", "toIndex-1 -> right", the code
> > > became
> > > > > > > > > > > > > consistent with main sort method and more compact.
> > > Also
> > > > > I use
> > > > > > > > > > > > > more usual "i--" and "i++" in for loops (instead of
> > > "--i",
> > > > > > > "++i.
> > > > > > > > > > > > >
> > > > > > > > > > > > > To accent the difference between float/double and
> > > other
> > > > > types,
> > > > > > > > > > > > > I put comment where it is important:
> > > > > > > > > > > > >
> > > > > > > > > > > > > /*
> > > > > > > > > > > > > * In spite of a[great] == pivot1, the assignment
> > > > > > > > > > > > > * a[less++] = pivot1 may be incorrect, if a[great]
> > > > > > > > > > > > > * and pivot1 are floating-point zeros of different
> > > > > > > > > > > > > * signs, therefore in float/double methods we have
> > > > > > > > > > > > > * to use more accurate assignment a[k] = a[great].
> > > > > > > > > > > > > */
> > > > > > > > > > > > > a[less++] = pivot1;
> > > > > > > > > > > > >
> > > > > > > > > > > > > and for double/float:
> > > > > > > > > > > > >
> > > > > > > > > > > > > /*
> > > > > > > > > > > > > .....
> > > > > > > > > > > > > */
> > > > > > > > > > > > > a[k] = a[great];
> > > > > > > > > > > > >
> > > > > > > > > > > > > See updated version in attachment.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thank you,
> > > > > > > > > > > > > Vladimir
> > > > > > > > > > > > >
> > > > > > > > > > > > > Dmytro Sheyko wrote:
> > > > > > > > > > > > >> Vladimir,
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> I can see that you changed
> > > > > sortNegZeroAndNaN(float[]...) but
> > > > > > > > > probably
> > > > > > > > > > > > >> forgot to change sortNegZeroAndNaN(double[]...).
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> You really puzzled me with failed testcase and
> > > note that
> > > > > > > sorting
> > > > > > > > > > > > >> algorithm (without special attention to zeros)
> > > > > generally may
> > > > > > > > > change
> > > > > > > > > > > > >> number of negative zeros.
> > > > > > > > > > > > >> I will provide my comments later.
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> As for counting sort, I think we should use single
> > > format
> > > > > > > > > style over
> > > > > > > > > > > > >> the file (unless we have valuable reason not to do
> > > > > this). I
> > > > > > > > > mean to
> > > > > > > > > > > > >> choose
> > > > > > > > > > > > >> 1)
> > > > > > > > > > > > >> if (toIndex - fromIndex >
> > > > > > > > > > > > >> COUNTING_SORT_THRESHOLD_FOR_SHORT_OR_CHAR) {
> > > > > > > > > > > > >> countingSort(a, fromIndex, toIndex);
> > > > > > > > > > > > >> return;
> > > > > > > > > > > > >> }
> > > > > > > > > > > > >> sort(a, fromIndex, toIndex - 1, true);
> > > > > > > > > > > > >> 2)
> > > > > > > > > > > > >> if (toIndex - fromIndex >
> > > > > > > > > > > > >> COUNTING_SORT_THRESHOLD_FOR_SHORT_OR_CHAR) {
> > > > > > > > > > > > >> countingSort(a, fromIndex, toIndex);
> > > > > > > > > > > > >> } else {
> > > > > > > > > > > > >> sort(a, fromIndex, toIndex - 1, true);
> > > > > > > > > > > > >> }
> > > > > > > > > > > > >> I prefer the second one.
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> Thanks a lot,
> > > > > > > > > > > > >> Dmytro Sheyko
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> > Date: Tue, 18 May 2010 18:57:50 +0400
> > > > > > > > > > > > >> > From: iaroslavski at mail.ru
> > > > > > > > > > > > >> > Subject: Re: New portion of improvements for
> > > Dual-Pivot
> > > > > > > > > Quicksort
> > > > > > > > > > > > >> > To: dmytro_sheyko at hotmail.com
> > > > > > > > > > > > >> > CC: core-libs-dev at openjdk.java.net
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > Hello,
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > I've run your modification for counting sort, it
> > > real
> > > > > > > faster.
> > > > > > > > > > > > >> > I attached new version with your changes (I did
> > > > > little bit
> > > > > > > > > > > > >> > format it) and included my case with float/double.
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > Note that you modification doesn't pass test from
> > > > > > > Sorting class,
> > > > > > > > > > > > >> > which I sent earlier. It fails on float/double test:
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > Test #3: random = 666, len = 34, a = 0, g = 6, z =
> > > > > 9, n =
> > > > > > > > > 10, p = 9
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > I suggest shorter method (which is based on your
> > > > > idea to
> > > > > > > skip
> > > > > > > > > > > counting
> > > > > > > > > > > > >> > negative zeros on Phase 1.): I found find first zero
> > > > > > > index (or
> > > > > > > > > > > it will
> > > > > > > > > > > > >> > be index of first positive element if no zeros
> > > at all,
> > > > > > > or last
> > > > > > > > > > > > >> negative,
> > > > > > > > > > > > >> > if no positive and zero elements) and then swap
> > > negative
> > > > > > > > > zero to the
> > > > > > > > > > > > >> > beginning of the sub-range.
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > int hi = right;
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > while (left < hi) {
> > > > > > > > > > > > >> > int middle = (left + hi) >>> 1;
> > > > > > > > > > > > >> > float middleValue = a[middle];
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > if (middleValue < 0.0f) {
> > > > > > > > > > > > >> > left = middle + 1;
> > > > > > > > > > > > >> > } else {
> > > > > > > > > > > > >> > hi = middle;
> > > > > > > > > > > > >> > }
> > > > > > > > > > > > >> > }
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > for (int k = left, p = left; k <= right; k++) {
> > > > > > > > > > > > >> > float ak = a[k];
> > > > > > > > > > > > >> > if (ak != 0.0f) {
> > > > > > > > > > > > >> > return;
> > > > > > > > > > > > >> > }
> > > > > > > > > > > > >> > if (Float.floatToRawIntBits(ak) < 0) { // ak is
> > > -0.0f
> > > > > > > > > > > > >> > a[k] = +0.0f;
> > > > > > > > > > > > >> > a[p++] = -0.0f;
> > > > > > > > > > > > >> > }
> > > > > > > > > > > > >> > }
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > Important note: in partitioning loop there are
> > > several
> > > > > > > places
> > > > > > > > > > > > >> > (marked by // !) where potential bug with -0.0
> > > could be
> > > > > > > > > > > > >> > (when pivot and a[great] are zeros with different
> > > > > signs):
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > if (a[great] == pivot1) {
> > > > > > > > > > > > >> > a[k] = a[less];
> > > > > > > > > > > > >> > - a[less++] = pivot1; // !
> > > > > > > > > > > > >> > + a[less++] = a[great];
> > > > > > > > > > > > >> > } else { // pivot1 < a[great] < pivot2
> > > > > > > > > > > > >> > a[k] = a[great];
> > > > > > > > > > > > >> > }
> > > > > > > > > > > > >> > - a[great--] = pivot2; // !
> > > > > > > > > > > > >> > + a[great--] = ak;
> > > > > > > > > > > > >> > } else if (ak == pivot1) { // Move a[k] to left part
> > > > > > > > > > > > >> > a[k] = a[less];
> > > > > > > > > > > > >> > - a[less++] = pivot1; // !
> > > > > > > > > > > > >> > + a[less++] = ak;
> > > > > > > > > > > > >> > }
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > and the same in "Pivots are equal" branch.
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > I did changes "pivot1/2 -> ak" in methods for
> > > all types
> > > > > > > > > > > > >> > and "pivot1 -> a[great]" in float/double
> > > sections only.
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > Please, review format changes for counting sort
> > > and new
> > > > > > > version
> > > > > > > > > > > > >> > of Phase 3 for float/double.
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > Thank you,
> > > > > > > > > > > > >> > Vladimir
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > Dmytro Sheyko wrote:
> > > > > > > > > > > > >> > > Hi,
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > About counting sort again.
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > 1. This condition "i < count.length && k <=
> > > right" is
> > > > > > > > > excessive.
> > > > > > > > > > > > >> Any one
> > > > > > > > > > > > >> > > conjunct is enough. "k <= right" seems better.
> > > > > > > > > > > > >> > > 2. No need to calculate "short value = (short)
> > > (i +
> > > > > > > > > > > > >> Short.MIN_VALUE)"
> > > > > > > > > > > > >> > > when "count[i]" is zero.
> > > > > > > > > > > > >> > > 3. For signed primitives (byte and short) we would
> > > > > > > better loop
> > > > > > > > > > > > >> backward.
> > > > > > > > > > > > >> > > Thanks to "k >= fromIndex" condition we will quit
> > > > > looping
> > > > > > > > > earlier
> > > > > > > > > > > > >> > > assuming that typically we work with positive
> > > numbers.
> > > > > > > > > > > > >> > > For unsigned primitives (char) we would better
> > > loop
> > > > > > > forward
> > > > > > > > > > > because
> > > > > > > > > > > > >> > > typically we work with characters about zero
> > > (ASCII).
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > - for (int i = 0, k = left; i < count.length
> > > && k <=
> > > > > > > > > right; i++) {
> > > > > > > > > > > > >> > > - short value = (short) (i + Short.MIN_VALUE);
> > > > > > > > > > > > >> > > - for (int s = count[i]; s > 0; s--) {
> > > > > > > > > > > > >> > > - a[k++] = value;
> > > > > > > > > > > > >> > > - }
> > > > > > > > > > > > >> > > - }
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > + for (int i = NUM_SHORT_VALUES - 1, k =
> > > toIndex -
> > > > > 1; k >=
> > > > > > > > > > > > >> > > fromIndex; --i) {
> > > > > > > > > > > > >> > > + while (count[i] == 0) --i;
> > > > > > > > > > > > >> > > + short value = (short) (i + Short.MIN_VALUE);
> > > > > > > > > > > > >> > > + int s = count[i];
> > > > > > > > > > > > >> > > + do { a[k--] = value; } while (--s > 0);
> > > > > > > > > > > > >> > > + }
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > Thanks,
> > > > > > > > > > > > >> > > Dmytro Sheyko
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > > From: iaroslavski at mail.ru
> > > > > > > > > > > > >> > > > To: dmytro_sheyko at hotmail.com
> > > > > > > > > > > > >> > > > CC: core-libs-dev at openjdk.java.net;
> > > > > iaroslavski at mail.ru
> > > > > > > > > > > > >> > > > Subject: Re[2]: New portion of improvements for
> > > > > > > Dual-Pivot
> > > > > > > > > > > > >> Quicksort
> > > > > > > > > > > > >> > > > Date: Tue, 18 May 2010 01:11:19 +0400
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > Sounds good!
> > > > > > > > > > > > >> > > > Will consider too...
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > Mon, 17 May 2010 22:24:11 +0700 письмо от
> > > Dmytro
> > > > > Sheyko
> > > > > > > > > > > > >> > > <dmytro_sheyko at hotmail.com>:
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > > > > Hi,
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > Regarding counting sort. We can check
> > > whether we
> > > > > > > should
> > > > > > > > > > > > >> switch to
> > > > > > > > > > > > >> > > counting sort only once in the beginning.
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > > Date: Mon, 17 May 2010 17:30:37 +0400
> > > > > > > > > > > > >> > > > > > From: iaroslavski at mail.ru
> > > > > > > > > > > > >> > > > > > Subject: Re: New portion of improvements for
> > > > > > > Dual-Pivot
> > > > > > > > > > > > >> Quicksort
> > > > > > > > > > > > >> > > > > > To: dmytro_sheyko at hotmail.com
> > > > > > > > > > > > >> > > > > > CC: core-libs-dev at openjdk.java.net
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > Hello,
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > Thank you for review, I'll check and run
> > > > > tests again
> > > > > > > > > > > with you
> > > > > > > > > > > > >> > > changes.
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > Thank you,
> > > > > > > > > > > > >> > > > > > Vladimir
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > Dmytro Sheyko wrote:
> > > > > > > > > > > > >> > > > > > > Hello,
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > More ideas.
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > 1. We can use
> > > > > > > > > > > > >> > > > > > > Double.doubleToRawLongBits instead of
> > > > > > > > > > > > >> Double.doubleToLongBits and
> > > > > > > > > > > > >> > > > > > > Float.floatToRawIntBits instead of
> > > > > > > > > Float.floatToIntBits.
> > > > > > > > > > > > >> > > > > > > No need to handle NaN's because they
> > > all are
> > > > > > > placed to
> > > > > > > > > > > > >> the end
> > > > > > > > > > > > >> > > of array.
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > 2. Note that
> > > > > > > > > > > > >> > > > > > > Double.doubleToRawLongBits(+0.0) == 0L and
> > > > > > > > > > > > >> > > > > > > Double.doubleToRawLongBits(-0.0) ==
> > > > > > > Long.MIN_VALUE and
> > > > > > > > > > > > >> > > > > > > Float.floatToRawIntBits(+0.0) == 0 and
> > > > > > > > > > > > >> > > > > > > Float.floatToRawIntBits(-0.0) ==
> > > > > > > Integer.MIN_VALUE.
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > Comparing with is zero usually more
> > > efficient
> > > > > > > (or at
> > > > > > > > > > > > >> least not
> > > > > > > > > > > > >> > > worse)
> > > > > > > > > > > > >> > > > > > > than with other values. Thus such pattern
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > if (ak == 0.0f && NEGATIVE_ZERO ==
> > > > > > > > > > > Float.floatToIntBits(ak))
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > can be replaced with
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > if (ak == 0.0f &&
> > > Float.floatToIntBits(ak)
> > > > > < 0)
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > 3. It would be more efficient to count
> > > > > > > negative zeros
> > > > > > > > > > > after
> > > > > > > > > > > > >> > > sorting.
> > > > > > > > > > > > >> > > > > > > General sorting algorithm puts both
> > > > > negative and
> > > > > > > > > positive
> > > > > > > > > > > > >> zeros
> > > > > > > > > > > > >> > > together
> > > > > > > > > > > > >> > > > > > > (but maybe not in right order).
> > > > > > > > > > > > >> > > > > > > Therefore we have to process less
> > > elements
> > > > > because
> > > > > > > > > > > > >> usually we
> > > > > > > > > > > > >> > > have less
> > > > > > > > > > > > >> > > > > > > zeros than other numbers.
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > Thanks,
> > > > > > > > > > > > >> > > > > > > Dmytro Sheyko
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > > From: iaroslavski at mail.ru
> > > > > > > > > > > > >> > > > > > > > To: dmytro_sheyko at hotmail.com;
> > > > > jjb at google.com
> > > > > > > > > > > > >> > > > > > > > CC: core-libs-dev at openjdk.java.net;
> > > > > > > > > iaroslavski at mail.ru
> > > > > > > > > > > > >> > > > > > > > Subject: Re[6]: New portion of
> > > > > improvements for
> > > > > > > > > > > Dual-Pivot
> > > > > > > > > > > > >> > > Quicksort
> > > > > > > > > > > > >> > > > > > > > Date: Fri, 14 May 2010 23:54:06 +0400
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > >> > > > > > > > Hello,
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > >> > > > > > > > I've updated the class, please,
> > > review the
> > > > > > > changes.
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > >> > > > > > > > Vladimir
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > >> > > > > > > > Fri, 14 May 2010 01:48:11 +0700
> > > письмо от
> > > > > > > Dmytro
> > > > > > > > > Sheyko
> > > > > > > > > > > > >> > > > > > > <dmytro_sheyko at hotmail..com>:
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > Yes. I prefer F (Find First zero
> > > using
> > > > > binary
> > > > > > > > > search)
> > > > > > > > > > > > >> over
> > > > > > > > > > > > >> > > C (Count
> > > > > > > > > > > > >> > > > > > > negatives) and S (Smart Scan for zero).
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > From: iaroslavski at mail.ru
> > > > > > > > > > > > >> > > > > > > > > > To: dmytro_sheyko at hotmail.com
> > > > > > > > > > > > >> > > > > > > > > > CC: jjb at google.com;
> > > > > > > > > core-libs-dev at openjdk.java.net;
> > > > > > > > > > > > >> > > > > > > iaroslavski at mail.ru
> > > > > > > > > > > > >> > > > > > > > > > Subject: Re[4]: New portion of
> > > > > > > improvements for
> > > > > > > > > > > > >> > > Dual-Pivot Quicksort
> > > > > > > > > > > > >> > > > > > > > > > Date: Thu, 13 May 2010 21:34:54
> > > +0400
> > > > > > > > > > > > >> > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > Dmytro,
> > > > > > > > > > > > >> > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > I've tested your suggested
> > > variants, and
> > > > > > > > > found that
> > > > > > > > > > > > >> case "C"
> > > > > > > > > > > > >> > > > > > > > > > (very interesting approach to
> > > find first
> > > > > > > > > position
> > > > > > > > > > > > >> of zero
> > > > > > > > > > > > >> > > > > > > > > > by counting negative elements) works
> > > > > > > slower than
> > > > > > > > > > > > >> original
> > > > > > > > > > > > >> > > > > > > > > > or two other cases.
> > > > > > > > > > > > >> > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > Implementations "F" and "S" are
> > > very
> > > > > close
> > > > > > > > > to each
> > > > > > > > > > > > >> other
> > > > > > > > > > > > >> > > > > > > > > > and little bit faster than
> > > original. I
> > > > > > > > > prefer case
> > > > > > > > > > > > >> "F":
> > > > > > > > > > > > >> > > > > > > > > > it is shorter and more clear. Do
> > > you
> > > > > agree?
> > > > > > > > > > > > >> > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > I'll prepare updated
> > > DualPivotQuicksort
> > > > > > > file and
> > > > > > > > > > > > >> send it
> > > > > > > > > > > > >> > > > > > > > > > tomorrow.
> > > > > > > > > > > > >> > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > Thank you,
> > > > > > > > > > > > >> > > > > > > > > > Vladimir
> > > > > > > > > > > > >> > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > Wed, 12 May 2010 17:04:52 +0700
> > > письмо
> > > > > > > от Dmytro
> > > > > > > > > > > > >> Sheyko
> > > > > > > > > > > > >> > > > > > > <dmytro_sheyko at hotmail.com>:
> > > > > > > > > > > > >> > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > Vladimir,
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > Your changes are good for me.
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > Additionally I have some
> > > > > > > comments/proposals
> > > > > > > > > > > > >> regarding
> > > > > > > > > > > > >> > > dealing
> > > > > > > > > > > > >> > > > > > > with negative zeros.
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > 1. Scanning for the first zero
> > > we can
> > > > > > > > > avoid range
> > > > > > > > > > > > >> check
> > > > > > > > > > > > >> > > (i >=
> > > > > > > > > > > > >> > > > > > > left) if we have at least one negative
> > > value.
> > > > > > > > > > > > >> > > > > > > > > > > --- DualPivotQuicksort.java
> > > Tue May 11
> > > > > > > > > > > 09:04:19 2010
> > > > > > > > > > > > >> > > > > > > > > > > +++ DualPivotQuicksortS.java Wed
> > > > > May 12
> > > > > > > > > 12:10:46
> > > > > > > > > > > > >> 2010
> > > > > > > > > > > > >> > > > > > > > > > > @@ -1705,10 +1705,15 @@
> > > > > > > > > > > > >> > > > > > > > > > > }
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > // Find first zero element
> > > > > > > > > > > > >> > > > > > > > > > > - int zeroIndex = findAnyZero(a,
> > > > > left, n);
> > > > > > > > > > > > >> > > > > > > > > > > + int zeroIndex = 0;
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > - for (int i = zeroIndex - 1; i >=
> > > > > > > left &&
> > > > > > > > > a[i] ==
> > > > > > > > > > > > >> > > 0.0f; i--) {
> > > > > > > > > > > > >> > > > > > > > > > > - zeroIndex = i;
> > > > > > > > > > > > >> > > > > > > > > > > + if (a[left] < 0.0f) {
> > > > > > > > > > > > >> > > > > > > > > > > + zeroIndex = findAnyZero(a,
> > > left, n);
> > > > > > > > > > > > >> > > > > > > > > > > +
> > > > > > > > > > > > >> > > > > > > > > > > + // there is at least one
> > > negative
> > > > > > > value, so
> > > > > > > > > > > range
> > > > > > > > > > > > >> > > check is
> > > > > > > > > > > > >> > > > > > > not needed
> > > > > > > > > > > > >> > > > > > > > > > > + for (int i = zeroIndex - 1;
> > > /*i >=
> > > > > > > left &&*/
> > > > > > > > > > > > >> a[i] ==
> > > > > > > > > > > > >> > > 0.0f; i--) {
> > > > > > > > > > > > >> > > > > > > > > > > + zeroIndex = i;
> > > > > > > > > > > > >> > > > > > > > > > > + }
> > > > > > > > > > > > >> > > > > > > > > > > }
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > // Turn the right number of
> > > > > positive zeros
> > > > > > > > > > > back into
> > > > > > > > > > > > >> > > negative zeros
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > 2. We can find the position of
> > > the
> > > > > first
> > > > > > > > > zero by
> > > > > > > > > > > > >> counting
> > > > > > > > > > > > >> > > > > > > negative values during preprocessing
> > > phase.
> > > > > > > > > > > > >> > > > > > > > > > > --- DualPivotQuicksort.java
> > > Tue May 11
> > > > > > > > > > > 09:04:19 2010
> > > > > > > > > > > > >> > > > > > > > > > > +++ DualPivotQuicksortC.java Wed
> > > > > May 12
> > > > > > > > > 12:01:24
> > > > > > > > > > > > >> 2010
> > > > > > > > > > > > >> > > > > > > > > > > @@ -1678,7 +1678,7 @@
> > > > > > > > > > > > >> > > > > > > > > > > * Phase 1: Count negative zeros
> > > > > and move
> > > > > > > > > NaNs to
> > > > > > > > > > > > >> end of
> > > > > > > > > > > > >> > > array.
> > > > > > > > > > > > >> > > > > > > > > > > */
> > > > > > > > > > > > >> > > > > > > > > > > final int NEGATIVE_ZERO =
> > > > > > > > > > > > >> Float.floatToIntBits(-0.0f);
> > > > > > > > > > > > >> > > > > > > > > > > - int numNegativeZeros = 0;
> > > > > > > > > > > > >> > > > > > > > > > > + int numNegativeZeros = 0,
> > > > > > > > > numNegativeValues = 0;
> > > > > > > > > > > > >> > > > > > > > > > > int n = right;
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > for (int k = left; k <= n; k++) {
> > > > > > > > > > > > >> > > > > > > > > > > @@ -1689,6 +1689,8 @@
> > > > > > > > > > > > >> > > > > > > > > > > } else if (ak != ak) { //
> > > i.e., ak
> > > > > is NaN
> > > > > > > > > > > > >> > > > > > > > > > > a[k--] = a[n];
> > > > > > > > > > > > >> > > > > > > > > > > a[n--] = Float.NaN;
> > > > > > > > > > > > >> > > > > > > > > > > + } else if (ak < 0.0f) {
> > > > > > > > > > > > >> > > > > > > > > > > + numNegativeValues++;
> > > > > > > > > > > > >> > > > > > > > > > > }
> > > > > > > > > > > > >> > > > > > > > > > > }
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > @@ -1705,7 +1707,7 @@
> > > > > > > > > > > > >> > > > > > > > > > > }
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > // Find first zero element
> > > > > > > > > > > > >> > > > > > > > > > > - int zeroIndex = findAnyZero(a,
> > > > > left, n);
> > > > > > > > > > > > >> > > > > > > > > > > + int zeroIndex =
> > > numNegativeValues;
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > for (int i = zeroIndex - 1; i >=
> > > > > left &&
> > > > > > > > > a[i] ==
> > > > > > > > > > > > >> 0.0f;
> > > > > > > > > > > > >> > > i--) {
> > > > > > > > > > > > >> > > > > > > > > > > zeroIndex = i;
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > 3. We can use binary search to
> > > find
> > > > > > > the first
> > > > > > > > > > > > >> zero and
> > > > > > > > > > > > >> > > thus
> > > > > > > > > > > > >> > > > > > > avoid linear scan.
> > > > > > > > > > > > >> > > > > > > > > > > --- DualPivotQuicksort.java
> > > Tue May 11
> > > > > > > > > > > 09:04:19 2010
> > > > > > > > > > > > >> > > > > > > > > > > +++ DualPivotQuicksortF.java Wed
> > > > > May 12
> > > > > > > > > 12:03:58
> > > > > > > > > > > > >> 2010
> > > > > > > > > > > > >> > > > > > > > > > > @@ -1705,11 +1705,7 @@
> > > > > > > > > > > > >> > > > > > > > > > > }
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > // Find first zero element
> > > > > > > > > > > > >> > > > > > > > > > > - int zeroIndex = findAnyZero(a,
> > > > > left, n);
> > > > > > > > > > > > >> > > > > > > > > > > -
> > > > > > > > > > > > >> > > > > > > > > > > - for (int i = zeroIndex - 1; i >=
> > > > > > > left &&
> > > > > > > > > a[i] ==
> > > > > > > > > > > > >> > > 0.0f; i--) {
> > > > > > > > > > > > >> > > > > > > > > > > - zeroIndex = i;
> > > > > > > > > > > > >> > > > > > > > > > > - }
> > > > > > > > > > > > >> > > > > > > > > > > + int zeroIndex = findFirstZero(a,
> > > > > > > left, n);
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > // Turn the right number of
> > > > > positive zeros
> > > > > > > > > > > back into
> > > > > > > > > > > > >> > > negative zeros
> > > > > > > > > > > > >> > > > > > > > > > > for (int i = zeroIndex, m =
> > > > > zeroIndex +
> > > > > > > > > > > > >> > > numNegativeZeros; i <
> > > > > > > > > > > > >> > > > > > > m; i++) {
> > > > > > > > > > > > >> > > > > > > > > > > @@ -1718,7 +1714,7 @@
> > > > > > > > > > > > >> > > > > > > > > > > }
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > /**
> > > > > > > > > > > > >> > > > > > > > > > > - * Returns the index of some zero
> > > > > > > element
> > > > > > > > > in the
> > > > > > > > > > > > >> > > specified
> > > > > > > > > > > > >> > > > > > > range via
> > > > > > > > > > > > >> > > > > > > > > > > + * Returns the index of the
> > > first
> > > > > zero
> > > > > > > > > element
> > > > > > > > > > > > >> in the
> > > > > > > > > > > > >> > > > > > > specified range via
> > > > > > > > > > > > >> > > > > > > > > > > * binary search. The range is
> > > assumed
> > > > > > > to be
> > > > > > > > > > > > >> sorted, and
> > > > > > > > > > > > >> > > must
> > > > > > > > > > > > >> > > > > > > contain
> > > > > > > > > > > > >> > > > > > > > > > > * at least one zero.
> > > > > > > > > > > > >> > > > > > > > > > > *
> > > > > > > > > > > > >> > > > > > > > > > > @@ -1726,18 +1722,17 @@
> > > > > > > > > > > > >> > > > > > > > > > > * @param low the index of the
> > > first
> > > > > > > element,
> > > > > > > > > > > > >> inclusive,
> > > > > > > > > > > > >> > > to be
> > > > > > > > > > > > >> > > > > > > searched
> > > > > > > > > > > > >> > > > > > > > > > > * @param high the index of the
> > > last
> > > > > > > element,
> > > > > > > > > > > > >> inclusive,
> > > > > > > > > > > > >> > > to be
> > > > > > > > > > > > >> > > > > > > searched
> > > > > > > > > > > > >> > > > > > > > > > > */
> > > > > > > > > > > > >> > > > > > > > > > > - private static int
> > > > > > > findAnyZero(float[] a,
> > > > > > > > > > > int low,
> > > > > > > > > > > > >> > > int high) {
> > > > > > > > > > > > >> > > > > > > > > > > - while (true) {
> > > > > > > > > > > > >> > > > > > > > > > > + private static int
> > > > > > > findFirstZero(float[]
> > > > > > > > > a, int
> > > > > > > > > > > > >> low,
> > > > > > > > > > > > >> > > int high) {
> > > > > > > > > > > > >> > > > > > > > > > > + while (low < high) {
> > > > > > > > > > > > >> > > > > > > > > > > int middle = (low + high) >>> 1;
> > > > > > > > > > > > >> > > > > > > > > > > float middleValue = a[middle];
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > if (middleValue < 0.0f) {
> > > > > > > > > > > > >> > > > > > > > > > > low = middle + 1;
> > > > > > > > > > > > >> > > > > > > > > > > - } else if (middleValue > 0.0f) {
> > > > > > > > > > > > >> > > > > > > > > > > - high = middle - 1;
> > > > > > > > > > > > >> > > > > > > > > > > - } else { // middleValue == 0.0f
> > > > > > > > > > > > >> > > > > > > > > > > - return middle;
> > > > > > > > > > > > >> > > > > > > > > > > + } else { // middleValue >= 0.0f
> > > > > > > > > > > > >> > > > > > > > > > > + high = middle;
> > > > > > > > > > > > >> > > > > > > > > > > }
> > > > > > > > > > > > >> > > > > > > > > > > + return low;
> > > > > > > > > > > > >> > > > > > > > > > > }
> > > > > > > > > > > > >> > > > > > > > > > > }
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > Counting negative values
> > > appeared more
> > > > > > > > > expensive
> > > > > > > > > > > > >> than
> > > > > > > > > > > > >> > > any other
> > > > > > > > > > > > >> > > > > > > variants.
> > > > > > > > > > > > >> > > > > > > > > > > The last proposal seems to me as
> > > > > > > efficient
> > > > > > > > > as the
> > > > > > > > > > > > >> current
> > > > > > > > > > > > >> > > > > > > solution is in its worst case - when
> > > we have
> > > > > > > only one
> > > > > > > > > > > > >> negative
> > > > > > > > > > > > >> > > zero (in
> > > > > > > > > > > > >> > > > > > > the half of array).
> > > > > > > > > > > > >> > > > > > > > > > > And it shows the best result if we
> > > > > > > have many
> > > > > > > > > > > zeros.
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > Regards,
> > > > > > > > > > > > >> > > > > > > > > > > Dmytro Sheyko
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > From: iaroslavski at mail.ru
> > > > > > > > > > > > >> > > > > > > > > > > > To: jjb at google.com;
> > > > > > > > > dmytro_sheyko at hotmail.com
> > > > > > > > > > > > >> > > > > > > > > > > > CC:
> > > core-libs-dev at openjdk.java.net;
> > > > > > > > > > > > >> iaroslavski at mail.ru
> > > > > > > > > > > > >> > > > > > > > > > > > Subject: Re[2]: New portion of
> > > > > > > > > improvements for
> > > > > > > > > > > > >> > > Dual-Pivot
> > > > > > > > > > > > >> > > > > > > Quicksort
> > > > > > > > > > > > >> > > > > > > > > > > > Date: Sun, 9 May 2010
> > > 23:51:27 +0400
> > > > > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > Josh,
> > > > > > > > > > > > >> > > > > > > > > > > > Dmytro,
> > > > > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > I have done more thoroughly
> > > testing
> > > > > > > "great -
> > > > > > > > > > > > >> less > 5 *
> > > > > > > > > > > > >> > > > > > > seventh" vs. "less < e1 && great > e5",
> > > > > > > > > > > > >> > > > > > > > > > > > and found that more
> > > symmetric code
> > > > > > > "less
> > > > > > > > > < e1 &&
> > > > > > > > > > > > >> > > great > e5"
> > > > > > > > > > > > >> > > > > > > is little bit faster, ~0.5..0.7%
> > > > > > > > > > > > >> > > > > > > > > > > > on both VMs. Other code has
> > > not been
> > > > > > > > > changed.
> > > > > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > Please, take the latest
> > > version in
> > > > > > > > > attachment.
> > > > > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > Vladimir
> > > > > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > Tue, 4 May 2010 21:57:42 -0700
> > > > > > > письмо от
> > > > > > > > > Joshua
> > > > > > > > > > > > >> Bloch
> > > > > > > > > > > > >> > > > > > > <jjb at google.com>:
> > > > > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > Vladimir,
> > > > > > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > Old:
> > > > > > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > >298 if (less < e1 && great
> > > > e5) {
> > > > > > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > New:
> > > > > > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > >256 if (great - less > 5 *
> > > > > seventh) {
> > > > > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > >Regards,
> > > > > > > > > > > > >> > > > > > > > > > > > >Josh
```