Extending Java Arrays/Collection Sort API

Laurent Bourgès bourges.laurent at gmail.com
Wed Nov 28 20:15:06 UTC 2018

Hi John & Brian,

Thank you for your feedback finally, we almost agree Java Sort API could be
improved, in jdk13 possibly.

> Doug is right that there is an enormous potential list of sort methods,
> and we can't include them all.  But I do miss the ability to extract
> indexes instead of sorting the array itself.
> If you read my first email 11.14, I asked if you ever agree exposing few
other sort algorithms to have a public sort API useful on special cases
(isort, radix, merge sorts) when needed to manage specific types of
Never mind: third-party libraries can provide advanced / other algorithms.

Vladimir Yaroslavskiy made huge progresses on his DPQS 18.11 and I hope it
will be the overall winner on almost all data distributions at all scale
soon to provide a better Arrays.sort() implementation soon.

> Or, slightly more generally, sorting an int[] or long[] array with a
> comparator.
> Sometimes you don't want an object per sorted entity; you want some kind
> of oracle function that can compare two indexes into a non-concrete
> (virtualized) set of values; the sort operation would output the reordered
> ints.
> Example:  Sorting indexes into a large text file according to some
> algorithm
> such as Burrows-Wheeler.  You don't want to make all the substrings and
> put them into a String[] array, and you don't even want to make
> CharSequence
> objects that view into the text file.  You just want indexes into the text
> file to
> be sorted.
> IMO, the very simplest answer with significantly better semantics is:
>   void sort(int[] a, (int[] a, int fromIndex, int toIndex,
> IntBinaryOperator cmp);
> And maybe add parallelSort, and use a new IntComparator instead of
> repurposing IntBinaryOperator.
> Extracting the permutation vector of indexes from an array sorting
> operation
> would then look like this:
>     public static <T> int[] sortToIndexes(T[] a, Comparator<? super T> c) {
>         int[] perm = new int[a.length];  // starts as an iota vector
>         for (int i = 0; i < a.length; i++)  perm[i] = i;
>         sort(perm, 0, a.length, (i, j) -> c.compare(a[i], a[j]));  // NEW
>         return perm;
>     }
> But there are other use cases for comparator-based sorting of indexes,
> which start not with an iota vector, but with an array of index-like values
> which may index into something other than an array (like a text file,
> which is why 'long[] a' might be useful too).
> In Valhalla many such use cases can be covered by using a flat array of
> values which encapsulate the integer (or long) indexes, and sorting that.
> The array of wrapped primitives will look like Object[] and so the existing
> API point with a comparator would allow the flat array of ints (dressed as
> value Objects) to be sorted according to any ad hoc ordering rule,
> including
> treating them as indexes.
> So maybe the answer is "wait for Valhalla".  Still, the lack of a
> comparator
> on primitive arrays is an unnecessary limitation, which excludes
> index-based
> computations from the portfolio of our sort methods.

1. I am waiting for Value types for several years as Marlin renderer is
already using array of structs in java, ie edge data are packed in int[]
and unsafe buffers, to avoid bound checks (like C). It is a good candidate
to evaluate this new feature.
I can help here, but not alone.

2. John, your proposal matches mine:
"alternatively, it could be expressed as Comparator<Primitive>
Arrays.sort(int[] a, low, high, Comparator<int, int> cmp) that sorts array
a using the given comparator. For example, a[] could be edge indices sorted
according to their edge x position...
This comparator is specific as it compares anything (external storage) at
given indices i,j."

Let's go for 13...

If I may insist a bit on the 2 arrays variant, Marlin needs the x array
sorted for future traversal...
I do not know what is faster: sorting 2 arrays (indices swapped according
to x array order: 2x times swaps) or using a comparator (lookups to compare
(x[i], x[j] ) + x array reconstruction).

Finally how can we manage such improvement in 13? CSR / RFE ?

John or Brian, could you lead that specification request (CSR) and formal
process ?
 I can propose implementations on my side, I already have a working DPQS
18.11 with 2 arrays on Marlin's github.


More information about the core-libs-dev mailing list