Project Proposal: Trinity

Paul Sandoz paul.sandoz at
Tue Nov 22 00:11:22 UTC 2016

Hi Karthik,

Thanks for sending this. Some thoughts.

I can see a number of DAX API focused explorations here:

1) A DAX-specific API bound to libdax using JNI
2) A DAX-specific API bound to libdax using Panama
3) A DAX-like API leveraging technologies in either 1) or 2)

Each may allow one to get the most out of a DAX accelerator.

I think 2) and 3) are complementary to efforts in Panama.

3) is where alternative implementations leveraging SIMD instructions and GPUs might also be a good fit.

As one goes further down the abstraction road it gets a little fuzzier and there may be duplication; IMO we should be vigilant and consider consolidating particular aspects in such cases.

And as one goes further down the abstraction road, to 3) say, I believe the problem gets much harder. The set of valid pipelines that might map to DAX operations, and further might map efficiently, is likely to be quite small, and the performance model will be unclear to the developer. Cracking lambdas is certainly not easy, and is likely to be costly as well. To some extent Project Sumatra ran into such difficulties, although I think the problem in your case is a little easier than the one Sumatra was trying to solve. Still, it's not easy to detect and translate appropriate pipelines into another form.
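To illustrate why the set of offloadable pipelines is small, here is a minimal sketch (all of it illustrative, not tied to any real libdax binding): the first pipeline is stateless and element-independent, so its shape could in principle be translated to a bulk scan on an accelerator; the second has a side-effecting lambda that mutates captured state, so its elements cannot be processed independently and it resists any such translation.

```java
import java.util.stream.IntStream;

public class PipelineShapes {
    // Stateless filter + count: each element is examined independently,
    // so this shape could in principle map to a bulk scan/select
    // primitive on an accelerator.
    static long countAboveThree(int[] data) {
        return IntStream.of(data).filter(x -> x > 3).count();
    }

    // Side-effecting lambda: the behavioral parameter mutates captured
    // state, so the elements cannot be processed independently and this
    // shape resists offload (and is discouraged by the Streams spec).
    static int runningMax(int[] data) {
        int[] max = {Integer.MIN_VALUE};
        IntStream.of(data).forEach(x -> { if (x > max[0]) max[0] = x; });
        return max[0];
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 4, 1, 5, 9, 2, 6};
        System.out.println(countAboveThree(data)); // 4
        System.out.println(runningMax(data));      // 9
    }
}
```

Detecting that a given lambda falls into the first category rather than the second is exactly the hard "cracking" problem mentioned above.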

As I understand it, DAX provides a number of fairly simple bulk transformation operations over arrays of data, with some flexibility in the element layout of that data. Focusing an API on those operations and layouts is likely to be a more tractable problem. That might include off-heap memory with compatible Panama layouts, or on-heap memory somehow compatible with layouts for simple value types. Cue hand-waving :-) but in the spirit of 3) this might be the sweet spot.
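To make "an API focused on those operations" concrete, here is a hypothetical sketch. The names (BulkOps, scanGreaterThan, select) are invented for illustration and do not correspond to actual libdax entry points; the point is that the operations mirror the accelerator's primitives (a scan against a constant producing a mask, then a select/gather), which keeps the performance model predictable, with a plain-Java fallback shown here.

```java
import java.util.Arrays;

public final class BulkOps {
    // "Scan": produce a mask of elements exceeding a constant threshold.
    // DAX-style scans compare against a constant rather than running an
    // arbitrary lambda, so no lambda cracking is needed.
    static boolean[] scanGreaterThan(int[] src, int threshold) {
        boolean[] mask = new boolean[src.length];
        for (int i = 0; i < src.length; i++) mask[i] = src[i] > threshold;
        return mask;
    }

    // "Select": gather the elements whose mask bit is set.
    static int[] select(int[] src, boolean[] mask) {
        int count = 0;
        for (boolean b : mask) if (b) count++;
        int[] out = new int[count];
        int j = 0;
        for (int i = 0; i < src.length; i++) if (mask[i]) out[j++] = src[i];
        return out;
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 4, 1, 5, 9, 2, 6};
        int[] hits = select(data, scanGreaterThan(data, 3));
        System.out.println(Arrays.toString(hits)); // [4, 5, 9, 6]
    }
}
```

Because each method corresponds one-to-one with a bulk primitive, an implementation is free to dispatch to an accelerator, a SIMD loop, or the scalar fallback above without changing the programming model.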


> On 14 Nov 2016, at 08:23, Karthik Ganesan <karthik.ganesan at> wrote:
> Hi,
> I would like to propose the creation of a new Project: Project Trinity.
> This Project would explore enhanced execution of bulk aggregate calculations over Streams through offloading calculations to hardware accelerators.
> Streams allow developers to express calculations such that data parallelism can be efficiently exploited. Such calculations are prime candidates for leveraging enhanced data-oriented instructions on CPUs (such as SIMD instructions) or offloading to hardware accelerators (such as the SPARC Data Accelerator co-processor, further referred to as DAX [1]).
> To identify a path to improving performance and power efficiency, Project Trinity will explore how libraries like Streams can be enhanced to leverage data processing hardware features to execute Streams more efficiently.
> Directions for exploration include:
> - Building a streams-like library optimized for offload to
> -- hardware accelerators (such as DAX), or
> -- a GPU, or
> -- SIMD instructions;
> - Optimizations in the Graal compiler to automatically transform suitable Streams pipelines, taking advantage of data processing hardware features;
> - Explorations with Project Valhalla to expand the range of effective acceleration to Streams of value types.
> Success will be evaluated based upon:
> (1) speedups and resource efficiency gains achieved for a broad range of representative streams calculations under offload,
> (2) ease of use of the hardware acceleration capability, and
> (3) ensuring that there is no time or space overhead for non-accelerated calculations.
> Can I please request the support of the Core Libraries Group as the Sponsoring Group with myself as the Project Lead.
> Warm Regards,
> Karthik Ganesan
> [1]
