RFE: 64 bit pointers needed
Senseney, Justin (NIH/CIT) [E]
senseneyj at mail.nih.gov
Thu May 10 21:48:34 UTC 2012
Title: RFE: 64 bit pointers needed
Author: Justin Senseney
Organization: National Institutes of Health
Owner: Justin Senseney
RFE: 4963452 (4850923, 4880587, 4088441, 6292967)
Discussion: compiler-dev at openjdk.java.net
As per the Java Language Specification, section 10.4, all array access in Java is done using an int as the index. Since an int is a signed 32-bit value, this limits the total number of addressable elements of an array to 2**31 - 1 (about 2 billion). It should be possible to address an array using 64-bit values.
Improved handling of large datasets that need to be stored in contiguous arrays.
Not changing existing range of Integer
Able to compile boolean[] a = new boolean[Long.MAX_VALUE];
While having access to 2 billion entries may seem sufficient, there are compelling performance reasons to be able to use more in a single array. As an example, consider a square n*n matrix stored as a single array (row or column major; it doesn't matter which). Since an array stores fewer than 2**31 entries, n can be at most floor(sqrt(2**31)) = 46340, so the matrix cannot be very large. For multidimensional arrays the limitation is even more severe (3D tensors can be at most of size 1290 along each dimension).
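The arithmetic above can be checked with a short program. This is only a sketch: the class name ArrayLimits and its helper methods are made up for illustration, and it uses Integer.MAX_VALUE as the language-level upper bound on array length (a particular VM may allow slightly less).

```java
// Sketch: largest side length of a square matrix or cubic tensor
// that fits in a single Java array. Names here are illustrative only.
final class ArrayLimits {
    // Largest s such that s^dims <= maxElements.
    static long maxSide(long maxElements, int dims) {
        long s = (long) Math.floor(Math.pow((double) maxElements, 1.0 / dims));
        // Correct for floating-point rounding in either direction.
        while (pow(s + 1, dims) <= maxElements) s++;
        while (pow(s, dims) > maxElements) s--;
        return s;
    }

    // Exact integer power; throws ArithmeticException on long overflow.
    static long pow(long base, int exp) {
        long r = 1;
        for (int i = 0; i < exp; i++) r = Math.multiplyExact(r, base);
        return r;
    }

    public static void main(String[] args) {
        long limit = Integer.MAX_VALUE; // 2**31 - 1
        System.out.println("max n for n*n matrix:   " + maxSide(limit, 2));
        System.out.println("max n for n*n*n tensor: " + maxSide(limit, 3));
    }
}
```

With 8-byte doubles, a matrix at this limit occupies roughly 16 GiB, which is well within reach of a 64-bit machine.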
The scope of this work is extensive; however, the solution may well be technically feasible.
A workaround is to use an array of arrays (i.e. double[][]). However, there is no guarantee that successive rows will be laid out linearly in memory, and therefore performance may be severely penalized. Experimentally, performance may suffer by a factor of over 2, often far greater.
Also, most existing matrix packages (e.g. LAPACK) assume linear storage, and are thus incompatible with double[][] storage (they require double[]). Calling a LAPACK routine with jagged storage thus requires extra array copying and memory allocation, which can further decrease performance and increase memory requirements.
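To make the copying cost concrete, here is a minimal sketch of the conversion a caller has to perform before handing a double[][] matrix to a routine that expects linear storage. The class and method names (RowMajor, flatten, get) are hypothetical, and row-major order is chosen arbitrarily; LAPACK itself expects column-major data.

```java
// Sketch: copy a rectangular double[][] into one contiguous double[].
// Names are illustrative and not taken from any existing library.
final class RowMajor {
    static double[] flatten(double[][] jagged) {
        int rows = jagged.length;
        int cols = rows == 0 ? 0 : jagged[0].length;
        // multiplyExact throws ArithmeticException if rows*cols overflows int,
        // which is exactly the limit this RFE is about.
        double[] flat = new double[Math.multiplyExact(rows, cols)];
        for (int i = 0; i < rows; i++) {
            if (jagged[i].length != cols) {
                throw new IllegalArgumentException("all rows must have equal length");
            }
            System.arraycopy(jagged[i], 0, flat, i * cols, cols);
        }
        return flat;
    }

    // Element (i, j) of a row-major matrix with the given column count.
    static double get(double[] flat, int cols, int i, int j) {
        return flat[i * cols + j];
    }

    public static void main(String[] args) {
        double[][] m = { {1, 2, 3}, {4, 5, 6} };
        double[] flat = flatten(m);
        System.out.println(get(flat, 3, 1, 2)); // element at row 1, column 2
    }
}
```

The copy doubles the peak memory footprint for the duration of the call, and it fails outright once rows*cols exceeds the int range.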
It should be possible to address arrays using 64-bit integers (long?), as this provides a seamless transition for users of 64-bit computers.
Risks and Assumptions
Use of array-of-array constructs (double[][] instead of double[]) is possible as a workaround. This feature is well implemented in C/C++ without any problem, so it should be quite technically feasible to implement.
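For illustration, the array-of-arrays workaround can be hidden behind a long-indexed wrapper along the following lines. This is a sketch under stated assumptions, not an existing API: the class ChunkedDoubleArray, its constructor parameters, and the power-of-two chunking scheme are all invented here.

```java
// Sketch of the array-of-arrays workaround with a long index.
// Hypothetical class; chunk size is 2**chunkBits elements.
final class ChunkedDoubleArray {
    private final double[][] chunks;
    private final long length;
    private final int chunkBits;
    private final long mask;

    ChunkedDoubleArray(long length, int chunkBits) {
        if (length < 0) throw new NegativeArraySizeException();
        if (chunkBits < 1 || chunkBits > 30) {
            throw new IllegalArgumentException("chunkBits must be in 1..30");
        }
        long chunkSize = 1L << chunkBits;
        this.length = length;
        this.chunkBits = chunkBits;
        this.mask = chunkSize - 1;
        // Number of chunks, rounded up; toIntExact rejects absurd sizes.
        int n = Math.toIntExact((length + chunkSize - 1) >>> chunkBits);
        chunks = new double[n][];
        long remaining = length;
        for (int c = 0; c < n; c++) {
            chunks[c] = new double[(int) Math.min(remaining, chunkSize)];
            remaining -= chunks[c].length;
        }
    }

    long length() { return length; }

    double get(long i) {
        check(i);
        return chunks[(int) (i >>> chunkBits)][(int) (i & mask)];
    }

    void set(long i, double v) {
        check(i);
        chunks[(int) (i >>> chunkBits)][(int) (i & mask)] = v;
    }

    private void check(long i) {
        if (i < 0 || i >= length) throw new IndexOutOfBoundsException("index " + i);
    }

    public static void main(String[] args) {
        ChunkedDoubleArray a = new ChunkedDoubleArray(10, 2); // tiny chunks for demonstration
        a.set(9, 3.5);
        System.out.println(a.get(9));
    }
}
```

Each access pays for an extra bounds check and indirection, and the chunks are separate heap objects, so the data still cannot be passed to code that expects a single double[].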
My group has requested this feature for several years. It is currently listed as one of the top 25 RFEs on http://bugs.sun.com/top25_rfes.do. Please help Java maintain its relevance by implementing this. I have several image processing applications that are severely limited by this bug; the images involved cannot be opened in most Java applications. These include electron microscopy and micro-CT images where storage of a single slice requires more entries than are allowable in a Java array.
Thank you for considering this RFE,