valhalla-dev Digest, Vol 6, Issue 3
peter.levart at gmail.com
Fri Dec 26 10:45:58 UTC 2014
My thoughts on sentinels, etc...
On 12/26/2014 02:28 AM, Thomas W wrote:
> There is much code already -- Map.get() -- returning a sentinel for "not
> found". My proposal is to expressly recognize that; and ensure that better
> values than 0 are used.
> My initial proposal was to use -1; not perfect I agree, but the bulk of C
> library APIs has been happy to do without this number for a very long time
> :) However, I'm happy to revise it to Integer.MIN_VALUE. 'double' could use
> either MIN_VALUE or NaN. 'boolean' will have false, and that's the best we
> can do.
> Since efficiency is the main reason to specialize, we shouldn't require
> Map.get()'s single-word return to be "blown out" to two; nor should
> wrapping in Optional be required.
If value types are done right, then "wrapping" a value type inside
another value type is not actually wrapping (like we are used to call it
with reference types), but embedding. The return of Optional<int> or any
Optional<ValueType> does not represent an implementation overhead of
another indirection like with reference types. If all goes well, then I
think the plan is for "blowing out" a single word int return to double
word Optional<int> means the additional word will be assigned to another
CPU register in machine code - hardly something we can measure.
> Unless we're going to drop Map.get() from the API, which we shouldn't, a
> sentinel is proveably the only solution.
> As with Objects, anybody who actually needs to distinguish
> Integer.MIN_VALUE from "not present" in Map<?,int> can use the same
> approaches as present:
> 1) check containsKey() first, then call get()
Not good for ConcurrentMap(s).
> 2) use getOrDefault()
Unless all the values are needed.
> 3) "mask" the sentinel with another value.
> Usage of the sentinel (for single-return APIs, or -- less recommended --
> for internal storage) will be up to the collection author.
I think that the desire to use a sentinel for internal storage in
value-type based data structures such as collections is greater than
just for returns from get() or such which are not problematic at all if
Optional<ValueType> is used instead. Take for example a HashMap<any K,
any V>. What would be the most efficient representation of specialized
HashMap<int, int> for example? Perhaps a linear-scan hash table
implemented as Map.Entry<any K, any V> array. So what is the most
efficient representation of specialized Map.Entry<int, int> ? Here,
using a sentinel value for "not-present" key can save 1/3 or even 1/2
(depends on alignment constraints on various architectures) of memory
used for such a structure. With larger value types as keys and/or
values, additional flag in Map.Entry implementation does not matter much
any more. So we might need two implementations - with and without
> Given the efficiency of a single machine-word, rather than two, and the
> absence of multiple-returns or out-parameters from the Java language, this
> solution is sane, efficient & suitable for performant use.
But value types as return types "are" in effect multiple-returns,
packaged in a disguise of a class-like thing.
More information about the valhalla-dev