brian.goetz at oracle.com
Wed May 18 14:57:24 UTC 2016
Great summary of the options.
For those who didn't read the whole thing:
- CE is bitwise equality -- "are these two things identical copies"
- OE is calling Object.equals()
- NE (for values) is the synthetic "recurse with == on primitive
components, NE on value components, and OE on reference components"
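To make the three notions concrete, here is a minimal sketch in today's Java. Point and Segment are hypothetical stand-ins for value types (plain final classes, since the language has no values yet), and the method names ce, oe, and ne are invented for illustration. Note that ce here can only show reference identity; the bitwise comparison of value copies has no equivalent in current Java.

```java
import java.util.Objects;

// Hypothetical stand-in for a value type: a final class with final fields.
final class Point {
    final int x;         // primitive component
    final String label;  // reference component

    Point(int x, String label) {
        this.x = x;
        this.label = label;
    }
}

// Hypothetical "value" whose components are themselves value-like.
final class Segment {
    final Point from;
    final Point to;

    Segment(Point from, Point to) {
        this.from = from;
        this.to = to;
    }
}

final class EqualityKinds {
    // CE on references: are these the identical object?
    static boolean ce(Object a, Object b) {
        return a == b;
    }

    // OE: whatever Object.equals() (or an override) says.
    static boolean oe(Object a, Object b) {
        return a.equals(b);
    }

    // NE for Point: == on the primitive component, OE on the reference
    // component (Objects.equals is the null-safe form of OE).
    static boolean ne(Point a, Point b) {
        return a.x == b.x && Objects.equals(a.label, b.label);
    }

    // NE for Segment: recurse with NE on the value components.
    static boolean ne(Segment a, Segment b) {
        return ne(a.from, b.from) && ne(a.to, b.to);
    }
}
```

Two Points built from equal components are not CE-equal (different objects) and not OE-equal (Point does not override equals), but they are NE-equal.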
If it were 1995, and we were inventing Java (and we didn't have our
heads addled with an interpreter-based cost model), what would we do? I
think we'd bind ==(ref,ref) to OE, with an (uglier-named) API point for
CE (e.g., Objects.isSameReference) which would be used (a) for
known-interned things, (b) for IdentityHashMap, (c) as a default
implementation of Object.equals(), and (d) possibly as a
short-circuiting optimization *inside* overrides of equals().
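As a rough sketch of what uses (c) and (d) might look like: the helper below is hypothetical (no isSameReference method exists in java.util.Objects today, so it lives in an invented Refs class), and Name is an invented example class. In current Java the helper can only be written in terms of ==.

```java
// Hypothetical home for the "uglier-named" CE API point.
final class Refs {
    // CE on references; in J' this would be the only way to ask for it.
    static boolean isSameReference(Object a, Object b) {
        return a == b;
    }
}

// Invented example class whose equals() override uses CE as a
// short-circuit before comparing state, as in use (d).
final class Name {
    private final String text;

    Name(String text) {
        this.text = text;
    }

    @Override
    public boolean equals(Object other) {
        if (Refs.isSameReference(this, other)) {
            return true; // identical reference, no need to compare fields
        }
        if (!(other instanceof Name)) {
            return false;
        }
        return text.equals(((Name) other).text);
    }

    @Override
    public int hashCode() {
        return text.hashCode();
    }
}
```

In J', client code would write a == b and get the equals() result, and would reach for the helper only when identity is really what is meant.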
This hypothetical world (call it J') still gives users the choice of CE
vs OE whenever they want, while nudging users towards OE (by giving it
the prime syntactic real estate) which is probably what they want most
of the time.
Why didn't we do this in 1995? Hard to know (I'll ask James next time I
see him), but I'd posit two main forces:
- C bias. Since C has *only* CE (and it was desirable to make Java
feel like "a safer C") it probably seemed like a big improvement already
to offer programmers both CE and OE on all references, and binding == to
OE probably seemed too radical at the time.
- Cost-model bias. In the Java 1.0 days, pointer comparison was
probably 100x faster in the interpreter than a virtual call to
Object.equals(). If binding == to OE was even considered, it was
probably deemed implausible.
Of course, both of these feel a bit silly 20 years later, but here we
are. So, in a J' world, what would we do with ==(val,val)? I think it
would be a no-brainer -- bind it to NE, since Java developers would
already associate == with a deeper comparison. Then we'd just have to
adjust whatever the API point for CE is to also accommodate CE on values,
and we'd be done.
But, we don't live in J' world. So our choices become:
P1: Bind ==(val,val) to CE, as we do with refs. Optimization challenges
with the usual (a==b || a.equals(b)) idiom, but the rules work the
same for values and refs.
P2: Bind ==(val,val) to NE. This is J' world for values and J world for
refs. (With even bigger optimization challenges for the (a==b ||
a.equals(b)) idiom.) Rules are different for values and refs, meaning
(a) users will have to keep in mind which world they're in, (b) when
migrating a class from ref to value they'll have to find and update all
equality comparisons (!), (c) writing code that's generic over values
and refs has to use an idiom that works on both, (d) when migrating code
from ref-generic to any-generic, inspect every equality comparison to
make sure it's still what was intended.
P3: Add a new equality operator. I've already been laughed at enough,
P4: Ban ==(val,val). This might be fine in value-only code, but it
complicates writing generic code, especially migrating generic code.
John points out that if == is CE, then (a==b||a.equals(b)) will
redundantly load the fields on failed ==. But, many equals
implementations start with "a==b" as a short-circuiting optimization,
which means "a==b" will be a common (pure) subexpression in the
resulting expansion (and for values, methods are monomorphic and will
get inlined more frequently), so the two checks can be collapsed.
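The collapsing argument can be illustrated by inlining by hand. This is only a sketch: Pair is an invented final class standing in for a value, fieldsEqual is a hypothetical helper, and the three methods show what an optimizer might do rather than what any particular JIT is known to do.

```java
// Invented stand-in for a value type with a typical equals() override.
final class Pair {
    final int left;
    final int right;

    Pair(int left, int right) {
        this.left = left;
        this.right = right;
    }

    // Hypothetical helper: the field-by-field part of equals().
    boolean fieldsEqual(Pair o) {
        return o != null && left == o.left && right == o.right;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true; // the common short-circuiting optimization
        }
        return o instanceof Pair && fieldsEqual((Pair) o);
    }

    @Override
    public int hashCode() {
        return 31 * left + right;
    }
}

final class Idiom {
    // The usual idiom as written by the user (a assumed non-null).
    static boolean viaIdiom(Pair a, Pair b) {
        return a == b || a.equals(b);
    }

    // The same expression after equals() is inlined by hand:
    // a == b now appears twice as a pure subexpression.
    static boolean inlined(Pair a, Pair b) {
        return a == b || (a == b || a.fieldsEqual(b));
    }

    // After collapsing the repeated a == b check.
    static boolean collapsed(Pair a, Pair b) {
        return a == b || a.fieldsEqual(b);
    }
}
```

All three methods return the same result for any non-null a; the last form performs the == check once.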
> Going back to op==, there are two plausible options for binding it to
> new types:
> (P1) Syntax of op==(val,val) and op==(any,any) binds to CE as with
> op==(ref,ref). Therefore, NE is uniformly reached by today's idiom,
> which traverses value fields twice.
> (P2) Syntax of op==(val,val) and op==(any,any) is direct access to
> NE. CE is reachable by experts at System.isEqualCopy. The old idiom
> for NE also works, but calls equals twice.
> (P3) Same as P1, op== is uniform access to CE. New op (spelled
> "===", ".==", "=~", etc.) is uniform, optimizable access to NE,
> attracting users away from legacy idiom for NE.