[records] Ancillary fields

Kevin Bourrillion kevinb at google.com
Wed Apr 18 18:46:47 UTC 2018

Lazy initialization is a massive pain to get right, so I'm very intrigued
by this proposal.

On Wed, Apr 18, 2018 at 10:58 AM, Brian Goetz <brian.goetz at oracle.com>
wrote:

> This is useful well beyond records.  For example, classes like `String`
> cache a lazily computed hash code; these classes could just do
>     private int cacheHash = computeHashCode();
>     public int hashCode() { return cacheHash; }

('course, String itself may not use this, as it prefers to save memory by
just letting rare values like "drumwood boulderhead" be uncacheable.)

Ahh, you missed the `lazy` keyword on there :-) Which is good because it
raises an issue: when you forget it, bad performance may result without
other observable consequence. Although, it's already the case that reading
code like the above ought to raise all kinds of alarm bells (e.g., now I
want to go check which fields computeHashCode() might be referring to, and
where *they're* initialized), so I *should* be looking for that `lazy`
keyword to put my mind at ease. So maybe this is okay.
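
For comparison, here is a minimal sketch of the hand-written idiom that a
`lazy` field would presumably replace. The `Name` class and its fields are
hypothetical, made up for illustration; the caching pattern is the usual
racy single-check idiom with 0 as the "not computed" marker:

```java
// Hypothetical class for illustration; not part of the proposal.
final class Name {
    private final String first;
    private final String last;

    // 0 doubles as "not computed yet", so a value whose real hash is 0
    // is simply recomputed on every call.
    private int cachedHash;

    Name(String first, String last) {
        this.first = first;
        this.last = last;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Name
                && first.equals(((Name) o).first)
                && last.equals(((Name) o).last);
    }

    @Override
    public int hashCode() {
        int h = cachedHash;      // read the field exactly once
        if (h == 0) {
            h = 31 * first.hashCode() + last.hashCode();
            cachedHash = h;      // benign race: every thread computes the same value
        }
        return h;
    }
}
```

Every step there (the sentinel, the single read, the argument for why the
race is benign) is something a reader has to verify by hand.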

I assume that, unlike other field initializers, I'm safe to refer to *any*
other field regardless of how and where that field is initialized. Right?

The intersection with primitives is interesting. I assume it gets secretly
created as an Integer? So there's a little extra hidden memory consumption.
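
To make the question concrete, this is roughly what I imagine a boxed
translation of `private lazy int area = width * height;` could look like.
It is only my guess at one possible desugaring, not something the proposal
states, and `Rect` and `area` are hypothetical names:

```java
// Hypothetical desugaring for illustration; not taken from the proposal.
final class Rect {
    private final int width;
    private final int height;

    // Boxed so that null can mean "not computed yet". This costs a
    // reference per instance, plus an Integer object whenever the value
    // falls outside the cached -128..127 range.
    private Integer area;

    Rect(int width, int height) {
        this.width = width;
        this.height = height;
    }

    int area() {
        Integer a = area;         // read the field once
        if (a == null) {
            a = width * height;   // autoboxes via Integer.valueOf
            area = a;
        }
        return a;                 // unboxes
    }
}
```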

For a reference type, what happens if the initialization produces `null`?
(I suggest throwing NPE, because I think the alternatives are worse?)
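
A minimal sketch of the NPE option, written as a library class so it can be
tried today. `Lazy` is a hypothetical helper, not an existing JDK class,
and the double-checked locking is my own assumption about what thread
safety would be wanted:

```java
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not an existing JDK class.
final class Lazy<T> {
    private final Supplier<? extends T> initializer;
    private volatile T value;   // null means "not computed yet"

    Lazy(Supplier<? extends T> initializer) {
        this.initializer = Objects.requireNonNull(initializer);
    }

    T get() {
        T v = value;
        if (v == null) {
            synchronized (this) {
                v = value;
                if (v == null) {
                    // Reject null so it stays usable as the sentinel.
                    v = Objects.requireNonNull(
                            initializer.get(), "lazy initializer returned null");
                    value = v;
                }
            }
        }
        return v;
    }
}
```

Throwing keeps null free to act as the "not yet computed" marker; allowing
null would need a second flag field or a private sentinel object.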

I pondered also allowing a method to be marked lazy (memoized, really) and
let the field(s) be created behind the scenes to store its result, but the
risk of that being applied to an impure method is probably too scary.
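
The failure mode I have in mind looks like this. The `Counter` class is
hypothetical, and `doubledMemoized()` just stands in for what a memoized
`doubled()` would effectively do behind the scenes:

```java
// Hypothetical illustration of memoizing an impure method.
final class Counter {
    private int count;
    private Integer memoizedDoubled;   // stands in for the hidden field

    void increment() {
        count++;
    }

    // Impure: the result depends on the mutable field `count`.
    int doubled() {
        return count * 2;
    }

    // What a memoized version of doubled() would effectively do.
    int doubledMemoized() {
        if (memoizedDoubled == null) {
            memoizedDoubled = doubled();
        }
        return memoizedDoubled;
    }
}
```

After one `increment()`, both methods return 2. After a second
`increment()`, `doubled()` returns 4 while `doubledMemoized()` silently
keeps returning 2.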

On 4/13/2018 1:15 PM, Kevin Bourrillion wrote:
> As one of the voices demanding we allow ancillary fields, I can confirm
> that I had only these derived-state use cases in mind. I don't see anything
> else as legitimate. That is, I think that the semantic invariants you're
> trying to preserve for records are worth fighting for, and additional
> *non-derived* state would violate them.
> On Fri, Apr 13, 2018 at 9:46 AM, Brian Goetz <brian.goetz at oracle.com>
> wrote:
>> Let's see if we can make some progress on the elephant in the room --
>> ancillary fields.  Several have expressed the concern that without the
>> ability to declare some additional instance state, the feature will be too
>> limited.
>> The argument in favor of additional fields is the obvious one; more
>> classes can be records.  And there are some arguably valid use cases for
>> additional fields that don't conflict with the design center for records.
>> The best example is derived state:
>>  - When a field is a cached property derived from the record state (such
>> as how String caches its hashCode)
>> Arguably, if a field is derived deterministically from immutable record
>> state, then it is not creating any new record state.  This surely seems
>> within the circle.
>> The argument against is more of a slippery-slope one; I believe
>> developers would like to view this feature through the lens of syntactic
>> boilerplate, rather than through semantics.  If we let them, they would
>> surely and routinely do the following:
>>     record A(int a, int b) {
>>         private int c;
>>         public A(int a, int b, int c) {
>>             this(a, b);
>>             this.c = c;
>>         }
>>         public boolean equals(Object other) {
>>             return default.equals(other) && ((A) other).c == c;
>>         }
>>     }
>> Here, `c` is surely part of the state of `A`.  And, they wouldn't even
>> know what they'd lost; they would just assume records are a way of
>> "kickstarting" a class declaration with some public fields, and then you
>> can mix in whatever private state you want.
>> Why is this bad?  While "reduced-boilerplate classes" is a valid feature
>> idea, our design goal for records is much more than that. The semantic
>> constraints on records are valuable because they yield useful invariants;
>> that they are "just" their state vector, that they can be freely taken
>> apart and put back together with no loss of information, and hence can be
>> freely serialized/marshaled to JSON and back, etc.
>> We currently prohibit records like `A` via a number of restrictions: no
>> additional fields, no override of equals.  We don't need all of these
>> restrictions to achieve the desired goal, but we also can't relax them all
>> without opening the gate.  So we should decide carefully which we want to
>> relax, as making the wrong choice constrains us in the future.
>> Before I dive into details of how we might extend records to support the
>> case of "cached derived state", I'd like to first come to some agreement
>> that this covers the use cases that we think fall into the "legitimate"
>> uses of additional fields.
>> On 3/16/2018 2:55 PM, Brian Goetz wrote:
>>> There are a number of potentially open details on the design for
>>> records.  My inclination is to start with the simplest thing that preserves
>>> the flexibility and expectations we want, and consider opening up later as
>>> necessary.
>>> One of the biggest issues, which Kevin raised as a must-address issue,
>>> is having sufficient support for precondition validation. Without
>>> foreclosing on the ability to do more later with declarative guards, I
>>> think the recent construction proposal meets the requirement for
>>> lightweight enforcement with minimal or no duplication.  I'm hopeful that
>>> this bit is "there".
>>> Our goal all along has been to define records as being “just macros” for
>>> a finer-grained set of features.  Some of these are motivated by
>>> boilerplate; some are motivated by semantics (coupling semantics of API
>>> elements to state.)  In general, records will get there first, and then
>>> ordinary classes will get the more general feature, but the default answer
>>> for "can you relax records, so I can use it in this case that almost but
>>> doesn't quite fit" should be "no, but there will probably be a feature
>>> coming that makes that class simpler, wait for that."
>>> Some other open issues (please see my writeup at
>>> http://cr.openjdk.java.net/~briangoetz/amber/datum.html for reference),
>>> and my current thoughts on these, are outlined below. Comments welcome!
>>>  - Extension.  The proposal outlines a notion of abstract record, which
>>> provides a "width subtyped" hierarchy.  Some have questioned whether this
>>> carries its weight, especially given how Scala doesn't support case-to-case
>>> extension (some see this as a bug, others as an existence proof.)  Records
>>> can implement interfaces.
>>>  - Concrete records are final.  Relaxing this adds complexity to the
>>> equality story; I'm not seeing good reasons to do so.
>>>  - Additional constructors.  I don't see any reason why additional
>>> constructors are problematic, especially if they are constrained to
>>> delegate to the default constructor (which in turn is made far simpler if
>>> there can be statements ahead of the this() call.) Users may find the lack
>>> of additional constructors to be an arbitrary limitation (and they'd
>>> probably be right.)
>>>  - Static fields.  Static fields seem harmless.
>>>  - Additional instance fields.  These are a much bigger concern. While
>>> the primary arguments against them are of the "slippery slope" variety, I
>>> still have deep misgivings about supporting unrestricted non-principal
>>> instance fields, and I also haven't found a reasonable set of restrictions
>>> that makes this less risky.  I'd like to keep looking for a better story
>>> here, before just caving on this, as I worry doing so will end up biting us
>>> in the back.
>>>  - Mutability and accessibility.  I'd like to propose an odd choice
>>> here, which is: fields are final and package (protected for abstract
>>> records) by default, but finality can be explicitly opted out of
>>> (non-final) and accessibility can be explicitly widened (public).
>>>  - Accessors.  Perhaps the most controversial aspect is that records are
>>> inherently transparent to read; if something wants to truly encapsulate
>>> state, it's not a record.  Records will eventually have pattern
>>> deconstructors, which will expose their state, so we should go out of the
>>> gate with the equivalent.  The obvious choice is to expose read accessors
>>> automatically.  (These will not be named getXxx; we are not burning the
>>> ill-advised Javabean naming conventions into the language, no matter how
>>> much people think it already is.)  The obvious naming choice for these
>>> accessors is fieldName().  No provision for write accessors; that's
>>> bring-your-own.
>>>  - Core methods.  Records will get equals, hashCode, and toString.
>>> There's a good argument for making equals/hashCode final (so they can't be
>>> explicitly redeclared); this gives us stronger preservation of the data
>>> invariants that allow us to safely and mechanically snapshot / serialize /
>>> marshal (we'd definitely want this if we ever allowed additional instance
>>> fields.)  No reason to suppress override of toString, though. Records could
>>> be safely made cloneable() with automatic support too (like arrays), but
>>> not clear if this is worth it (it's darn useful for arrays, though.)  I
>>> think the auto-generated getters should be final too; this leaves arrays as
>>> second-class components, but I am not sure that bothers me.
> --
> Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com
