<html><head><meta content="text/html; charset=UTF-8" http-equiv="content-type"><style type="text/css">@import url('https://themes.googleusercontent.com/fonts/css?kit=dpiI8CyVsrzWsJLBFKehGpLhv3qFjX7dUn1mYxfCXhI');.lst-kix_oeiab5nhup5a-5>li:before{content:"\0025a0 "}.lst-kix_oeiab5nhup5a-6>li:before{content:"\0025cf "}.lst-kix_oeiab5nhup5a-3>li:before{content:"\0025cf "}.lst-kix_oeiab5nhup5a-4>li:before{content:"\0025cb "}.lst-kix_oeiab5nhup5a-7>li:before{content:"\0025cb "}.lst-kix_oeiab5nhup5a-8>li:before{content:"\0025a0 "}ul.lst-kix_oeiab5nhup5a-8{list-style-type:none}.lst-kix_oeiab5nhup5a-0>li:before{content:"\0025cf "}ul.lst-kix_oeiab5nhup5a-7{list-style-type:none}ul.lst-kix_oeiab5nhup5a-4{list-style-type:none}.lst-kix_oeiab5nhup5a-1>li:before{content:"\0025cb "}.lst-kix_oeiab5nhup5a-2>li:before{content:"\0025a0 "}ul.lst-kix_oeiab5nhup5a-3{list-style-type:none}ul.lst-kix_oeiab5nhup5a-6{list-style-type:none}ul.lst-kix_oeiab5nhup5a-5{list-style-type:none}ul.lst-kix_oeiab5nhup5a-0{list-style-type:none}ul.lst-kix_oeiab5nhup5a-2{list-style-type:none}ul.lst-kix_oeiab5nhup5a-1{list-style-type:none}ol{margin:0;padding:0}table td,table th{padding:0}.c0{margin-left:36pt;padding-top:0pt;padding-left:0pt;padding-bottom:0pt;line-height:1.15;orphans:2;widows:2;text-align:left}.c13{padding-top:18pt;padding-bottom:6pt;line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}.c6{color:#000000;font-weight:400;text-decoration:none;vertical-align:baseline;font-size:11pt;font-family:"Roboto";font-style:normal}.c3{color:#000000;font-weight:400;text-decoration:none;vertical-align:baseline;font-size:11pt;font-family:"Courier 
New";font-style:normal}.c17{color:#000000;font-weight:400;text-decoration:none;vertical-align:baseline;font-size:20pt;font-family:"Arial";font-style:normal}.c2{color:#000000;font-weight:400;text-decoration:none;vertical-align:baseline;font-size:11pt;font-family:"Arial";font-style:normal}.c9{color:#000000;font-weight:400;text-decoration:none;vertical-align:baseline;font-size:16pt;font-family:"Arial";font-style:normal}.c7{padding-top:20pt;padding-bottom:6pt;line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}.c1{padding-top:0pt;padding-bottom:0pt;line-height:1.15;orphans:2;widows:2;text-align:left}.c19{color:#000000;text-decoration:none;vertical-align:baseline;font-size:11pt;font-family:"Courier New";font-style:normal}.c14{text-decoration-skip-ink:none;-webkit-text-decoration-skip:none;color:#1155cc;text-decoration:underline}.c4{background-color:#ffffff;font-family:"Roboto";font-weight:700}.c15{font-weight:400;font-family:"Courier New"}.c20{padding:0;margin:0}.c10{font-family:"Roboto";font-weight:400}.c18{max-width:468pt;padding:72pt 72pt 72pt 
72pt}.c16{color:inherit;text-decoration:inherit}.c11{font-weight:700}.c8{height:11pt}.c5{background-color:#ffffff}.c12{font-style:italic}.title{padding-top:0pt;color:#000000;font-size:26pt;padding-bottom:3pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}.subtitle{padding-top:0pt;color:#666666;font-size:15pt;padding-bottom:16pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}li{color:#000000;font-size:11pt;font-family:"Arial"}p{margin:0;color:#000000;font-size:11pt;font-family:"Arial"}h1{padding-top:20pt;color:#000000;font-size:20pt;padding-bottom:6pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h2{padding-top:18pt;color:#000000;font-size:16pt;padding-bottom:6pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h3{padding-top:16pt;color:#434343;font-size:14pt;padding-bottom:4pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h4{padding-top:14pt;color:#666666;font-size:12pt;padding-bottom:4pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h5{padding-top:12pt;color:#666666;font-size:11pt;padding-bottom:4pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h6{padding-top:12pt;color:#666666;font-size:11pt;padding-bottom:4pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;font-style:italic;orphans:2;widows:2;text-align:left}</style></head><body class="c5 c18"><h1 class="c7" id="h.l15u8dnuujsp"><span class="c17">Record Proposal Feedback Based On @AutoValue</span></h1><p class="c1"><span>The most recent version of the </span><span class="c14"><a class="c16" href="https://www.google.com/url?q=https://mail.openjdk.java.net/pipermail/amber-spec-experts/2019-March/001028.html&sa=D&ust=1554247827696000">records proposal</a></span><span> puts 
</span><span class="c2">many restrictions on records, with the aim of producing a more focused, opinionated tool. Most notably: record fields must be final; record classes must be final; field accessors must be public. There has broadly been support for these ideas, from Google and from other JDK contributors: it appeals to our sensibilities. However, there have also been some questions asked about whether the restrictions imposed will hinder adoption, and it’s hard to estimate that in the abstract.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span>Google would like to ensure Java gets the best possible version of the record feature, and so in addition to </span><span class="c14"><a class="c16" href="https://www.google.com/url?q=https://mail.openjdk.java.net/pipermail/amber-spec-experts/2019-March/001041.html&sa=D&ust=1554247827697000">thinking abstractly about what features sound good for records</a></span><span class="c2">, I have spent some time collecting concrete data from the Google codebase, to determine how well a version of records would fit in there, and what changes to the records proposal might make it better.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span>A natural starting point is to compare with </span><span class="c14"><a class="c16" href="https://www.google.com/url?q=https://github.com/google/auto/blob/master/value/userguide/index.md&sa=D&ust=1554247827697000">@AutoValue</a></span><span class="c2">, an annotation processor Google publishes and uses to auto-generate implementations for simple data classes. We use these internally for roughly the same kinds of things records might be used for, and so data about how they are used will help when evaluating the impact of various facets of the records proposal. 
Accordingly, I have collected some statistics about all the @AutoValue classes in Google’s codebase.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span class="c2">Below I share this data, and present some recommendations based on it. When it is useful and possible, I also include simplified and fictionalized code samples based on the real code in Google’s codebase, to clarify what patterns I am talking about.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span class="c2">One additional thing I would have liked to do is to somehow find classes which were “almost” records, but which ended up not using @AutoValue. A survey of these might help us decide what changes we could make to increase adoption, or simply confirm for us, “Yes, it’s a good thing we included restriction X, because this class is a bad candidate for a record, but without restriction X it might have become a record”. Alas, unsurprisingly, it is much easier to find actual @AutoValue classes than to design a heuristic for “almost an @AutoValue”, and so I have not done this.</span></p><h2 class="c13" id="h.l50yr4hol5vu"><span>Summary and Recommendations (TL;DR)</span></h2><p class="c1"><span class="c2">For those who don’t care to slog through the whole document, I present here a summary of recommendations. Many of these recommendations simply echo what is already in the proposal, because the data I found supports the proposal.</span></p><ul class="c20 lst-kix_oeiab5nhup5a-0 start"><li class="c0"><span class="c2">Records can expect to be about as common as enums</span></li><li class="c0"><span class="c2">Records should be allowed to implement interfaces, and perhaps to extend a superclass, but should not allow their own subclasses (except perhaps abstract records)</span></li><li class="c0"><span class="c2">Records should have a very lightweight syntax for the common case, preferably one line. 
@AutoValue’s clunkier syntax may be reducing adoption for some use cases.</span></li><li class="c0"><span class="c2">Records should be immutable. The language should enforce shallow immutability (by making all fields final), and style guides should recommend deep immutability. </span></li><li class="c0"><span class="c2">We should consider adding an alternative way to construct records besides a constructor with positional parameters. Builders are popular for @AutoValue, but perhaps something better could be done for records as a language feature.</span></li><li class="c0"><span class="c2">Records do not need language-level support for withFoo methods, or toBuilder, even if builder support is included.</span></li><li class="c0"><span>Records do not need a way to include private/hidden state, or to memoize derived properties</span></li><li class="c0"><span>Records should allow implementing Object methods by hand, rejecting the auto-generated implementation, but we should expect this to be done rarely</span></li></ul><h2 class="c13" id="h.eo4fjzp9jyls"><span class="c9">Popularity</span></h2><p class="c1"><span>One simple question to ask is: how often would records be used? If records end up being used very rarely, we may regret spending too much time designing and implementing them, or may wish we had made them more flexible instead of more restrictive. To measure popularity, I simply asked “what percentage of all named, non-local classes are @AutoValue classes?” I limit this to named, non-local classes because these are the </span><span>scopes</span><span> in which @AutoValue is able to operate; a record integrated more tightly with the language could replace local classes, but @AutoValue can’t. The answer is: </span><span class="c11">3.0% of Google’s named, concrete classes are @AutoValue classes</span><span>. Is that a lot, or a little? In isolation it’s hard to say. For comparison, I asked how many of Google’s classes fall into other interesting categories. 
At two opposite ends of the spectrum, </span><span class="c11">8.9% of classes are anonymous</span><span>, while just </span><span class="c11">0.05% of classes are method-local named classes</span><span>. A particularly promising comparison is to enums, a feature which similarly gives up some flexibility for increased expressive power, and to which Brian refers in the records proposal. Of Google’s named, concrete classes, </span><span class="c11">3.7%</span><span class="c2"> of them are enums.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span class="c2">So, it seems we are past the first hurdle: there is a healthy appetite for “simple immutable data carriers”, if we make a compelling offering. We can expect them to be less popular than anonymous classes, but roughly as popular as enums. Or perhaps more popular: quite possibly, developers would be more keen to use records if they were a language feature instead of a Google library. It’s hard to know for sure.</span></p><h2 class="c13" id="h.1j04x5ersb6d"><span class="c9">Superclasses</span></h2><p class="c1"><span>How often do @AutoValue classes make use of inheritance? </span><span class="c4">77% </span><span class="c10 c5">of</span><span class="c10 c5"> @AutoValue classes are “islands” in the inheritance graph: they </span><span class="c4">extend Object, and implement no interfaces</span><span class="c10 c5">. </span><span class="c4">15%</span><span class="c10 c5"> of @AutoValue classes </span><span class="c4">extend Object and implement exactly one interface</span><span class="c10 c5">. A mere </span><span class="c4">4% extend some class other than Object, and implement no interfaces</span><span class="c6 c5">. 
Very few do anything else (implement 2+ interfaces, or implement interface(s) while also extending a class).</span></p><p class="c1 c8"><span class="c6 c5"></span></p><p class="c1"><span class="c5 c10">Just like @AutoValue, the current proposal plans to allow implementing interfaces and/or extending a class. That seems reasonable, but extending a class is rare enough that we could consider forbidding it if that fits better onto the semantic goals of “simple data carriers”: it won’t harm too large a percentage of the usages. Perhaps, for example, we could reserve the </span><span class="c10 c5 c12">extends</span><span class="c6 c5"> syntax for the future possibility of extending “abstract records” that Brian suggested. Still, it would be reasonable to allow normal subclassing, if we judge that it helps achieve the semantic goals of records.</span></p><p class="c1 c8"><span class="c6 c5"></span></p><p class="c1"><span class="c6 c5">One possible source of skew in this data: since @AutoValue determines its fields by identifying abstract 0-argument methods, rather than by explicitly listing them, some @AutoValue classes “inherit their API” by extending an abstract class, or implementing an interface, containing some abstract accessor methods. I don’t have a good estimate for how common this is. Such uses are perhaps arguments in favor of the “extending abstract records” feature. However, it is also interesting for a non-record class to conform to the same interface. 
Consider, for example, this pattern I have seen a few times:</span></p><p class="c1 c8"><span class="c6 c5"></span></p><p class="c1"><span class="c3 c5">public interface JobInfo {</span></p><p class="c1"><span class="c3 c5"> String session();</span></p><p class="c1"><span class="c15 c5"> </span><span class="c15 c5">boolean</span><span class="c3 c5"> privileged();</span></p><p class="c1"><span class="c3 c5"> Instant startTime();</span></p><p class="c1"><span class="c3 c5">}</span></p><p class="c1 c8"><span class="c3 c5"></span></p><p class="c1"><span class="c3 c5">@AutoValue public abstract class FakeJobInfo implements JobInfo {</span></p><p class="c1"><span class="c15 c5"> // </span><span class="c11 c5 c19">Note no fields specified: they are inherited from JobInfo!</span></p><p class="c1"><span class="c3 c5"> static Builder builder() {return new AutoValue_FakeJobInfo.Builder();}</span></p><p class="c1"><span class="c3 c5"> @AutoValue.Builder public interface Builder {</span></p><p class="c1"><span class="c3 c5"> Builder setSession(String session);</span></p><p class="c1"><span class="c3 c5"> Builder setPrivileged(boolean privileged);</span></p><p class="c1"><span class="c3 c5"> Builder setStartTime(Instant startTime);</span></p><p class="c1"><span class="c3 c5"> FakeJobInfo build();</span></p><p class="c1"><span class="c3 c5"> }</span></p><p class="c1"><span class="c3 c5">}</span></p><p class="c1 c8"><span class="c3 c5"></span></p><p class="c1"><span class="c3 c5">public class JobRegistry {</span></p><p class="c1"><span class="c3 c5"> private Database database;</span></p><p class="c1"><span class="c3 c5"> // ...</span></p><p class="c1"><span class="c3 c5"> private class JobInfoImpl implements JobInfo {</span></p><p class="c1"><span class="c3 c5"> private JobId id;</span></p><p class="c1"><span class="c3 c5"> public JobInfoImpl(JobId id) {this.id = id;}</span></p><p class="c1"><span class="c3 c5"> public String session() {return database.lookupSession(id);}</span></p><p 
class="c1"><span class="c3 c5"> public boolean privileged() {return database.isPrivileged(id);}</span></p><p class="c1"><span class="c3 c5"> public Instant startTime() {</span></p><p class="c1"><span class="c3 c5"> return Instant.ofEpochSecond(database.startTime(id));</span></p><p class="c1"><span class="c3 c5"> }</span></p><p class="c1"><span class="c3 c5"> }</span></p><p class="c1"><span class="c3 c5">}</span></p><p class="c1 c8"><span class="c6 c5"></span></p><p class="c1"><span class="c6 c5">Of course I have simplified the logic in JobInfoImpl, but the idea is that there is an interface for looking things up, a fake implementation (used in tests) that is a simple record, and a non-record implementation for production use that gets its data from some other source.</span></p><p class="c1 c8"><span class="c6 c5"></span></p><p class="c1"><span class="c10 c5">I think the records proposal as written already supports this use case semantically: we can define the interface first, then define a record that implements it. However, we don’t get any code for free: we have to repeat the list of properties, defining them once in the interface and then again in the record’s list of fields. One interesting possibility would be some connection between records and interfaces. </span><span class="c4">Perhaps a record definition could, upon request, also produce an interface that can be conformed to by some non-record implementation</span><span class="c6 c5">. Alternatively, a record could inherit fields based on the interfaces it implements, as @AutoValue does; however, since records do not normally have their fields defined via abstract methods, I think this approach fits less well for records than it does for @AutoValue.</span></p><h2 class="c13" id="h.q8e6tf50pn0l"><span>Subclasses</span></h2><p class="c1"><span class="c10 c5">Another inheritance-related question: should subclasses of records be allowed? 
Google’s @AutoValue documentation strongly discourages this, but we cannot make it illegal because @AutoValue uses subclassing as an implementation detail (it generates a subclass of your abstract class). So, we can ask how many authors decided to write additional subclasses despite the warnings. Just </span><span class="c4">0.26% of @AutoValue classes have subclasses</span><span class="c10 c5"> (aside from the auto-generated one that we expect). An inspection of some of these instances doesn’t suggest a compelling case for subclassing of records. These subclasses are mostly stubs for testing: for example, overriding accessors to throw an exception, or to return a value that could have just been specified in the constructor. The authors do not seem to be intentionally working around some </span><span class="c10 c5">restriction</span><span class="c6 c5"> of @AutoValue. </span></p><p class="c1 c8"><span class="c6 c5"></span></p><p class="c1"><span class="c5 c6">One example:</span></p><p class="c1 c8"><span class="c6 c5"></span></p><p class="c1"><span class="c3 c5">@AutoValue public abstract class Document {</span></p><p class="c1"><span class="c3 c5"> public abstract String text();</span></p><p class="c1"><span class="c3 c5"> public abstract Language language();</span></p><p class="c1"><span class="c3 c5"> // and 4 more fields...</span></p><p class="c1"><span class="c3 c5">}</span></p><p class="c1 c8"><span class="c3 c5"></span></p><p class="c1"><span class="c3 c5">public class DocumentTestHelper {</span></p><p class="c1"><span class="c3 c5"> public static Document instance() {return INSTANCE;}</span></p><p class="c1"><span class="c3 c5"> private static final Document INSTANCE = new ThrowingDocument();</span></p><p class="c1"><span class="c3 c5"> private static class ThrowingDocument extends Document {</span></p><p class="c1"><span class="c3 c5"> private UnsupportedOperationException cannotDoThis() {</span></p><p class="c1"><span class="c3 c5"> return new 
UnsupportedOperationException("Cannot use this!");</span></p><p class="c1"><span class="c3 c5"> }</span></p><p class="c1"><span class="c3 c5"> @Override public String text() {throw cannotDoThis();}</span></p><p class="c1"><span class="c3 c5"> @Override public Language language() {throw cannotDoThis();}</span></p><p class="c1"><span class="c3 c5"> // and 4 more fields doing the same thing...</span></p><p class="c1"><span class="c3 c5"> }</span></p><p class="c1"><span class="c3 c5">}</span></p><p class="c1 c8"><span class="c6 c5"></span></p><p class="c1"><span class="c10 c5">The rarity of subclasses (and lack of any convincing subclasses) argues in favor of the restriction that </span><span class="c4">all records must be final </span><span class="c6 c5">(except, perhaps, some kind of abstract record).</span></p><h2 class="c13" id="h.shvjpt22f2s4"><span class="c9">Visibility and Scoping</span></h2><p class="c1"><span>I wanted to know whether people are using @AutoValue mostly for public API “contracts”, or for internal implementation details. To answer this question, I asked of each @AutoValue class whether it is a nested class or top-level, and what visibility modifier it declares. This misses some subtlety: I didn’t pay attention to </span><span class="c12">effective visibility</span><span class="c2">, so a public @AutoValue nested inside a package-private class would look public in this analysis.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span class="c11">62%</span><span> of @AutoValue classes are top-level classes, of which </span><span class="c11">84%</span><span> are public; the rest are necessarily package-private. The remaining </span><span class="c11">38%</span><span> of @AutoValue classes are nested classes, divided evenly between public and package-private (none are protected or private, because @AutoValue does not support such visibilities). 
Thus, </span><span class="c11">almost 75% of @AutoValue classes are public</span><span class="c2">, suggesting they are used as part of some contract between two or more classes. </span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span class="c2">This is a bit higher than I would have guessed. I suspect it is because while @AutoValue does a good job of meeting the “semantic goals” of records, it still has a fair amount of boilerplate, and developers do not like to go to the trouble of defining one for one-off data types used within a method. Consider Brian’s topThreePeople example. I have reproduced it below for convenience, and included an alternate implementation using @AutoValue:</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span class="c3">public class PersonDatabase {</span></p><p class="c1"><span class="c3"> List<Person> topThreePeopleUsingRecord(List<Person> list) {</span></p><p class="c1"><span class="c3"> record PersonX(Person p, int hash) {</span></p><p class="c1"><span class="c3"> PersonX(Person p) {</span></p><p class="c1"><span class="c3"> this(p, p.name().toUpperCase().hashCode());</span></p><p class="c1"><span class="c3"> }</span></p><p class="c1"><span class="c3"> }</span></p><p class="c1 c8"><span class="c3"></span></p><p class="c1"><span class="c3"> return list.stream()</span></p><p class="c1"><span class="c3"> .map(PersonX::new)</span></p><p class="c1"><span class="c3"> .sorted(Comparator.comparingInt(PersonX::hash))</span></p><p class="c1"><span class="c3"> .limit(3)</span></p><p class="c1"><span class="c3"> .map(PersonX::p)</span></p><p class="c1"><span class="c3"> .collect(toList());</span></p><p class="c1"><span class="c3"> }</span></p><p class="c1 c8"><span class="c3"></span></p><p class="c1"><span class="c3"> @AutoValue abstract static class HashedPerson {</span></p><p class="c1"><span class="c3"> public abstract Person p();</span></p><p class="c1"><span class="c3"> public abstract int 
hash();</span></p><p class="c1"><span class="c3"> public static HashedPerson create(Person p) {</span></p><p class="c1"><span class="c3"> return new AutoValue_PersonDatabase_HashedPerson(p, p.name().toUpperCase().hashCode());</span></p><p class="c1"><span class="c3"> }</span></p><p class="c1"><span class="c3"> }</span></p><p class="c1 c8"><span class="c3"></span></p><p class="c1"><span class="c3"> List<Person> topThreePeopleUsingAutoValue(List<Person> list) {</span></p><p class="c1"><span class="c3"> return list.stream()</span></p><p class="c1"><span class="c3"> .map(HashedPerson::create)</span></p><p class="c1"><span class="c3"> .sorted(Comparator.comparingInt(HashedPerson::hash))</span></p><p class="c1"><span class="c3"> .limit(3)</span></p><p class="c1"><span class="c3"> .map(HashedPerson::p)</span></p><p class="c1"><span class="c3"> .collect(toList());</span></p><p class="c1"><span class="c3"> }</span></p><p class="c1"><span class="c3">}</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span>The @AutoValue “record” takes 7 lines to declare instead of 5, requires you to look up or remember the special naming scheme it uses, and also “leaks” into the enclosing class from the method where it really belongs. </span><span class="c11">If we want records to be more useful as implementation details, we should ensure there is a very low-overhead way of defining them</span><span class="c2">. Brian’s current proposal is promising in this regard, allowing a simple one-line definition for lightweight records.</span></p><h2 class="c13" id="h.ynacwdiz3l9s"><span class="c9">Mutability</span></h2><p class="c1"><span>The most recent </span><span>restriction</span><span> proposed for records is that all fields must be final. Google broadly encourages immutability, and so we support this idea, but can we prove that developers agree? 
It’s hard to collect unbiased data on this: since @AutoValue doesn’t define </span><span class="c12">fields</span><span>, but rather defines </span><span class="c12">named accessors</span><span class="c2"> and hides the fields from you, there is no way for a developer to say “hey wait, I wanted a mutable field”, except by defining the field themselves...but even this is hard to do! @AutoValue allows you to define fields independently, but it will only call the no-argument constructor of your class, so there’s no way to initialize those fields except by relying on side effects. So, developers who really want a mutable field won’t be using @AutoValue, and won’t appear in the data I collected.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span>However, we have static analysis tools that issue compiler warnings if you put an array or other mutable </span><span>object</span><span class="c2"> (e.g. collection) in an @AutoValue. So instead of looking at the current state of the codebase, I looked at data about how developers reacted to the static analysis warnings. I sampled 304 instances of warnings where someone felt strongly enough to point them out during code review: 272 of these actions were to say “this is a good warning, and you should fix your class”, and 32 were to say “This warning is not useful in this case.” I do not have data for developers who saw the warning and fixed it on their own before getting to code review.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span>This warning is a relatively recent addition to our static analysis tooling, so there are some committed instances of @AutoValue classes with array fields from before that time, and additionally some cases where developers have reacted to the new warning by simply adding @SuppressWarnings to their existing @AutoValue class. 
</span><span class="c11">0.49%</span><span> of @AutoValue classes have an array member, and </span><span class="c11">0.06%</span><span class="c2"> of @AutoValue classes contain a @SuppressWarnings annotation for this warning.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span>So, broadly it seems that developers agree that </span><span class="c11">it’s better for records to be deeply immutable</span><span class="c2">, but a small percentage of rebels yearn to mutate their data carriers, or at any rate don’t want to refactor their legacy code to hew closer to the semantics of records.</span></p><h2 class="c13" id="h.7wr58k4q004e"><span class="c9">Construction</span></h2><p class="c1"><span class="c2">When defining an @AutoValue, you don’t get a public constructor for free, the way you would with a proposed record. Instead, you get a private generated constructor for free, and must either define a static factory method that delegates to the constructor, or define an abstract class to act as a builder for you; in the latter case, @AutoValue implements the builder, but there is still a fair bit more code to write in defining the methods that the builder class should have.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span>Half of @AutoValue classes do the simplest thing: they define a single static factory which delegates to the generated constructor. 10% define two different factories. These groups, totaling 60% of @AutoValue classes, map well onto records as currently proposed. However, </span><span class="c11">a third of @AutoValue classes think it’s worth the trouble to define a builder</span><span> instead of, or in addition to, the static factory, even though they have to write a bunch more code to support it. 
This is an area where records could serve developers’ needs better, by offering some kind of opt-in support for </span><span class="c11">generating a builder to go with your data carrier</span><span class="c2">.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span>But </span><span class="c12">why</span><span class="c2"> do people want a builder? How do they use it? Perhaps what they really want is named arguments, or default arguments. A builder may just be the best @AutoValue can do with the language as-is, but a new feature can try something bolder. In some use cases I see, every use of the builder sets every field explicitly, so the main advantage of a builder is that the field names are associated with their values at the construction call site. Such classes would probably be happy with a constructor with named arguments.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span class="c2">On the other hand, I also see use cases where the builder is used to avoid specifying values for Optional<T> fields. Consider:</span></p><p class="c1"><span class="c3">public record Response(Optional<ServiceId> provider,</span></p><p class="c1"><span class="c3"> Optional<ResponseType> responseType,</span></p><p class="c1"><span class="c3"> Optional<Action> action,</span></p><p class="c1"><span class="c3"> Optional<String> referenceUrl) {</span></p><p class="c1"><span class="c3"> // Empty. 
This is a very bland record.</span></p><p class="c1"><span class="c3">}</span></p><p class="c1 c8"><span class="c3"></span></p><p class="c1"><span class="c3">// ...</span></p><p class="c1 c8"><span class="c3"></span></p><p class="c1"><span class="c3">public Response respond(Action action) {</span></p><p class="c1"><span class="c3"> return new Response(Optional.empty(), Optional.of(ResponseType.ACTION), Optional.of(action), Optional.empty());</span></p><p class="c1"><span class="c3">}</span></p><p class="c1 c8"><span class="c3"></span></p><p class="c1"><span class="c3">// ...</span></p><p class="c1 c8"><span class="c3"></span></p><p class="c1"><span class="c3">public Response redirect(String url) {</span></p><p class="c1"><span class="c3"> return new Response(Optional.empty(), Optional.empty(), Optional.empty(), Optional.of(url));</span></p><p class="c1"><span class="c3">}</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span>All the Optional wrappers have muddied up the call sites a lot, and the use of positional constructor parameters makes it hard to tell what is being specified in each usage. The @AutoValue version of this record uses a builder, and so replaces </span><span class="c15">redirect</span><span class="c2"> with:</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span class="c3">public Response redirect(String url) {</span></p><p class="c1"><span class="c3"> return Response.builder().referenceUrl(url).build();</span></p><p class="c1"><span class="c3">}</span></p><h2 class="c13" id="h.wb0xdw9tkd7d"><span class="c9">“Modification”</span></h2><p class="c1"><span class="c2">Of course with records being immutable, you can’t modify an existing record. But is it common to ask for a “modified version” of a record, copying a subset of fields but changing others? 
An often-suggested feature for records is support for “wither methods”: methods like</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span class="c3">MyRecord withFoo(int newFoo) { return new MyRecord(newFoo, this.bar); }</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span class="c2">As it turns out, defining methods like these is not very common. 3% of @AutoValue classes C have at least one instance method returning C; probably not all of these are “wither” methods, but many of them are. This is a small enough percentage of classes that we could reasonably exclude this feature from records: “if you want it that badly, you can do it yourself”.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span>@AutoValue supports another kind of “modification” that I expected to be more popular: toBuilder(). If your data carrier uses a builder for its construction, you can ask @AutoValue to generate a toBuilder() method, which converts an existing value into a builder, so that you can ask for a subset of fields to be changed before solidifying back down into an immutable value. But it turns out this feature is used very rarely: only 1.5% of @AutoValue classes </span><span class="c12">with builders</span><span class="c2"> use this feature, which is less than 1% of all @AutoValue classes. So even considering wither methods and toBuilder() together, fewer than 5% of @AutoValue classes use either feature.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span>Perhaps if records could define builders and withers for you automatically and with very little boilerplate, these features would be used more often, but they don’t seem to fill a need so common that developers feel compelled to write them by hand. 
</span><span class="c11">It doesn’t seem like a high priority to support wither methods, or toBuilder(), even if support for builders is added</span><span class="c2">.</span></p><h2 class="c13" id="h.kvbproq3zpzu"><span class="c9">Hidden State</span></h2><p class="c1"><span>How will developers feel about the restriction that each field corresponds to a constructor parameter and a public accessor? Will they wish they could have some local state? We can look at two things in @AutoValue classes to identify developers who fit into these categories. First, they may define private fields which do not participate in @AutoValue generation. This turns out to be quite rare: </span><span class="c11">less than 1% of @AutoValue classes have such fields</span><span class="c2">. It makes sense not to support this, as hidden state both goes against the semantic goals of records and would go unused by most developers.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span>However, there is a more restricted notion of private “state” that may be more suitable, and which @AutoValue supports directly: memoization of derived properties. Developers can tag any nullary method with @Memoized, and the generated @AutoValue class will cache the return value of that method in a private field.</span><span class="c11"> </span><span class="c2">This seems reasonably compatible with the semantic goals of records, and could be worth supporting if it is used regularly.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span>Despite being very easy to use, though, @Memoized is not very popular. Only </span><span class="c11">1.4% of @AutoValue classes memoize any properties</span><span>. The most obvious things to memoize are hashCode and toString, and those are indeed the two most-memoized methods, but in total it is still pretty rare. 
Of @AutoValue classes that memoize something, only </span><span class="c11">14%</span><span class="c2"> memoize these methods: most have some other derived property that they want to cache.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span>So, while it might be nice to offer support for lazy/cached methods, </span><span class="c11">leaving it out will likely not have a significant impact on record adoption</span><span class="c2">. If lazy instance fields ever make it into the language, we can retrofit them into records at that time. If memoization support is included, it should cover all properties, not just Object overrides. </span></p><h2 class="c13" id="h.hhqlerpaz7gy"><span class="c9">Manually Written Methods</span></h2><p class="c1"><span class="c2">Both records and @AutoValue will automatically provide correct implementations of equals(Object) and hashCode(), as well as a reasonable toString(). How often do developers feel the need to override these methods?</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span>toString(), it turns out, is the most commonly overridden of the three by a landslide, but overriding it is still rare: </span><span class="c11">3% of @AutoValue classes have a manual implementation of toString()</span><span class="c2">. 
Some examples:</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span class="c3">@AutoValue public abstract class Constraint {</span></p><p class="c1"><span class="c3"> // ...</span></p><p class="c1"><span class="c3"> @Override public final String toString() {</span></p><p class="c1"><span class="c3"> return String.format(</span></p><p class="c1"><span class="c3"> "Constraint_%s_%s_%s_%s_%s",</span></p><p class="c1"><span class="c3"> cluster().name(), machine().name(), </span></p><p class="c1"><span class="c3"> machineIntent(), subinterval(), constraint());</span></p><p class="c1"><span class="c3"> }</span></p><p class="c1"><span class="c3">}</span></p><p class="c1 c8"><span class="c3"></span></p><p class="c1"><span class="c3">@AutoValue public abstract class SensitiveString {</span></p><p class="c1"><span class="c3"> public abstract String getValue();</span></p><p class="c1 c8"><span class="c3"></span></p><p class="c1"><span class="c3"> public static SensitiveString of(String value) {</span></p><p class="c1"><span class="c3"> return new AutoValue_SensitiveString(value);</span></p><p class="c1"><span class="c3"> }</span></p><p class="c1 c8"><span class="c3"></span></p><p class="c1"><span class="c3"> // Prevents sensitive strings accidentally being rendered.</span></p><p class="c1"><span class="c3"> @Override public final String toString() {</span></p><p class="c1"><span class="c3"> return "*";</span></p><p class="c1"><span class="c3"> }</span></p><p class="c1"><span class="c3">}</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span>equals(Object) and hashCode() are only overridden around 0.5% of the time. </span><span class="c11">Developers are generally happy with auto-generated value semantics for their simple data carriers</span><span>. I looked at some of the overriding implementations of these methods: their authors often just wanted a hashCode that was faster, at the expense of having more collisions. 
In one case I found, one of the fields being wrapped was of a class with an incorrect hashCode implementation, and so the @AutoValue author hashed it externally. To allow workarounds like this, </span><span class="c11">allowing overrides is a good idea</span><span class="c2">, but we can expect it to be used rarely if the automatic implementations of Object methods are generally suitable.</span></p><p class="c1 c8"><span class="c2"></span></p><p class="c1"><span>In addition to overriding Object methods, there are other method signatures that crop up multiple times. Most common, although still less common than toString(), are conversion functions like toJson(), toProto(), or toBuilder() (see Construction section). Much rarer, at around 0.1%, are methods like iterator() and size(): some @AutoValue classes wrap a single ImmutableCollection of some kind, and implement methods that delegate to this field. This </span><span class="c12">could</span><span class="c2"> be an argument in favor of the recent method-delegation proposal, but it is a pretty rare thing to do, and many of these cases are really not a great idea: they should just call foo.coll().iterator() instead of foo.iterator(), and having Foo implement Iterable brings relatively little benefit.</span></p><h1 class="c7" id="h.qt1c0ifqq0m0"><span class="c17">Footnotes</span></h1><h2 class="c13" id="h.cbm9jbf0ob8o"><span class="c9">Google’s Codebase</span></h2><p class="c1"><span>A brief reminder about the value of using Google’s codebase to answer questions like these. Our codebase is large, easy to analyze, and highly cultivated, through static-analysis tools, enforced code review, etc. In some ways, it does represent “what good Java code looks like”, but it also has some peculiarities, such as a weird fascination with protobufs. 
So, keep in mind that when I make claims about how code looks, I am talking specifically about Google’s codebase, and not about all Java code in the universe.</span></p><div><p class="c1 c8"><span class="c2"></span></p></div></body></html>