deduplicating lambda methods
brian.goetz at oracle.com
Sat Mar 3 01:17:19 UTC 2018
I think this is a great idea. And it synergizes well with some other
work we've got in the pipe.
It's a shame that if you use the same lambda twice in the same source
file, we desugar two separate lambda$nnn methods, and spin two separate
lambda proxy classes. Deduplicating the lambda$nnn methods will address
the former; a separate effort, where we are using the new
"constantdynamic" instead of "invokedynamic" to evaluate method refs and
non-capturing lambdas, will address the latter (once the former is
addressed). So this will make Java programs more efficient overall.
(The condy translation will get us deduplication for free for method
references, but not for lambdas.)
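To make the duplication concrete, here is a hand-written sketch of roughly what the desugared form looks like for two identical non-capturing lambdas, and what it could look like after deduplication. The class names and the lambda$0 / lambda$1 method names are illustrative assumptions; the real synthetic names are chosen by the compiler.

```java
import java.util.function.Function;

// Hand-written approximation of the desugared shape without deduplication.
// Source form, written twice in the same file:
//     Function<String, String> f = s -> s.trim();
class DesugaredWithoutDedup {
    static String lambda$0(String s) { return s.trim(); }
    static String lambda$1(String s) { return s.trim(); } // identical body, second copy

    static Function<String, String> first()  { return DesugaredWithoutDedup::lambda$0; }
    static Function<String, String> second() { return DesugaredWithoutDedup::lambda$1; }
}

// The same two use sites after deduplicating the body methods:
// both sites refer to a single implementation method.
class DesugaredWithDedup {
    static String lambda$0(String s) { return s.trim(); }

    static Function<String, String> first()  { return DesugaredWithDedup::lambda$0; }
    static Function<String, String> second() { return DesugaredWithDedup::lambda$0; }
}
```

Behavior at the use sites is unchanged; only the number of implementation methods in the class file differs.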
I think AST comparison is likely to be easier and more effective. And it
doesn't have to be perfect; if it gets fooled by occasional differences,
that's OK, as long as it doesn't merge lambdas that are actually
different. And you don't have the generated code until later, when it's
likely harder to do the merging. There's a whole pass in the compiler
pipeline for lambda method desugaring (LambdaToMethod), so there's an
obvious place to do this transformation.
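A minimal sketch of the "conservative AST comparison" idea, using a toy tree type rather than javac's real AST classes. Everything here (Node, LambdaDedupSketch, methodFor) is hypothetical and only meant to show the shape: bodies that compare structurally equal share one method name, and anything that differs, even harmlessly, just gets its own method.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class LambdaDedupSketch {
    // Toy stand-in for a lambda AST node; not javac's tree representation.
    static final class Node {
        final String kind;          // e.g. "Lambda", "Ident", "Call"
        final String value;         // identifier or literal text, may be null
        final List<Node> children;

        Node(String kind, String value, Node... children) {
            this.kind = kind;
            this.value = value;
            this.children = Arrays.asList(children);
        }

        // Strict structural equality: same kind, same text, same children.
        // A real implementation would compare resolved symbols, not names.
        @Override public boolean equals(Object o) {
            if (!(o instanceof Node)) return false;
            Node n = (Node) o;
            return kind.equals(n.kind)
                && Objects.equals(value, n.value)
                && children.equals(n.children);
        }

        @Override public int hashCode() {
            return Objects.hash(kind, value, children);
        }
    }

    // One generated method name per structurally distinct lambda.
    private final Map<Node, String> methodForBody = new HashMap<>();

    String methodFor(Node lambda) {
        String existing = methodForBody.get(lambda);
        if (existing != null) return existing;      // duplicate: reuse the method
        String fresh = "lambda$" + methodForBody.size();
        methodForBody.put(lambda, fresh);
        return fresh;
    }
}
```

With this sketch, two occurrences of x -> x map to the same name, while e -> e gets a separate one because the parameter is spelled differently. That is a missed opportunity rather than a bug, which is the "doesn't have to be perfect" case: the comparison may fail to merge equal lambdas, but it never merges different ones.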
Another consideration is serializable lambdas; the scheme for
serializable lambdas involves a parallel generation path for
deserialization. I suspect that it's probably best to just avoid
serializable lambdas entirely, at least at first.
Is there any trickiness with capturing lambdas? I don't think so -- I
think we can merge these too, although I suspect the return on that
effort is lower. I'll bet the most common case is lambdas like e -> e,
x -> System.out.println(x), etc.
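As a rough illustration of why capture should not get in the way: a capturing lambda's body method takes the captured values as leading parameters, so two sites with the same body and the same shape of captured state could call one method. The sketch below is hand-written and the names (CapturingSketch, lambda$0, siteOne, siteTwo) are hypothetical.

```java
import java.util.function.Supplier;

class CapturingSketch {
    // Source form at two different sites, each capturing a local String:
    //     Supplier<String> s = () -> prefix + "!";
    // The captured value arrives as an ordinary parameter of the body method.
    static String lambda$0(String prefix) { return prefix + "!"; }

    // Each site binds its own captured value, but both share lambda$0.
    static Supplier<String> siteOne(String prefix) { return () -> lambda$0(prefix); }
    static Supplier<String> siteTwo(String prefix) { return () -> lambda$0(prefix); }
}
```

Each site still supplies its own captured value at evaluation time; only the body method is shared.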
I know you have some good tools at Google for codebase statistics.
Maybe you could pull together data on how often lambdas are duplicated
within a source file, and of the duplicated lambdas, what percentage are
stateless and non-serializable?
On 3/2/2018 8:03 PM, Liam Miller-Cushon wrote:
> I'm interested in adding support for deduplicating lambda methods to
> javac. The idea is that if a compilation unit contains two lambdas
> that are identical (including any captured state and the functional
> interface they implement) we could re-use the same implementation
> method for both.
> I understand there might have been some prior discussion about this.
> Is there interest in investigating the feature? What sort of technical
> considerations have been identified so far?
> I have been thinking about a couple of questions:
> 1) How to identify duplicates: I have a prototype that runs during
> lambda desugaring and identifies duplicates by diffing ASTs. Is that
> the best place for deduplication, or is it worth considering comparing
> generated code instead of ASTs?
> 2) Debug info: the optimization is safe if line numbers are not being
> emitted. If they are, is there a way to deduplicate the methods
> without breaking debug info?