EpsilonGC and throughput.

Aleksey Shipilev shade at redhat.com
Wed Dec 20 23:55:00 UTC 2017

On 12/20/2017 10:25 PM, Thomas Schatzl wrote:
>> This is not about Serial GC. Sergey's argument was that classlib
>> allocations are outside of users' control, and thus locality there is
>> out of users' control either. My counter-point is that some
>> locality waste might be acceptable, as long as the bulk of the work
>> is done by user locality-conscious code anyway.
> Yes, but in this case there is no difference between using Epsilon and
> any other GC. All of them benefit. Now Epsilon might benefit more than
> others.


> I do not know how much. Probably it won't make a lot of difference,
> because the default evacuation order tends to improve locality (and for
> collectors like Z/Shenandoah I think with move on read access it does
> even more); so unless the user also manually lays out memory in heap
> according to access, then these persons are really really desperate.

See, assuming the GC lays out the objects in the order most beneficial to the application is also
kinda wishful thinking: we don't even know whether it should be depth-first, or breadth-first, or
topological, or read-traversal, or something else. All of them seem right for particular classes of
applications. In fact, you can have GCs messing up our nice and tidy object layout, see e.g.

In such cases, a user saying "don't you touch anything, I'll do it myself" might be a better option.
And users implement similar approaches today with flyweight objects encoded over byte[] arrays,
argh! You've got to see it to believe it.
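
For readers who have not seen the flyweight-over-byte[] pattern, here is a minimal sketch of the idea,
not code from any particular project. The class name PackedPoints and its (x, y) record layout are
hypothetical, chosen only for illustration. All records live in one backing array, so their memory
order is fixed by the index arithmetic rather than by whatever order a collector copies objects in.

```java
import java.nio.ByteBuffer;

// Hypothetical example: `count` (x, y) records packed into one byte[]
// instead of `count` separate Point objects scattered over the heap.
final class PackedPoints {
    private static final int STRIDE = 2 * Integer.BYTES; // x, then y, 4 bytes each

    private final ByteBuffer buf; // view over a single on-heap byte[]
    private final int count;

    PackedPoints(int count) {
        this.count = count;
        this.buf = ByteBuffer.wrap(new byte[Math.multiplyExact(count, STRIDE)]);
    }

    int size() {
        return count;
    }

    void set(int i, int x, int y) {
        int base = offset(i);
        buf.putInt(base, x);                 // absolute put, does not move the position
        buf.putInt(base + Integer.BYTES, y);
    }

    int x(int i) {
        return buf.getInt(offset(i));
    }

    int y(int i) {
        return buf.getInt(offset(i) + Integer.BYTES);
    }

    // Sequential scan: record i + 1 always sits STRIDE bytes after record i.
    long sumX() {
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += buf.getInt(i * STRIDE);
        }
        return sum;
    }

    private int offset(int i) {
        if (i < 0 || i >= count) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        return i * STRIDE;
    }
}
```

Usage would look like `PackedPoints p = new PackedPoints(1000); p.set(0, 1, 2); long s = p.sumX();`.
The cost is exactly the "argh" above: field access turns into manual offset arithmetic, and nothing
checks the record layout for you.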

> It might just be better and not much more inconvenience to just not use
> Java in the first place for them without other real good reasons (Imho)

This is an old argument, but people do use Java in mechanical-sympathetic cases where investing in
massaging the code to work right in the JVM is a better tradeoff than rewriting it out of the JVM
completely. What is amusing is that those uses are very high-profile and are substantial
contributors to the "Java is vibrant and everywhere" world view. We have to appreciate that, even
though sometimes we deliberately make things harder for them (see e.g. Unsafe compartmentalization).

>> The crucial point is that Epsilon *guarantees* the absence of GC,
>> rather than relying on obscure tuning of current GCs. 
> Please don't exaggerate here: none of the switches I showed in my
> example are obscure or hard to understand. And they pretty much disable
> all heuristics in mentioned collectors.

Please be aware of the expertise trap: those options are obvious to you, knowing how OpenJDK
collectors work. It is odd to expect the same kind of expertise even from power users, who cannot
really tell whether those particular Serial GC options give them what they want, and whether that
behavior is a strong guarantee by design or a collateral implementation property, etc. That can
probably get better by capturing the intent with documentation, new options, etc, but then we run
into... (next paragraph)

> And it is very unlikely anybody is going to touch Serial in the future,
> because like Epsilon it serves a very particular niche rather well. And
> you know that Serial does not do *anything* outside of pauses by
> design.

Aha! So, if we need something changed in Serial to implement an Epsilon-like feature, that would run
into more resistance than having a separate implementation, right? At least I would be much more
wary, because Serial is quite extensively used. Small code duplication seems much less of a
concern than a regression in a widely used GC, at least from my vantage point.

>> This is about having the guarantees by design, instead of being
>> hopeful about the configuration.
>> Epsilon makes an allocation failure the hard error, no excuses, no
>> misconfiguration opportunities.
> This guarantee is nice to have (and is trivially checked by a test to
> avoid accidentally removing this behavior), but the JEP needs to spell
> out why this guarantee is important in particular. I.e. what makes it
> interesting, what can be achieved *beyond* what can already be done
> in (not too inconvenient) other ways by particularly exploiting the
> guarantee that makes it so unique and useful.
> Exaggerating: Establishing a random guarantee does not help a lot.

Yeah, the JEP mentions "low-overhead" as the replacement for "no GC", which is confusing. The goal
was to avoid any memory reclamation work, as clearly stated in the JEP goals. Guaranteeing no GC is
therefore pretty much the project goal -- not as random as the exaggeration seems to paint it -- and
all the uses naturally evolve from that.
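
To make the "no GC" guarantee concrete, here is a minimal sketch of how one might observe it from
inside an application. The class name NoGcCheck and its helper methods are hypothetical. The flags
in the comment are the ones used by JDK builds that ship Epsilon (JDK 11 and later), which is an
assumption relative to this thread. Under any other collector the reported count may simply be
non-zero.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical check. Assumed invocation on a JDK that ships Epsilon:
//   java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xmx1g NoGcCheck
final class NoGcCheck {

    // Sum of collection counts over all collectors the JVM reports.
    // A count of -1 means "undefined" and is skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount();
            if (c > 0) {
                total += c;
            }
        }
        return total;
    }

    // Allocates roughly `bytes` of short-lived garbage in 1 KiB chunks
    // and returns the number of chunks allocated.
    static long churn(long bytes) {
        long chunks = 0;
        for (long done = 0; done < bytes; done += 1024) {
            byte[] chunk = new byte[1024];
            chunk[0] = 1;        // touch the array so the allocation is used
            chunks += chunk[0];
        }
        return chunks;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        long chunks = churn(64L * 1024 * 1024);
        long after = totalCollections();
        System.out.println("chunks allocated: " + chunks);
        System.out.println("collections during run: " + (after - before));
    }
}
```

With a no-op collector and a heap larger than the total allocation, one would expect the run to
finish without any collection happening. With a heap smaller than the total allocation, one would
expect an OutOfMemoryError instead of a collection, which is the hard error discussed above.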

> Sorry, the JEP just does not answer these questions for me at the
> moment. And apparently not for other people you did not talk to in
> person.

Excellent, noted! Let me come up with better JEP text.

> And the JEP, and I re-read the Motivation of the JEP multiple times
> just now, does not spell out what is so unique and desirable (I do not
> count random guarantees) about this change, or can't be done (almost)
> the same with other GCs. 

So this is where it seems to go off the rails: we need to emphasize that "no GC" is the actual
guarantee, so that readers cannot disregard that goal as "random".

> Also, parts of the reasoning in the motivation, particularly the one
> about performance, seem wrong or at least not clear (see the subject of
> this email thread) or contrived. There are a few unmentioned, to me
> right now, better alternatives than Epsilon (not necessarily
> implemented in the VM).

What are they? Are we missing some points from "Alternatives" here?

>> Java ecosystem is vast, and even 0.1% of use cases add up to 
>> substantial absolute number of use cases. In the interesting twist  
>> of fate, we are even considering backporting Epsilon to JDK 8, 
>> because this is where the most current Java ecosystem
>> is -- and having separate implementation does give nice isolation
>> guarantees for backports.
> I do not understand the last sentence, sorry. 

You said yourself: "It is very unlikely anybody is going to touch Serial in the future", and I
agree. Touching Serial GC, especially in backports, to implement Epsilon-like functionality is a
greater risk than having a completely separate no-op GC implementation. I did a trial backport of
Epsilon to 8u, and it fits without scary changes to the rest of HotSpot.

> I do not understand the last sentence, sorry. And that 0.1% of use
> cases is a number you just invented. I think a few months ago I
> actually tried to quantify these number of users with you, with no good
> answer.

Well, yeah, 0.1% is invented for the sake of an example about the Java ecosystem. My actual go-to
pun is: "With the extraordinary size of the Java ecosystem, the epsilon-neighborhood of zero
applicability contains non-zero users".

> This use case/explanation makes a way better case for the change,
> particularly the "can't do that with other collectors" parts, actually
> explains why you would want this (particularly) in production, than all
> other reasons in the JEP combined.

Yeah, I did not realize this was the strong suit at the time the JEP was drafted. As I said before,
you sometimes find that some facets are actually more useful than others.


Process comments below:

> [JEP] is imho terrible and missing the point you want to make (at least to persons you did not
> talk to; maybe you missed even more).

> What makes it so useful for production use, as you were rambling on. 

"rambling": adj.
 1. (of writing or speech) lengthy and confused or inconsequential.

> I can only see a few improvements for devs, and there are a few theoretically quantifiable claims
> made, that were not quantified (not even attempted to).

Hm. Is there a hard requirement to quantify everything stated in the JEP? Because I did quantify
locality, barrier costs, and startup improvements after the JEP was submitted. Again, you might just
kindly ask to link the data into the JEP, instead of assuming the submitter is lazy. (Which is also
weird, because Sergey links *my* post about *Epsilon* in the beginning of this thread.)

> I am fine to agree to disagree :) But in this case I kept talking to you because I wanted to
> understand the reasons for this change and why it's so useful in production because I knew there
> were some somewhere - just not in the JEP. That's why we are annoying you with "tell me what is
> the use of that change and what makes it so special" all the time.

I understand. I do note, however, that in the spirit of ongoing collaboration, saying "I see
disadvantages A, B, C, and advantages G, H, I, on top of what is written in the JEP, and the goal,
if I understand the intent right, should be N, not M?" is quite different from asking "tell me what
makes it so special".

> I just want to point out that people can't judge the change by what you talked about with other 
> people. I can only read the JEP, and criticize accordingly. Not everyone has 
> your/Kirk's/whoever's knowledge.

Totally. And I have to point out that a JEP is neither a code review, nor an architectural review,
nor a paper to review, nor something cast in stone. As I understand it, it is supposed to capture
the key points of the idea, and get collaboratively refined to contrast the salient points and
disadvantages. It is expected that experts may have more ideas about refinements *above* what the
original submitter meant to write, have suggestions for refining some points, etc. Instead what we
get is "you defend"-style "collaboration" -- which, if continued, will attract far fewer
contributors than OpenJDK really needs.

So, with all of the above, please excuse me if I get all defensive.

I would really appreciate it if the discussions around JEPs were not in the spirit of "prove to us
why we have to consider your terrible ramblings", but rather "let us refine the JEP to clearly
highlight the benefits and disadvantages for OpenJDK". Which seems to finally be happening for this
JEP, and I am happy about that.

