EpsilonGC and throughput.

Thomas Schatzl thomas.schatzl at oracle.com
Wed Dec 20 21:25:49 UTC 2017


Hi,

  (please read my answers in full before answering :))

On Wed, 2017-12-20 at 20:05 +0100, Aleksey Shipilev wrote:
> On 12/20/2017 03:46 PM, Thomas Schatzl wrote:
> > > You would probably be okay with small inefficiencies within the
> > > class library, if you can control the bulk of your own data
> > > either by relying on particular classlib implementation, or
> > > winding up your own.
> > 
> > And e.g. Serial GC *by itself* has what particular dependency on
> > something in the OpenJDK classlib that makes that impossible?
> > (Maybe the java.lang.ref.Reference stuff?)
> 
> This is not about Serial GC. Sergey's argument was that classlib
> allocations are outside of users' control, and thus locality there is
> out of users' control either. My counter-point is that some
> locality waste might be acceptable, as long as the bulk of the work
> is done by user locality-conscious code anyway.

Yes, but in this case there is no difference between using Epsilon and
any other GC. All of them benefit. Now Epsilon might benefit more than
others.

I do not know how much. Probably it won't make a lot of difference,
because the default evacuation order tends to improve locality (and for
collectors like Z/Shenandoah I think with move on read access it does
even more); so unless the user also manually lays out memory in heap
according to access, then these persons are really really desperate.

It might just be better, and not much more inconvenient, for them to
not use Java in the first place, absent other really good reasons (IMHO).

> > > Well, nobody claimed Epsilon is a silver bullet. Before you can
> > > reap any of its benefits, you have to get the footprint under
> > > control [*]. After that, you can start exploring exotic memory
> > > management techniques,
> > 
> > Can you explain to me how you can't do that with e.g. Serial GC? Is
> > the allocation code in Serial that much different? Actually I think
> > it should be almost the same.
> 
> Concentrating on allocation path misses the point.
> 
> The crucial point is that Epsilon *guarantees* the absence of GC,
> rather than relying on obscure tuning of current GCs. 

Please don't exaggerate here: none of the switches I showed in my
example are obscure or hard to understand. And they pretty much disable
all heuristics in mentioned collectors.

And it is very unlikely anybody is going to touch Serial in the future,
because like Epsilon it serves a very particular niche rather well. And
you know that Serial does not do *anything* outside of pauses by
design.

[...]
> 
> > It won't be called "EpsilonGC" though, and won't have an extra
> > switch, but benefit openjdk probably even more.
> 
> See, this is the guarantee thing again. Having the extra
> configuration to mimic what Epsilon does in existing GC might be a
> way out, until you silently regress it via the interaction with some
> other GC option, some other bugfix, or some other performance
> improvement, or because GC developers in their wisdom changed the
> behavior ever so slightly. Having the GC that does not collect _by
> design_ makes it hard to compromise this property.
> 
> Suppose you find the configuration that prevents GC in existing
> Serial code. Asserting the needed behavior in current GC would mean
> developing white- or black-box style tests that assert the
> configuration setting works as expected, and that also has to be
> revisited every time some potentially-interacting GC feature / option
> is added. That is again, because Serial *might* collect,
> and you just *hope* you got the config right so that GC does (not)
> happen when you do (not) need it.
> 
> This is about having the guarantees by design, instead of being
> hopeful about the configuration.
> Epsilon makes an allocation failure the hard error, no excuses, no
> misconfiguration opportunities.

This guarantee is nice to have (and is trivially checked by a test to
avoid accidentally removing this behavior), but the JEP needs to spell
out why this guarantee is important in particular. I.e. what makes it
interesting, what can be achieved *beyond* what can already be done
in (not too inconvenient) other ways by particularly exploiting the
guarantee that makes it so unique and useful.
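
A minimal sketch of what such a test could look like, assuming the
collector under test reports its collections through the standard
java.lang.management beans. The class and method names (NoGcCheck,
totalCollections, runsWithoutCollection) are made up for illustration
and are not taken from the JEP or the Epsilon patch:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class NoGcCheck {
    // Sums the collection counts reported by all registered collectors.
    // A bean may report -1 when its count is undefined; those are skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Returns true if no collection was reported while the workload ran.
    static boolean runsWithoutCollection(Runnable workload) {
        long before = totalCollections();
        workload.run();
        return totalCollections() == before;
    }

    public static void main(String[] args) {
        // Hypothetical workload: allocate some short-lived garbage.
        boolean quiet = runsWithoutCollection(() -> {
            for (int i = 0; i < 1_000; i++) {
                byte[] garbage = new byte[1024];
                garbage[0] = 1;
            }
        });
        System.out.println("no collection observed: " + quiet);
    }
}
```

With a collector that can collect, such a check only holds for the
workloads and settings it was actually run with; with a no-op collector
it would be expected to hold for every workload that fits the heap.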

Exaggerating: Establishing a random guarantee does not help a lot.

> > To me personally the best argument that is given in the JEP seems
> > to be that it helps validating the GC interface - but all other GCs
> > implementing it also do that already to some degree (serial,
> > parallel, not parallelold, cms, g1, probably Shenandoah, and Z).
> 
> The key thing is "personally to you" -- and that is fine. It does not
> mean other uses are wrong, because you don't need them, or the expert
> can configure other GCs to do (barely) the similar (but not exactly
> the same) thing.

I am fine to agree to disagree :) But in this case I kept talking to
you because I wanted to understand the reasons for this change and why
it's so useful in production because I knew there were some somewhere -
just not in the JEP. That's why we are annoying you with "tell me what
is the use of that change and what makes it so special" all the time.

Sorry, the JEP just does not answer these questions for me at the
moment. And apparently not for other people you did not talk to in
person.

> > > and no-op GC is one of many tools in the toolbelt there. What
> > > makes  Epsilon different from other tools is that it requires VM-
> > > side implementation -- and this is why it should be included into
> > > JVM.
> > 
> > The question is: do we need a new tool that only reinvents the old
> > ones with minimal (I would dare to say non-real world) advantages.
> 
> Yes, we do. A year ago, I thought this was a thought (pun intended)
> experiment, and I would probably have the same position -- just use 
> the myriad of GC options to configure the existing GC.
> But since then I had interesting talks with people who have use cases
> for the simple/trivial/dumb no-op GC: most of these things are 
> captured in JEP.

I just want to point out that people can't judge the change by what you
talked about with other people. I can only read the JEP, and criticize
accordingly. Not everyone has your/Kirk's/whoever's knowledge.

And the JEP, and I re-read the Motivation of the JEP multiple times
just now, does not spell out what is so unique and desirable (I do not
count random guarantees) about this change, or can't be done (almost)
the same with other GCs. What makes it so useful for production use, as
you were rambling on. I can only see a few improvements for devs, and
there are a few theoretically quantifiable claims made, that were not
quantified (not even attempted to). That's why we have been discussing
(apparently in circles) all the time, particularly about the latter.

Also, parts of the reasoning in the motivation, particularly the one
about performance, seem wrong or at least not clear (see the subject of
this email thread) or contrived. There are a few unmentioned, to me
right now, better alternatives than Epsilon (not necessarily
implemented in the VM).

> The Java ecosystem is vast, and even 0.1% of use cases add up to a
> substantial absolute number of use cases. In an interesting twist
> of fate, we are even considering backporting Epsilon to JDK 8,
> because this is where most of the current Java ecosystem
> is -- and having a separate implementation does give nice isolation
> guarantees for backports.

I do not understand the last sentence, sorry. And that 0.1% of use
cases is a number you just invented. I think a few months ago I
actually tried to quantify this number of users with you, with no good
answer. 0.1% seems way too much because otherwise people would be
complaining *much* more loudly.

0.1% would be 1 in 1000. I again dare to say that not even 1 in 1000 VM
users cares about "last drop performance"/"performance, functional,
interface testing"/"that odd guarantee" (partially because it's
negligible), at least not directly.

> Coming from a personal perspective, Epsilon is like peat whiskey
> for me: first taste feels very wrong and you question the sanity of 
> those enjoying it, and then, as you become familiar with it,
> you realize it is just something else, in its essence, and you begin
> to see the appeal. It is not an everyday drink, for sure.

Thanks for the explanation. I do not drink alcohol at all, but your
explanation helped (well, at least to support the following statement):
the JEP should contain just such explanations...

> > > [*] In fact, it is also called out in JEP, the other way around:
> > > fail predictably when a lot is allocated. Over a few last months,
> > > I had a pleasant experience asserting allocation pressure
> > > invariants with just running with Epsilon with given heap and
> > > checking if it fails. When it does, I have the full heap-dump
> > > view of the garbage produced. This turns out to be much more
> > > convenient than I previously anticipated.
> > 
> > java -XX:+UseSerialGC -Xmn<something> -Xms<something>
> > -Xmx<something> -XX:SurvivorRatio=<something>
> > -XX:+HeapDumpOnOutOfMemoryError myapplication
> > (or something like this)
> > 
> > seems to give exactly the same information.
> 
> Nope, it does not. Because Serial would still attempt at least one GC
> when faced with potential OOME, and that will prune out the floating
> garbage -- and I am interested in *all* allocations. GC guys might
> argue that allocations are cheap, and that GC cycles pruning dead
> objects are also cheap, but the industrial reality is that people
> still hunt down and eliminate garbage allocations with non-ignorable
> performance improvements. The ability to heap dump with no object
> left behind is surprisingly useful. Again, some things are trivial in
> some GC designs. It is trivial to guarantee all allocated objects end
> up in heap dump with the no-op GC.

I did not think of that - and it would be really nice if paragraphs
like the one above would be in the JEP, as motivation and corresponding
explanation in the "Alternatives" section. You know, spelled out in
detail for people. Answering the question "why would I want this (in
production)".

This use case/explanation makes a way better case for the change,
particularly the "can't do that with other collectors" parts, actually
explains why you would want this (particularly) in production, than all
other reasons in the JEP combined.

For me that makes the point of the change much more clear. Thanks.
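
The allocation-pressure check discussed above can be sketched roughly
as follows. This is only an illustration: AllocationBudget and churn
are hypothetical names, and the workload is a stand-in for real
application code.

```java
public class AllocationBudget {
    // Publishing each array here keeps the JIT from optimizing the allocation away.
    static volatile Object sink;

    // Hypothetical workload: allocates `chunks` short-lived arrays of
    // `chunkBytes` bytes each and returns the total number of bytes requested.
    static long churn(int chunks, int chunkBytes) {
        long requested = 0;
        for (int i = 0; i < chunks; i++) {
            byte[] garbage = new byte[chunkBytes];
            sink = garbage; // only the most recent array stays reachable
            requested += garbage.length;
        }
        return requested;
    }

    public static void main(String[] args) {
        int chunks = args.length > 0 ? Integer.parseInt(args[0]) : 1024;
        System.out.println("requested bytes: " + churn(chunks, 64 * 1024));
    }
}
```

Run under a collecting GC, this finishes in a small heap because the
garbage is reclaimed. Run under a no-op collector with the heap sized
to the allocation budget, it fails as soon as the total allocation
exceeds that budget, and the heap dump then still contains the garbage.
Assuming the flag names used by later JDK releases that ship Epsilon,
the invocation would be something like
java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xmx<budget>
-XX:+HeapDumpOnOutOfMemoryError AllocationBudget <chunks>.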

Please fix the JEP though. It's imho terrible and missing the point you
want to make (at least to persons you did not talk to; maybe you missed
even more).

Thanks,
  Thomas


