EpsilonGC and throughput.

Kirk Pepperdine kirk at kodewerk.com
Thu Dec 21 10:06:51 UTC 2017


> In such cases user saying "don't you touch anything, I'll do it myself" might be a better option.
> And users implement similar approaches today with flyweight objects encoding over byte[] arrays,
> argh! You've got to see it to believe it.

In fact it’s a recommended coding practice when you need to ensure heftier cache-line densities. Object layouts typically run at an anemic 8-22% density, and encoding into arrays with flyweights can pump that up to approach 100%. Let’s not even discuss the pre-fetching improvements over the typical pointer chasing that happens after the collector scrambles your heap on each copy phase.
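To make the flyweight-over-byte[] idea concrete, here is a minimal sketch. The `TradeFlyweight` class and its 16-byte record layout are invented for illustration: records are packed back-to-back in one array, so a scan walks contiguous memory instead of chasing pointers, and the GC sees a single object rather than millions.

```java
import java.nio.ByteBuffer;

// Hypothetical flyweight over a byte[]: each record is a long id plus a
// double price, packed back-to-back (16 bytes per record). One reusable
// flyweight instance is repositioned over the backing array on demand.
final class TradeFlyweight {
    static final int RECORD_SIZE = Long.BYTES + Double.BYTES; // 16 bytes

    private final ByteBuffer buf;
    private int offset;

    TradeFlyweight(byte[] backing) {
        this.buf = ByteBuffer.wrap(backing);
    }

    // Reposition the flyweight over record `index`; no allocation occurs.
    TradeFlyweight wrap(int index) {
        this.offset = index * RECORD_SIZE;
        return this;
    }

    long id()      { return buf.getLong(offset); }
    double price() { return buf.getDouble(offset + Long.BYTES); }

    // Write a record directly into the backing array at `index`.
    void put(int index, long id, double price) {
        int off = index * RECORD_SIZE;
        buf.putLong(off, id);
        buf.putDouble(off + Long.BYTES, price);
    }
}
```

Usage is a matter of repositioning one flyweight over the array in a loop; the whole data set is invisible to the collector apart from the single backing array.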
>> It might just be better and not much more inconvenience to just not use
>> Java in the first place for them without other real good reasons (Imho)
> This is an old argument, but people do use Java in mechanical-sympathetic cases where investing in
> massaging the code to work right in JVM is better tradeoff than rewriting it out of JVM completely.
> What is amusing is that those uses are very high-profile and are substantial contributors of "Java
> is vibrant and everywhere" world view. We have to appreciate that, even though sometimes we
> deliberately make things harder for them (see e.g: Unsafe compartmentalization).

+1… there are many, many reasons to use Java aside from the technical aspects of that decision, and we shouldn’t discourage them by forcibly cutting off paths to get useful work done. Unsafe is a classic case: a garbage-can class of parts that serves a definite need, yet there is a mindset that actively works against that need, and thus we don’t get proper support for these much-needed features.

I think that one of the assumptions here is that there is an even and perfect distribution of knowledge of garbage collection. Another assumption is that, even if there were such a perfect and even distribution of GC knowledge, selecting the correct collector and tuning it would be easy. I can say from experience that neither of these assumptions is even close to being true. Consider the first white paper on how to tune CMS that was published by Sun. It is a paper you can no longer find anywhere on the web because it was completely wrong, and it was written by one of the best GC experts on the planet. That is not the only example I can cite. How do you expect mere mortal developers to cope when the experts consistently get it wrong?

>> I do not understand the last sentence, sorry. And that 0.1% of use
>> cases is a number you just invented. I think a few months ago I
>> actually tried to quantify these number of users with you, with no good
>> answer.
> Well, yeah, 0.1% is invented for the sake of example about the Java ecosystem. My actual go-to pun
> is: "With the extraordinary size of Java ecosystem, epsilon-neighborhood of zero applicability
> contains non-zero users”.

I can help here by confusing this with more estimates. IME, those in the low latency space preferred to use iCMS. Now that iCMS is gone, they’ve switched to some very unconventional configurations of CMS. The vast majority of those who attend my workshops and don’t have low latency requirements have *never* heard of iCMS. Additionally (again from my observation), Oracle typically doesn’t reach the vast majority of those in the low latency space, so the informal survey of iCMS usage almost completely missed an entire group of applications that were very much tied to iCMS. If I scan the GC logs I’ve collected over the years, I can safely say that about 1-2% of them are iCMS logs, the majority of which came from the low latency space. I will claim that those with the most interest in this collector come from that space. The advantage of this collector over serial or parallel is that there will be *no* GC pauses… guaranteed. That space does not like stalls of random duration happening at random times. Oh, and I forgot about other downstream effects, such as writing perf data to disk inflating the pause times, and/or GC triggering page-recycling stalls at the OS level. No, GC is not completely responsible, but it adds to the pressures that cause these phenomena to be triggered more frequently.
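For concreteness, these are the flag combinations at issue, as a hedged sketch: iCMS was the incremental mode of CMS (deprecated in JDK 8 and removed in JDK 9), and Epsilon later shipped behind an experimental flag under JEP 318. The `-jar app.jar` target and heap sizes are placeholders.

```shell
# iCMS, as formerly used by the low-latency shops discussed above
# (deprecated in JDK 8, removed in JDK 9):
java -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -jar app.jar

# Epsilon: allocation only, no collection, hence no GC pauses at all.
# The JVM terminates when the heap is exhausted, so the heap must be
# sized up front for the entire run:
java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xms4g -Xmx4g -jar app.jar
```

The trade, as the thread argues, is that users who already manage memory themselves (flyweights over byte[] arrays) give up the safety net in exchange for completely deterministic pause behavior.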

All I can add is that life in the real world rarely resembles life in a benchmark.

Kind regards,

More information about the hotspot-gc-dev mailing list