Thread stack size issue related to glibc TLS bug
fweimer at redhat.com
Fri May 24 10:13:21 UTC 2019
* David Holmes:
> My thoughts haven't really changed since 2015 - and sadly neither has
> there been any change in glibc in that time. Nor, to my recollection,
> have there been any other reported issues with this.

The issue is occasionally reported by people who use small stacks with
large initial-exec TLS consumers (such as jemalloc). On the glibc side,
we aren't entirely sure what to do about this. We have recently tweaked
the stack size computation, so that in many cases, threads now receive
an additional page. This was necessary to work around a kernel/hardware
change where context switches started to push substantially more data on
the stack than before, and minimal stack sizes did not work anymore on
x86-64 (leading to ntpd crashing during startup, among other things).

The main concern is that for workloads with carefully tuned stack sizes,
revamping the stack size computation so that TLS is no longer
effectively allocated on the stack might result in address space
exhaustion. (This should only be a concern on 32-bit architectures.)
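
To make the accounting concrete, here is a rough sketch of the arithmetic.
All numbers (a 64 KiB stack request, 48 KiB of initial-exec TLS, 20000
threads) are made up for illustration, the class and method names are
hypothetical, and the sketch ignores guard pages, page rounding and the
thread descriptor.

```java
public final class StackAccounting {
    // Stack left for the thread when static TLS is carved out of the
    // requested size, which is the behaviour described above.
    public static long usableStack(long requested, long staticTls) {
        return Math.max(0L, requested - staticTls);
    }

    // Address space reserved for all thread stacks. With tlsOnStack == false,
    // TLS is accounted for separately, so each thread needs
    // requested + staticTls bytes instead of just requested bytes.
    public static long addressSpace(long threads, long requested,
                                    long staticTls, boolean tlsOnStack) {
        long perThread = tlsOnStack ? requested : requested + staticTls;
        return threads * perThread;
    }

    public static void main(String[] args) {
        long requested = 64L * 1024; // assumed stack size request
        long staticTls = 48L * 1024; // assumed initial-exec TLS usage
        long threads = 20_000L;      // assumed thread count

        System.out.println("usable stack today:  "
                + usableStack(requested, staticTls));
        System.out.println("address space today: "
                + addressSpace(threads, requested, staticTls, true));
        System.out.println("address space if TLS is moved off the stack: "
                + addressSpace(threads, requested, staticTls, false));
    }
}
```

With these assumed numbers, the second total already exceeds 2 GiB, which
is the kind of growth that could matter in a 32-bit address space.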

Even if we changed this today (or had changed it in 2015), it would take
a long time for the change to end up with end users, so it's unclear how
much help it would be.

Maybe OpenJDK could add a property that specifies a stack size reserve,
with this number being added to all stack size requests? This would at
least allow users to work around the issue locally.
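
As a minimal sketch of what such a reserve could look like, here is an
application-level approximation. The property name
example.thread.stackSizeReserve and the StackReserve class are hypothetical;
OpenJDK does not define them. A real implementation would have to live
inside the VM so that it also covers default-sized threads and threads the
application does not create itself.

```java
public final class StackReserve {
    // Hypothetical property name, chosen only for this example.
    static final String PROPERTY = "example.thread.stackSizeReserve";

    // Reads the reserve in bytes; missing, unparsable or negative values
    // are treated as "no reserve".
    public static long reserveBytes() {
        long value = Long.getLong(PROPERTY, 0L);
        return Math.max(0L, value);
    }

    // Adds the reserve to an explicit stack size request. A request of 0
    // means "use the default" for the Thread constructor, so it is left
    // alone here.
    public static long withReserve(long requested, long reserve) {
        if (requested <= 0L) {
            return requested;
        }
        // Saturate instead of overflowing on absurdly large requests.
        return requested > Long.MAX_VALUE - reserve
                ? Long.MAX_VALUE
                : requested + reserve;
    }

    // Creates a thread whose stack size request includes the reserve.
    public static Thread newThread(Runnable task, String name,
                                   long requestedStackSize) {
        long size = withReserve(requestedStackSize, reserveBytes());
        return new Thread(null, task, name, size);
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = newThread(
                () -> System.out.println("running with padded stack request"),
                "padded-worker", 256L * 1024);
        t.start();
        t.join();
    }
}
```

This could be run with, for example,
java -Dexample.thread.stackSizeReserve=65536 StackReserve
to pad every explicit request made through newThread by 64 KiB. Note that
the stack size passed to the Thread constructor is only a hint to the VM.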

If we change the accounting in glibc, we will have to add a similar
tunable on the glibc side, too.
More information about the core-libs-dev